Miner should resume active tenure on restarts #5754

kantai · 2025-01-27T22:00:49Z

Miners should resume their active tenures across a restart. Once #5752 merges, the relayer will spawn a new miner thread on restart, but the miner thread will be somewhat confused: it will try to continue the existing tenure, but will fail because it expects to create a tenure change payload when it produces the first block in the thread.
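
To make the failure mode concrete, here is a minimal sketch of the check described above. `MinerThreadModel` and its methods are hypothetical names used only for illustration, not the actual miner thread code; the assumption is that the decision hinges on a per-thread block counter that resets whenever a thread is spawned.

```rust
/// Hypothetical, simplified model of a miner thread's bookkeeping.
/// This is NOT the real miner thread, only an illustration.
struct MinerThreadModel {
    /// Blocks this thread has mined since it was spawned.
    mined_in_thread: u64,
}

impl MinerThreadModel {
    /// A newly spawned thread always starts counting from zero,
    /// even if the tenure it continues already contains blocks.
    fn spawn() -> Self {
        Self { mined_in_thread: 0 }
    }

    /// The check described above: the first block a thread produces
    /// is expected to carry a tenure change payload.
    fn expects_tenure_change(&self) -> bool {
        self.mined_in_thread == 0
    }

    /// Record that this thread produced a block.
    fn record_mined_block(&mut self) {
        self.mined_in_thread += 1;
    }
}
```

In this model, a thread spawned after a restart reports `expects_tenure_change() == true` even though the tenure was already started before the restart, which matches the confusion described above.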

There are certainly some relatively straightforward patches to the checks in the miner thread that would make this possible. However, there are already a fair number of "cases" that would each need to be handled with different checks/patches:

Resuming a normal BlockFound tenure
Resuming after a tenure extend into a new bitcoin block
Resuming after a timeout tenure extend
Resuming after a tenure extend over a failed reorg?

In my opinion, the least kludgy thing to do would be to dump some transient state of the active mining thread to a file on exit (somewhat like VRF registrations):

#[derive(Serialize, Deserialize)]
struct BlockMinerSerializableState {
    last_block_mined: Option<NakamotoBlock>,
    mined_blocks: u64,
    burn_election_block: BlockSnapshot,
    burn_block: BlockSnapshot,
    parent_tenure_id: StacksBlockId,
    reason: MinerReason,
    tenure_change_time: Instant,
    burn_tip_at_start: ConsensusHash,
}
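
One detail to watch in a struct like this: `std::time::Instant` is an opaque monotonic timestamp and does not implement serde's `Serialize`/`Deserialize`, so `tenure_change_time` would need a different on-disk representation. A minimal sketch of one option, storing the elapsed seconds at shutdown and rebuilding an `Instant` on restart, follows. The function names are hypothetical, and this approach deliberately ignores the time the node was down; storing a wall-clock timestamp instead would account for downtime.

```rust
use std::time::{Duration, Instant};

/// At shutdown: how many whole seconds have passed since the tenure change.
/// Saturates to zero if `now` is earlier than `tenure_change_time`.
fn elapsed_secs_at_shutdown(tenure_change_time: Instant, now: Instant) -> u64 {
    now.saturating_duration_since(tenure_change_time).as_secs()
}

/// At restart: rebuild an `Instant` that lies `elapsed_secs` in the past.
/// Falls back to `now` if the platform cannot represent the earlier instant.
fn restore_tenure_change_time(elapsed_secs: u64, now: Instant) -> Instant {
    now.checked_sub(Duration::from_secs(elapsed_secs))
        .unwrap_or(now)
}
```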

The relayer could check burn_tip_at_start against the current burnchain tip and, if it matches, start the mining thread with that information. I think we may need to make sure that the relayer doesn't also try to start a new mining thread when it figures out it should be mining at that tip (i.e., don't try to start a new BlockFound thread: just let the resumed thread handle it).
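
A rough sketch of that startup decision, using only the standard library. Everything here is an assumption made for illustration: `PersistedMinerState` is a cut-down stand-in for the struct above (plain strings replace `ConsensusHash` and `StacksBlockId`), the `key=value` file format stands in for whatever serde encoding would really be used, and `StartupAction` / `decide_startup` are hypothetical names.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Cut-down, hypothetical stand-in for the persisted miner state.
#[derive(Debug, Clone, PartialEq)]
struct PersistedMinerState {
    burn_tip_at_start: String,
    parent_tenure_id: String,
    mined_blocks: u64,
}

impl PersistedMinerState {
    /// Encode as one `key=value` pair per line.
    fn encode(&self) -> String {
        format!(
            "burn_tip_at_start={}\nparent_tenure_id={}\nmined_blocks={}\n",
            self.burn_tip_at_start, self.parent_tenure_id, self.mined_blocks
        )
    }

    /// Decode the format written by `encode`; any malformed or
    /// incomplete input yields `None`.
    fn decode(text: &str) -> Option<Self> {
        let (mut tip, mut parent, mut mined) = (None, None, None);
        for line in text.lines() {
            let (key, value) = line.split_once('=')?;
            match key {
                "burn_tip_at_start" => tip = Some(value.to_string()),
                "parent_tenure_id" => parent = Some(value.to_string()),
                "mined_blocks" => mined = Some(value.parse::<u64>().ok()?),
                _ => return None,
            }
        }
        Some(Self {
            burn_tip_at_start: tip?,
            parent_tenure_id: parent?,
            mined_blocks: mined?,
        })
    }

    /// Write the state to `path` (called on miner thread exit).
    fn save(&self, path: &Path) -> io::Result<()> {
        fs::write(path, self.encode())
    }

    /// Read the state back; a missing or unreadable file yields `None`.
    fn load(path: &Path) -> Option<Self> {
        Self::decode(&fs::read_to_string(path).ok()?)
    }
}

/// What the relayer should do when it starts up.
#[derive(Debug, PartialEq)]
enum StartupAction {
    /// Saved state matches the current burn tip: resume with it, and do
    /// not also spawn a fresh BlockFound thread for the same tip.
    Resume(PersistedMinerState),
    /// No usable saved state: follow the normal startup path.
    StartFresh,
}

/// Resume only if the saved state was recorded at the current burn tip.
fn decide_startup(saved: Option<PersistedMinerState>, current_burn_tip: &str) -> StartupAction {
    match saved {
        Some(state) if state.burn_tip_at_start == current_burn_tip => {
            StartupAction::Resume(state)
        }
        _ => StartupAction::StartFresh,
    }
}
```

Returning a single `StartupAction` makes the two outcomes mutually exclusive, which is one way to guarantee the relayer never starts both a resumed thread and a new one for the same tip.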

Here's a test for the BlockFound case:

#[test]
#[ignore]
/// Test a scenario in which a miner is restarted right before a tenure
///  which they won. The miner, on restart, should begin mining the new tenure.
fn restarting_miner_mid_tenure() {
    if env::var("BITCOIND_TEST") != Ok("1".into()) {
        return;
    }

    let (mut naka_conf, _miner_account) = naka_neon_integration_conf(None);
    naka_conf.miner.activated_vrf_key_path =
        Some(format!("{}/vrf_key", naka_conf.node.working_dir));
    naka_conf.miner.wait_on_interim_blocks = Duration::from_secs(5);
    let sender_sk = Secp256k1PrivateKey::new();
    // setup sender + recipient for a test stx transfer
    let sender_addr = tests::to_addr(&sender_sk);
    let send_amt = 1000;
    let send_fee = 180;
    naka_conf.add_initial_balance(
        PrincipalData::from(sender_addr).to_string(),
        send_amt * 2 + send_fee,
    );
    let sender_signer_sk = Secp256k1PrivateKey::new();
    let sender_signer_addr = tests::to_addr(&sender_signer_sk);
    let mut signers = TestSigners::new(vec![sender_signer_sk]);
    naka_conf.add_initial_balance(PrincipalData::from(sender_signer_addr).to_string(), 100000);
    let stacker_sk = setup_stacker(&mut naka_conf);
    let http_origin = naka_conf.node.data_url.clone();
    let recipient = PrincipalData::from(StacksAddress::burn_address(false));

    test_observer::spawn();
    test_observer::register_any(&mut naka_conf);

    let mut btcd_controller = BitcoinCoreController::new(naka_conf.clone());
    btcd_controller
        .start_bitcoind()
        .expect("Failed starting bitcoind");
    let mut btc_regtest_controller = BitcoinRegtestController::new(naka_conf.clone(), None);
    btc_regtest_controller.bootstrap_chain(201);

    let mut run_loop = boot_nakamoto::BootRunLoop::new(naka_conf.clone()).unwrap();
    let run_loop_stopper = run_loop.get_termination_switch();
    let Counters {
        blocks_processed,
        naka_submitted_commits: commits_submitted,
        naka_proposed_blocks: proposals_submitted,
        ..
    } = run_loop.counters();
    let coord_channel = run_loop.coordinator_channels();

    let mut run_loop_2 = boot_nakamoto::BootRunLoop::new(naka_conf.clone()).unwrap();
    let _run_loop_2_stopper = run_loop_2.get_termination_switch();
    let Counters {
        blocks_processed: blocks_processed_2,
        naka_submitted_commits: commits_submitted_2,
        naka_proposed_blocks: proposals_submitted_2,
        ..
    } = run_loop_2.counters();
    let coord_channel_2 = run_loop_2.coordinator_channels();

    let run_loop_thread = thread::spawn(move || run_loop.start(None, 0));
    wait_for_runloop(&blocks_processed);
    boot_to_epoch_3(
        &naka_conf,
        &blocks_processed,
        &[stacker_sk],
        &[sender_signer_sk],
        &mut Some(&mut signers),
        &mut btc_regtest_controller,
    );

    info!("Bootstrapped to Epoch-3.0 boundary, starting nakamoto miner");

    let burnchain = naka_conf.get_burnchain();
    let sortdb = burnchain.open_sortition_db(true).unwrap();
    let (chainstate, _) = StacksChainState::open(
        naka_conf.is_mainnet(),
        naka_conf.burnchain.chain_id,
        &naka_conf.get_chainstate_path_str(),
        None,
    )
    .unwrap();

    let block_height_pre_3_0 =
        NakamotoChainState::get_canonical_block_header(chainstate.db(), &sortdb)
            .unwrap()
            .unwrap()
            .stacks_block_height;

    info!("Nakamoto miner started...");
    blind_signer_multinode(
        &signers,
        &[&naka_conf, &naka_conf],
        vec![proposals_submitted, proposals_submitted_2],
    );

    wait_for_first_naka_block_commit(60, &commits_submitted);

    // Mine 2 nakamoto tenures
    for _i in 0..2 {
        next_block_and_mine_commit(
            &mut btc_regtest_controller,
            60,
            &coord_channel,
            &commits_submitted,
        )
        .unwrap();
    }

    let last_tip = NakamotoChainState::get_canonical_block_header(chainstate.db(), &sortdb)
        .unwrap()
        .unwrap();
    info!(
        "Latest tip";
        "height" => last_tip.stacks_block_height,
        "is_nakamoto" => last_tip.anchored_header.as_stacks_nakamoto().is_some(),
    );

    // close the current miner
    coord_channel
        .lock()
        .expect("Mutex poisoned")
        .stop_chains_coordinator();
    run_loop_stopper.store(false, Ordering::SeqCst);
    run_loop_thread.join().unwrap();

    // mine a bitcoin block -- this should include a winning commit from
    //  the miner
    // Build a STX transfer; it is submitted after the restart
    let transfer_tx = make_stacks_transfer(
        &sender_sk,
        0,
        send_fee,
        naka_conf.burnchain.chain_id,
        &recipient,
        send_amt,
    );

    // start it back up

    let _run_loop_thread = thread::spawn(move || run_loop_2.start(None, 0));
    wait_for_runloop(&blocks_processed_2);

    info!(" ================= RESTARTED THE MINER =================");

    let tip = NakamotoChainState::get_canonical_block_header(chainstate.db(), &sortdb)
        .unwrap()
        .unwrap();
    info!(
        "Latest tip";
        "height" => tip.stacks_block_height,
        "is_nakamoto" => tip.anchored_header.as_stacks_nakamoto().is_some(),
    );

    submit_tx(&http_origin, &transfer_tx);

    wait_for(60, || {
        let tip = NakamotoChainState::get_canonical_block_header(chainstate.db(), &sortdb)
            .unwrap()
            .unwrap();
        Ok(tip.stacks_block_height > last_tip.stacks_block_height)
    })
    .unwrap_or_else(|e| {
        let tip = NakamotoChainState::get_canonical_block_header(chainstate.db(), &sortdb)
            .unwrap()
            .unwrap();

        error!(
            "Failed to get a new block after restart";
            "last_tip_height" => last_tip.stacks_block_height,
            "latest_tip" => tip.stacks_block_height,
            "error" => &e,
        );

        panic!("{e}")
    });

    // Mine 2 more nakamoto tenures
    for _i in 0..2 {
        next_block_and_mine_commit(
            &mut btc_regtest_controller,
            60,
            &coord_channel_2,
            &commits_submitted_2,
        )
        .unwrap();
    }

    // load the chain tip, and assert that it is a nakamoto block and that at least 4 blocks have advanced since the pre-3.0 height
    let tip = NakamotoChainState::get_canonical_block_header(chainstate.db(), &sortdb)
        .unwrap()
        .unwrap();
    info!(
        "=== Last tip ===";
        "height" => tip.stacks_block_height,
        "is_nakamoto" => tip.anchored_header.as_stacks_nakamoto().is_some(),
    );

    assert!(tip.anchored_header.as_stacks_nakamoto().is_some());

    // Check that we aren't missing burn blocks
    let bhh = u64::from(tip.burn_header_height);
    // make sure every burn block after the nakamoto transition has a mined
    //  nakamoto block in it.
    let missing = test_observer::get_missing_burn_blocks(220..=bhh).unwrap();

    // This test was flaky because it was sometimes missing burn block 230, which is right at the Nakamoto transition.
    // So it was possible to miss a burn block during the transition.
    // But I don't think it matters at this point, since the Nakamoto transition has already happened on mainnet.
    // So just print a warning instead; don't count it as an error.
    let missing_is_error: Vec<_> = missing
        .into_iter()
        .filter(|i| match i {
            230 => {
                warn!("Missing burn block {i}");
                false
            }
            _ => true,
        })
        .collect();

    if !missing_is_error.is_empty() {
        panic!("Missing the following burn blocks: {missing_is_error:?}");
    }

    check_nakamoto_empty_block_heuristics();

    assert!(tip.stacks_block_height >= block_height_pre_3_0 + 4);
}


obycode · 2025-01-28T19:13:41Z
