Miner should resume active tenure on restarts #5754

Open
kantai opened this issue Jan 27, 2025 · 1 comment

kantai commented Jan 27, 2025

Miners should resume their active tenures across a restart. Once #5752 merges, the relayer will spawn a new miner thread on restart, but that thread will be somewhat confused: it will try to continue the existing tenure, but will fail because it expects to create a tenure change payload when it produces the first block in the thread.

There are certainly some relatively straightforward patches to the checks in the miner thread that would make this possible. However, there are already a fair number of "cases" that would each need to be handled with different checks or patches:

  1. Resuming a normal BlockFound tenure.
  2. Resuming after a tenure extend into a new bitcoin block.
  3. Resuming after a timeout tenure extend.
  4. Resuming after a tenure extend over a failed reorg?

In my opinion, the least kludgy approach would be to dump some transient state of the active mining thread to a file on exit (somewhat like VRF registrations):

#[derive(Serialize, Deserialize)]
struct BlockMinerSerializableState {
    last_block_mined: Option<NakamotoBlock>,
    mined_blocks: u64,
    burn_election_block: BlockSnapshot,
    burn_block: BlockSnapshot,
    parent_tenure_id: StacksBlockId,
    reason: MinerReason,
    tenure_change_time: Instant,
    burn_tip_at_start: ConsensusHash,
}
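
One wrinkle with the struct as sketched: `std::time::Instant` is an opaque monotonic clock reading, serde provides no `Serialize`/`Deserialize` implementation for it, and a raw value would not be meaningful in a new process anyway. One possible workaround, shown here as a rough sketch rather than anything taken from the codebase, is to store the tenure change time as Unix milliseconds and rebuild an `Instant` on load. The helper names below are hypothetical.

```rust
use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};

/// Convert an in-memory `Instant` into Unix milliseconds suitable for writing
/// to disk. `now_instant` and `now_wall` are the two clocks read at save time.
fn instant_to_unix_millis(t: Instant, now_instant: Instant, now_wall: SystemTime) -> u64 {
    // How long ago `t` happened, measured on the monotonic clock.
    let elapsed = now_instant.saturating_duration_since(t);
    // Project that age onto the wall clock.
    let wall = now_wall.checked_sub(elapsed).unwrap_or(UNIX_EPOCH);
    wall.duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as u64)
        .unwrap_or(0)
}

/// Rebuild an `Instant` in the restarted process from the saved milliseconds.
/// Time spent while the node was down counts toward the elapsed time.
fn unix_millis_to_instant(saved_ms: u64, now_instant: Instant, now_wall: SystemTime) -> Instant {
    let saved_wall = UNIX_EPOCH + Duration::from_millis(saved_ms);
    // If the wall clock went backwards, treat the elapsed time as zero.
    let elapsed = now_wall.duration_since(saved_wall).unwrap_or(Duration::ZERO);
    now_instant.checked_sub(elapsed).unwrap_or(now_instant)
}
```

With this approach the serialized field would be a `u64` rather than an `Instant`. The conversion is only as accurate as the wall clock, which is probably acceptable for timeout-style checks but is an assumption worth verifying.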

The relayer could check burn_tip_at_start against the current burnchain tip and, if they match, start the mining thread with that information. I think we may also need to make sure that the relayer doesn't try to start a new mining thread when it figures out it should be mining at that tip (i.e., don't try to start a new BlockFound thread: just let the resumed thread handle it).
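
The save, load, and compare flow described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the node's actual relayer code: `SavedMinerState` is a hypothetical, trimmed-down stand-in for `BlockMinerSerializableState`, the burn tip is held as a hex string, and the on-disk format is plain text only so the example needs no external crates (the real struct would presumably go through serde). `StartupAction`, `save_state`, `load_state`, and `decide_startup_action` are illustrative names.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical, trimmed-down stand-in for `BlockMinerSerializableState`.
#[derive(Debug, Clone, PartialEq)]
struct SavedMinerState {
    mined_blocks: u64,
    burn_tip_at_start: String,
}

/// What the relayer does on startup (illustrative).
#[derive(Debug, PartialEq)]
enum StartupAction {
    /// The burnchain tip is unchanged: hand the saved state to a resumed miner thread.
    ResumeTenure(SavedMinerState),
    /// No usable saved state: fall back to the normal tenure-start logic.
    StartFresh,
}

/// Write the state to a temporary file, then rename it into place, so an
/// interrupted write is less likely to leave a truncated state file behind.
fn save_state(path: &Path, state: &SavedMinerState) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    fs::write(&tmp, format!("{}\n{}\n", state.mined_blocks, state.burn_tip_at_start))?;
    fs::rename(&tmp, path)
}

/// Load the state, treating a missing or malformed file as "nothing to resume".
fn load_state(path: &Path) -> Option<SavedMinerState> {
    let contents = fs::read_to_string(path).ok()?;
    let mut lines = contents.lines();
    let mined_blocks = lines.next()?.parse::<u64>().ok()?;
    let burn_tip_at_start = lines.next()?.to_string();
    Some(SavedMinerState { mined_blocks, burn_tip_at_start })
}

/// Resume only if the saved burn tip matches the current burnchain tip.
fn decide_startup_action(saved: Option<SavedMinerState>, current_burn_tip: &str) -> StartupAction {
    match saved {
        Some(state) if state.burn_tip_at_start == current_burn_tip => {
            StartupAction::ResumeTenure(state)
        }
        _ => StartupAction::StartFresh,
    }
}
```

Returning a single enum value makes the two paths mutually exclusive, which is one way to address the concern about the relayer also starting a new BlockFound thread at the same tip.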

Here's a test for the BlockFound case:

#[test]
#[ignore]
/// Test a scenario in which a miner is restarted in the middle of a tenure
/// that it won. On restart, the miner should resume mining that tenure.
fn restarting_miner_mid_tenure() {
    if env::var("BITCOIND_TEST") != Ok("1".into()) {
        return;
    }

    let (mut naka_conf, _miner_account) = naka_neon_integration_conf(None);
    naka_conf.miner.activated_vrf_key_path =
        Some(format!("{}/vrf_key", naka_conf.node.working_dir));
    naka_conf.miner.wait_on_interim_blocks = Duration::from_secs(5);
    let sender_sk = Secp256k1PrivateKey::new();
    // setup sender + recipient for a test stx transfer
    let sender_addr = tests::to_addr(&sender_sk);
    let send_amt = 1000;
    let send_fee = 180;
    naka_conf.add_initial_balance(
        PrincipalData::from(sender_addr).to_string(),
        send_amt * 2 + send_fee,
    );
    let sender_signer_sk = Secp256k1PrivateKey::new();
    let sender_signer_addr = tests::to_addr(&sender_signer_sk);
    let mut signers = TestSigners::new(vec![sender_signer_sk]);
    naka_conf.add_initial_balance(PrincipalData::from(sender_signer_addr).to_string(), 100000);
    let stacker_sk = setup_stacker(&mut naka_conf);
    let http_origin = naka_conf.node.data_url.clone();
    let recipient = PrincipalData::from(StacksAddress::burn_address(false));

    test_observer::spawn();
    test_observer::register_any(&mut naka_conf);

    let mut btcd_controller = BitcoinCoreController::new(naka_conf.clone());
    btcd_controller
        .start_bitcoind()
        .expect("Failed starting bitcoind");
    let mut btc_regtest_controller = BitcoinRegtestController::new(naka_conf.clone(), None);
    btc_regtest_controller.bootstrap_chain(201);

    let mut run_loop = boot_nakamoto::BootRunLoop::new(naka_conf.clone()).unwrap();
    let run_loop_stopper = run_loop.get_termination_switch();
    let Counters {
        blocks_processed,
        naka_submitted_commits: commits_submitted,
        naka_proposed_blocks: proposals_submitted,
        ..
    } = run_loop.counters();
    let coord_channel = run_loop.coordinator_channels();

    let mut run_loop_2 = boot_nakamoto::BootRunLoop::new(naka_conf.clone()).unwrap();
    let _run_loop_2_stopper = run_loop_2.get_termination_switch();
    let Counters {
        blocks_processed: blocks_processed_2,
        naka_submitted_commits: commits_submitted_2,
        naka_proposed_blocks: proposals_submitted_2,
        ..
    } = run_loop_2.counters();
    let coord_channel_2 = run_loop_2.coordinator_channels();

    let run_loop_thread = thread::spawn(move || run_loop.start(None, 0));
    wait_for_runloop(&blocks_processed);
    boot_to_epoch_3(
        &naka_conf,
        &blocks_processed,
        &[stacker_sk],
        &[sender_signer_sk],
        &mut Some(&mut signers),
        &mut btc_regtest_controller,
    );

    info!("Bootstrapped to Epoch-3.0 boundary, starting nakamoto miner");

    let burnchain = naka_conf.get_burnchain();
    let sortdb = burnchain.open_sortition_db(true).unwrap();
    let (chainstate, _) = StacksChainState::open(
        naka_conf.is_mainnet(),
        naka_conf.burnchain.chain_id,
        &naka_conf.get_chainstate_path_str(),
        None,
    )
    .unwrap();

    let block_height_pre_3_0 =
        NakamotoChainState::get_canonical_block_header(chainstate.db(), &sortdb)
            .unwrap()
            .unwrap()
            .stacks_block_height;

    info!("Nakamoto miner started...");
    blind_signer_multinode(
        &signers,
        &[&naka_conf, &naka_conf],
        vec![proposals_submitted, proposals_submitted_2],
    );

    wait_for_first_naka_block_commit(60, &commits_submitted);

    // Mine 2 nakamoto tenures
    for _i in 0..2 {
        next_block_and_mine_commit(
            &mut btc_regtest_controller,
            60,
            &coord_channel,
            &commits_submitted,
        )
        .unwrap();
    }

    let last_tip = NakamotoChainState::get_canonical_block_header(chainstate.db(), &sortdb)
        .unwrap()
        .unwrap();
    info!(
        "Latest tip";
        "height" => last_tip.stacks_block_height,
        "is_nakamoto" => last_tip.anchored_header.as_stacks_nakamoto().is_some(),
    );

    // close the current miner
    coord_channel
        .lock()
        .expect("Mutex poisoned")
        .stop_chains_coordinator();
    run_loop_stopper.store(false, Ordering::SeqCst);
    run_loop_thread.join().unwrap();

    // Build a STX transfer now; it is submitted after the miner restarts.
    // No bitcoin block is mined in between, so the new block must come from
    // the tenure that was active before the restart.
    let transfer_tx = make_stacks_transfer(
        &sender_sk,
        0,
        send_fee,
        naka_conf.burnchain.chain_id,
        &recipient,
        send_amt,
    );

    // start it back up

    let _run_loop_thread = thread::spawn(move || run_loop_2.start(None, 0));
    wait_for_runloop(&blocks_processed_2);

    info!(" ================= RESTARTED THE MINER =================");

    let tip = NakamotoChainState::get_canonical_block_header(chainstate.db(), &sortdb)
        .unwrap()
        .unwrap();
    info!(
        "Latest tip";
        "height" => tip.stacks_block_height,
        "is_nakamoto" => tip.anchored_header.as_stacks_nakamoto().is_some(),
    );

    submit_tx(&http_origin, &transfer_tx);

    wait_for(60, || {
        let tip = NakamotoChainState::get_canonical_block_header(chainstate.db(), &sortdb)
            .unwrap()
            .unwrap();
        Ok(tip.stacks_block_height > last_tip.stacks_block_height)
    })
    .unwrap_or_else(|e| {
        let tip = NakamotoChainState::get_canonical_block_header(chainstate.db(), &sortdb)
            .unwrap()
            .unwrap();

        error!(
            "Failed to get a new block after restart";
            "last_tip_height" => last_tip.stacks_block_height,
            "latest_tip" => tip.stacks_block_height,
            "error" => &e,
        );

        panic!("{e}")
    });

    // Mine 2 more nakamoto tenures
    for _i in 0..2 {
        next_block_and_mine_commit(
            &mut btc_regtest_controller,
            60,
            &coord_channel_2,
            &commits_submitted_2,
        )
        .unwrap();
    }

    // load the chain tip, and assert that it is a nakamoto block and at least 4 blocks have advanced in epoch 3
    let tip = NakamotoChainState::get_canonical_block_header(chainstate.db(), &sortdb)
        .unwrap()
        .unwrap();
    info!(
        "=== Last tip ===";
        "height" => tip.stacks_block_height,
        "is_nakamoto" => tip.anchored_header.as_stacks_nakamoto().is_some(),
    );

    assert!(tip.anchored_header.as_stacks_nakamoto().is_some());

    // Check that we aren't missing burn blocks
    let bhh = u64::from(tip.burn_header_height);
    // make sure every burn block after the nakamoto transition has a mined
    //  nakamoto block in it.
    let missing = test_observer::get_missing_burn_blocks(220..=bhh).unwrap();

    // This test was flaky because it sometimes missed burn block 230, which is right at the Nakamoto transition,
    // so it was possible to miss a burn block during the transition.
    // I don't think it matters at this point, since the Nakamoto transition has already happened on mainnet,
    // so just print a warning instead of counting it as an error.
    let missing_is_error: Vec<_> = missing
        .into_iter()
        .filter(|i| match i {
            230 => {
                warn!("Missing burn block {i}");
                false
            }
            _ => true,
        })
        .collect();

    if !missing_is_error.is_empty() {
        panic!("Missing the following burn blocks: {missing_is_error:?}");
    }

    check_nakamoto_empty_block_heuristics();

    assert!(tip.stacks_block_height >= block_height_pre_3_0 + 4);
}

obycode commented Jan 28, 2025

See also #5526
