Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: enable patient block commits #5835

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from
Open

Conversation

kantai
Copy link
Member

@kantai kantai commented Feb 13, 2025

This PR does a number of things to achieve #5821:

  1. It fixes buggy logic in the way that the timeout previously worked.
  2. It adds logic to immediately issue a commit in the event of an empty sortition (this may be too aggressive, but its basically just the behavior today for this case), so that miners don't wait for a new tenure to start when a new tenure won't necessarily start.
  3. It refactors some of the logic around block commit issuance (fixing one or two potential bugs there).
  4. It removes a slow memory leak (last_commits)
  5. It sets the default timeout to 40 seconds (more on this choice below)

Tests

This PR doesn't add new integration tests (it includes new unit tests for the burn block timeout, though). Instead, it reconfigures and adds assertions to two nakamoto integrations tests to make sure that:

  1. In reorg-free block production settings, if the timeout is set high enough, no RBFs are necessary (i.e., 1 commit is sent per tenure)
  2. In an empty sortition setting, if the timeout is set very high, miners are still able to submit block commits after an empty sortition (when a tenure extension occurs and not a new tenure).

Setting the default timeout

I collected log data from a stacks-node (running a mock miner) to see how much time there was between processing a sortition in the relayer thread (which the miner uses to issue commits) and the first block from that tenure being processed by the coordinator.

The log lines I was interested in were:

[testnet/stacks-node/src/nakamoto_node/relayer.rs:713] Process sortition ...
[stackslib/src/chainstate/nakamoto/mod.rs:2130] Advanced to new tip! ...

After downloading a JSON dump of these from a grafana/loki collector:

def fetch_delta_sortition_advancing(json_file):
    with open(json_file, 'r') as file:
        data = json.load(file)
    sortitions_map = {}
    for entry in data:
        if entry["fields"]["message"].startswith("Relayer: Process sortition"):
            ts = int(entry["timestamp"]) / 10**9
            ch = entry["fields"]["sortition_ch"]
            sortitions_map[ch] = ts
    arrivals_map = {}
    for entry in data:
        if entry["fields"]["message"].startswith("Advanced to new tip"):
            ts = int(entry["timestamp"]) / 10**9
            ch = entry["fields"]["message"].split(" ")[4].split("/")[0]
            if ch not in arrivals_map:
                arrivals_map[ch] = ts
            else:
                arrivals_map[ch] = min(arrivals_map[ch], ts)
    output = []
    for ch_k in arrivals_map:
        if ch_k in sortitions_map:
            output.append(arrivals_map[ch_k] - sortitions_map[ch_k])
        else:
            print(f"Did not find sortition processing info for {ch_k}")
    return np.sort(np.array(output))

def run_delta_times():
    deltas = fetch_delta_sortition_advancing(JSON_FILE)
    print(f"Fetched {len(deltas)} timestamps")
    pctiles = [50, 75, 90, 95, 99, 99.9]
    results =  np.percentile(deltas, pctiles)
    print(pctiles)
    print(results)

The resulting data:

Fetched 217 timestamps
[50, 75, 90, 95, 99, 99.9]
[ 38.6575458   65.70518589  80.60954256  86.95423799 104.41594109
 182.08486128]

So, the median time is 39 seconds, which informs the 40 second setting in this PR.

* fix block commit wait logic (so that it actually waits the configured time)
* cleanup/refactor some of the block commit logic
* add test assertion for block commit waits (assert that only 1 commit is submitted when the wait time is high)
* add test assertions for immediate commits on empty sortition
* light test cleanup/refactors
@kantai kantai requested review from a team as code owners February 13, 2025 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant