[dag] broadcast nodes within window till all validators ack #11751

Merged
merged 5 commits into from
Feb 9, 2024

Conversation

ibalajiarun
Contributor

Description

Test Plan


trunk-io bot commented Jan 24, 2024

⏱️ 5h 45m total CI duration on this PR
| Job | Cumulative Duration | Recent Runs |
| --- | --- | --- |
| rust-unit-tests | 2h 25m | 🟩🟩🟩🟩 |
| windows-build | 2h 24m | 🟩🟩🟩🟩🟩 (+1 more) |
| run-tests-main-branch | 19m | 🟩🟩🟥🟥 |
| check-dynamic-deps | 12m | 🟩🟩🟩🟩🟩 (+1 more) |
| rust-lints | 10m | 🟥🟥🟥 |
| general-lints | 8m | 🟩🟩🟩 |
| semgrep/ci | 3m | 🟩🟩🟩🟩🟩 (+2 more) |
| check | 2m | 🟥🟥🟥🟥 (+1 more) |
| file_change_determinator | 41s | 🟩🟩🟩🟩 |
| file_change_determinator | 30s | 🟩🟩🟩 |
| permission-check | 27s | 🟩🟩🟩🟩🟩 |
| permission-check | 16s | 🟩🟩🟩🟩 (+1 more) |
| permission-check | 15s | 🟩🟩🟩🟩🟩 (+1 more) |
| permission-check | 12s | 🟩🟩🟩🟩 (+1 more) |

🚨 3 jobs on the last run were significantly faster/slower than expected

| Job | Duration | vs 7d avg | Delta |
| --- | --- | --- | --- |
| run-tests-main-branch | 8m | 4m | +79% |
| windows-build | 30m | 19m | +60% |
| rust-unit-tests | 40m | 31m | +32% |


@ibalajiarun ibalajiarun force-pushed the balaji/concurrent-rb-handler branch from 2449040 to 89ba377 Compare January 24, 2024 00:43
@ibalajiarun ibalajiarun force-pushed the balaji/concurrent-rb-handler branch from 89ba377 to d16b70c Compare January 24, 2024 15:43
@ibalajiarun ibalajiarun force-pushed the balaji/concurrent-rb-handler branch from d16b70c to b26c1f4 Compare January 25, 2024 22:00
@@ -93,7 +97,7 @@ impl DagDriver {
payload_client,
reliable_broadcast,
time_service,
- rb_abort_handle: Mutex::new(None),
+ rb_handles: Mutex::new(BoundedVecDeque::new(window_size_config as usize)),
Contributor


Do we have to use the same window size here?

Contributor Author


We could increase it, but let's keep this for now.
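
For readers following the window-size question, here is a minimal sketch of what bounding the number of in-flight broadcasts could look like. It is not the code from this PR: `AbortFlag`, `DropGuard`, `BroadcastWindow`, and `track` are hypothetical names, and `AbortFlag` is a std-only stand-in for a real task abort handle. The assumption illustrated is that once the window is full, tracking a new round evicts the oldest handle and aborts its broadcast.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a task abort handle; only records that
/// `abort` was requested.
#[derive(Clone)]
pub struct AbortFlag(Arc<AtomicBool>);

impl AbortFlag {
    pub fn new() -> Self {
        Self(Arc::new(AtomicBool::new(false)))
    }

    pub fn abort(&self) {
        self.0.store(true, Ordering::SeqCst);
    }

    pub fn is_aborted(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Requests an abort of the associated broadcast when dropped.
pub struct DropGuard(AbortFlag);

impl Drop for DropGuard {
    fn drop(&mut self) {
        self.0.abort();
    }
}

/// Keeps at most `window_size` broadcasts alive (hypothetical type).
pub struct BroadcastWindow {
    handles: VecDeque<(u64, DropGuard)>,
    window_size: usize,
}

impl BroadcastWindow {
    pub fn new(window_size: usize) -> Self {
        assert!(window_size > 0, "window size must be non-zero");
        Self {
            handles: VecDeque::with_capacity(window_size),
            window_size,
        }
    }

    /// Tracks the broadcast for `round`. If the window is already full,
    /// the oldest entry is popped and dropped, which aborts it.
    pub fn track(&mut self, round: u64, flag: AbortFlag) {
        if self.handles.len() == self.window_size {
            self.handles.pop_front();
        }
        self.handles.push_back((round, DropGuard(flag)));
    }

    pub fn len(&self) -> usize {
        self.handles.len()
    }
}
```

With this shape, a larger window keeps more rounds broadcasting concurrently, so the size trades retry coverage for slow validators against the number of live tasks. Dropping the whole window aborts everything still tracked.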

fn drop(&mut self) {
- if let Some((handle, _)) = self.rb_abort_handle.lock().as_ref() {
- handle.abort()
+ struct BoundedVecDeque<T> {
Contributor


nit: this could probably move to a util and also be used for the sliding window inside MetadataBackendAdapter. Small unit tests would be good too.
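
As a rough illustration of the utility suggested above, a standalone bounded deque might look like the sketch below. The body of the PR's `BoundedVecDeque` is not shown in this thread, so the fields, the non-zero capacity check, and the choice to return the evicted element from `push_back` are all assumptions.

```rust
use std::collections::VecDeque;

/// Sketch of a fixed-capacity FIFO window (assumed design, not the
/// PR's actual definition).
pub struct BoundedVecDeque<T> {
    inner: VecDeque<T>,
    capacity: usize,
}

impl<T> BoundedVecDeque<T> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            inner: VecDeque::with_capacity(capacity),
            capacity,
        }
    }

    /// Appends `item`. If the deque is already at capacity, the oldest
    /// element is removed first and returned to the caller.
    pub fn push_back(&mut self, item: T) -> Option<T> {
        let evicted = if self.inner.len() == self.capacity {
            self.inner.pop_front()
        } else {
            None
        };
        self.inner.push_back(item);
        evicted
    }

    pub fn len(&self) -> usize {
        self.inner.len()
    }

    pub fn is_empty(&self) -> bool {
        self.inner.is_empty()
    }

    /// Iterates from the oldest element to the newest.
    pub fn iter(&self) -> impl Iterator<Item = &T> {
        self.inner.iter()
    }
}
```

Returning the evicted element lets the caller decide what to do with it, for example aborting an evicted broadcast handle. It also makes the unit tests requested above straightforward: push `capacity + 1` items and assert on the returned value and the remaining order.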

@ibalajiarun ibalajiarun force-pushed the balaji/concurrent-rb-handler branch from b26c1f4 to 373112d Compare February 8, 2024 23:31
@ibalajiarun ibalajiarun force-pushed the balaji/concurrent-rb-handler branch from 373112d to 3f6f138 Compare February 9, 2024 03:12
Base automatically changed from balaji/concurrent-rb-handler to main February 9, 2024 04:07
@ibalajiarun ibalajiarun enabled auto-merge (squash) February 9, 2024 05:54

Contributor

github-actions bot commented Feb 9, 2024

✅ Forge suite compat success on aptos-node-v1.8.3 ==> de78ba35c81508f7e778630eccfd0af8d50bd711

Compatibility test results for aptos-node-v1.8.3 ==> de78ba35c81508f7e778630eccfd0af8d50bd711 (PR)
1. Check liveness of validators at old version: aptos-node-v1.8.3
compatibility::simple-validator-upgrade::liveness-check : committed: 4727 txn/s, latency: 6814 ms, (p50: 6500 ms, p90: 10400 ms, p99: 14500 ms), latency samples: 179660
2. Upgrading first Validator to new version: de78ba35c81508f7e778630eccfd0af8d50bd711
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1805 txn/s, latency: 15641 ms, (p50: 19300 ms, p90: 22000 ms, p99: 22500 ms), latency samples: 92080
3. Upgrading rest of first batch to new version: de78ba35c81508f7e778630eccfd0af8d50bd711
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 1875 txn/s, latency: 15487 ms, (p50: 19100 ms, p90: 21900 ms, p99: 22200 ms), latency samples: 91920
4. upgrading second batch to new version: de78ba35c81508f7e778630eccfd0af8d50bd711
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 3189 txn/s, latency: 9890 ms, (p50: 9800 ms, p90: 13800 ms, p99: 14700 ms), latency samples: 140340
5. check swarm health
Compatibility test for aptos-node-v1.8.3 ==> de78ba35c81508f7e778630eccfd0af8d50bd711 passed
Test Ok

Contributor

github-actions bot commented Feb 9, 2024

✅ Forge suite realistic_env_max_load success on de78ba35c81508f7e778630eccfd0af8d50bd711

two traffics test: inner traffic : committed: 7444 txn/s, latency: 5139 ms, (p50: 4800 ms, p90: 6400 ms, p99: 12200 ms), latency samples: 3215980
two traffics test : committed: 100 txn/s, latency: 2274 ms, (p50: 2100 ms, p90: 2500 ms, p99: 8600 ms), latency samples: 1840
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.210, avg: 0.197", "QsPosToProposal: max: 0.183, avg: 0.163", "ConsensusProposalToOrdered: max: 0.575, avg: 0.532", "ConsensusOrderedToCommit: max: 0.502, avg: 0.458", "ConsensusProposalToCommit: max: 1.026, avg: 0.989"]
Max round gap was 1 [limit 4] at version 1467638. Max no progress secs was 4.769476 [limit 15] at version 1467638.
Test Ok

@ibalajiarun ibalajiarun merged commit 2b3cce4 into main Feb 9, 2024
55 checks passed
@ibalajiarun ibalajiarun deleted the balaji/extended-rb branch February 9, 2024 06:41

3 participants