perf: remove TrieUpdates::removed_nodes and StorageTrieUpdates::removed_nodes (attempt 2) #13929

kien-rise · 2025-01-22T17:08:19Z

Replaces #13872. This PR is only slightly worse than #13872, but the code changes are minimal.

Checklist

E2E Benchmarks
Add benches/*
Improve PR description

Motivation

Under high load (TPS > 50,000), MemoryOverlayStateProviderRef::trie_state takes a considerable amount of time (over 200 ms). One contributing factor is the TrieUpdates::extend_ref function. Changing the struct definition of TrieUpdates could make extend_ref faster.

Solution

First, this PR changes the struct definitions of TrieUpdates and StorageTrieUpdates:

pub struct TrieUpdates {
-    pub account_nodes: HashMap<Nibbles, BranchNodeCompact>,
-    pub removed_nodes: HashSet<Nibbles>,
+    pub changed_nodes: HashMap<Nibbles, Option<BranchNodeCompact>>,
     pub storage_tries: B256HashMap<StorageTrieUpdates>,
 }

pub struct StorageTrieUpdates {
     pub is_deleted: bool,
-    pub storage_nodes: HashMap<Nibbles, BranchNodeCompact>,
-    pub removed_nodes: HashSet<Nibbles>,
+    pub changed_nodes: HashMap<Nibbles, Option<BranchNodeCompact>>,
 }
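
As a rough illustration of the layout change, the sketch below folds the old two-collection form into the single-map form. It assumes that a `None` value in `changed_nodes` marks a removed node; the diff does not state this explicitly. `Path`, `Node`, `OldUpdates`, `NewUpdates` and `merge` are hypothetical stand-ins, not types or functions from this PR.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins for the real `Nibbles` and `BranchNodeCompact` types.
type Path = Vec<u8>;
type Node = u32;

/// Old layout: updated and removed nodes live in two separate collections.
struct OldUpdates {
    account_nodes: HashMap<Path, Node>,
    removed_nodes: HashSet<Path>,
}

/// New layout: a single map. Assumption: `None` marks a removed node.
struct NewUpdates {
    changed_nodes: HashMap<Path, Option<Node>>,
}

/// Fold the old two-collection form into the single-map form.
fn merge(old: OldUpdates) -> NewUpdates {
    // Start with the removals, each recorded as `None`.
    let mut changed_nodes: HashMap<Path, Option<Node>> =
        old.removed_nodes.into_iter().map(|path| (path, None)).collect();
    // Insert updates last so that an updated path wins over a removal
    // of the same path (an assumption about precedence).
    changed_nodes.extend(old.account_nodes.into_iter().map(|(path, node)| (path, Some(node))));
    NewUpdates { changed_nodes }
}
```

With this shape, a lookup tells "updated" (`Some(Some(node))`), "removed" (`Some(None)`) and "untouched" (`None`) apart with a single hash probe.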

Next, this PR replaces the following steps in fn TrieUpdates::extend_ref:

self.account_nodes.retain(|nibbles, _| !other.removed_nodes.contains(nibbles));
self.account_nodes.extend(exclude_empty_from_pair(other.account_nodes.iter().map(|(k, v)| (k.clone(), v.clone()))));
self.removed_nodes.extend(exclude_empty(other.removed_nodes.iter().cloned()));

with

self.changed_nodes.extend(exclude_empty_from_pair(other.changed_nodes.iter().map(|(k, v)| (k.clone(), v.clone()))));

A similar change is applied to StorageTrieUpdates::extend_ref.
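
To make the saving concrete, here is a minimal sketch of both merge strategies on simplified types. It assumes that `exclude_empty` and `exclude_empty_from_pair` simply drop entries whose path is empty, and that `None` marks a removed node. `Path`, `Node`, `OldUpdates` and `NewUpdates` are hypothetical stand-ins rather than the real reth types.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins for `Nibbles` and `BranchNodeCompact`.
type Path = Vec<u8>;
type Node = u32;

#[derive(Default)]
struct OldUpdates {
    account_nodes: HashMap<Path, Node>,
    removed_nodes: HashSet<Path>,
}

impl OldUpdates {
    /// Old merge: a `retain` pass over all of `self`, then two `extend` calls.
    fn extend_ref(&mut self, other: &Self) {
        self.account_nodes.retain(|path, _| !other.removed_nodes.contains(path));
        self.account_nodes.extend(
            other.account_nodes.iter().filter(|(p, _)| !p.is_empty()).map(|(p, n)| (p.clone(), *n)),
        );
        self.removed_nodes
            .extend(other.removed_nodes.iter().filter(|p| !p.is_empty()).cloned());
    }
}

#[derive(Default)]
struct NewUpdates {
    changed_nodes: HashMap<Path, Option<Node>>,
}

impl NewUpdates {
    /// New merge: a single `extend`; an entry from `other` overwrites the
    /// existing entry for the same path, whether it is an update or a removal.
    fn extend_ref(&mut self, other: &Self) {
        self.changed_nodes.extend(
            other.changed_nodes.iter().filter(|(p, _)| !p.is_empty()).map(|(p, n)| (p.clone(), *n)),
        );
    }
}
```

In this sketch the old version walks every entry of `self.account_nodes` in the `retain` call, so its cost grows with the size of the accumulated state, while the new version only does work proportional to the size of `other`.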

Criterion Benchmarks

Note: the current criterion benchmark does not produce very stable results.

[Before]
ERC20                   time:   [503.56 ms 503.85 ms 504.13 ms]
Raw Transfer            time:   [205.55 ms 205.87 ms 206.19 ms]
Uniswap                 time:   [55.124 ms 55.255 ms 55.393 ms]

[After]
ERC20                   time:   [476.01 ms 476.36 ms 476.72 ms]
Raw Transfer            time:   [112.22 ms 112.33 ms 112.49 ms]
Uniswap                 time:   [31.368 ms 31.551 ms 31.738 ms]

E2E Benchmarks

(unit: μs)	Before	After	Ratio
ERC20	585111076	523309918	0.8943770499
Raw Transfer	323457659	298599422	0.923148405
Uniswap	1254878698	1151877682	0.9179195438

Before and After

Before

erc20-from16-1914-super-low-dependency.zip (30600/150000)
30,601.00 tps, 1,056,914,565.83 gps, no chain lag, 585111076/609686114 μs
raw-transfer-from05-1723-super-low-dependency.zip (54600/280000)
54,594.73 tps, 1,146,512,812.78 gps, no chain lag, 323457659/332195794 μs
uniswap-from14-1848-super-low-dependency.zip (6330/25320)
6,331.00 tps, 882,451,082.10 gps, no chain lag, 1254878698/1270738569 μs

After

erc20-from16-1914-super-low-dependency.zip (30600/150000)
30,601.00 tps, 1,056,921,850.84 gps, no chain lag, 523309918/561875948 μs
raw-transfer-from05-1723-super-low-dependency.zip (54600/280000)
54,601.00 tps, 1,146,644,512.25 gps, no chain lag, 298599422/315966032 μs
uniswap-from14-1848-super-low-dependency.zip (6330/25320)
6,331.00 tps, 882,450,529.38 gps, no chain lag, 1151877682/1180946313 μs

Other benchmarks

"785bc168-9976731c-erc20-30600"
{"tps": 30601.0, "gps": 1056921850.8367347, "is_chain_lagged": false, "chain_lag_distance": 0}
"785bc168-9976731c-erc20-31000"
{"tps": 31001.0, "gps": 1070736094.1406412, "is_chain_lagged": false, "chain_lag_distance": 0}
"785bc168-9976731c-erc20-32000"
{"tps": 32001.0, "gps": 1105272921.4311633, "is_chain_lagged": false, "chain_lag_distance": 1}
"785bc168-9976731c-erc20-33000"
{"tps": 33001.0, "gps": 1139812889.5115511, "is_chain_lagged": false, "chain_lag_distance": 1}
"785bc168-9976731c-erc20-34000"
{"tps": 34001.0, "gps": 1174346187.1292517, "is_chain_lagged": true, "chain_lag_distance": 4}
"785bc168-9976731c-erc20-36000"
{"tps": 36001.0, "gps": 1243424508.7788463, "is_chain_lagged": true, "chain_lag_distance": 20}
"785bc168-9976731c-erc20-40000"
{"tps": 40001.0, "gps": 1381586516.224, "is_chain_lagged": true, "chain_lag_distance": 61}
"785bc168-9976731c-raw-transfer-54600"
{"tps": 54601.0, "gps": 1146644512.2513661, "is_chain_lagged": false, "chain_lag_distance": 1}
"785bc168-9976731c-raw-transfer-55000"
{"tps": 55001.0, "gps": 1155044519.4926472, "is_chain_lagged": false, "chain_lag_distance": 0}
"785bc168-9976731c-raw-transfer-56000"
{"tps": 56001.0, "gps": 1176044515.2560747, "is_chain_lagged": false, "chain_lag_distance": 1}
"785bc168-9976731c-raw-transfer-57000"
{"tps": 57001.0, "gps": 1197044511.5361216, "is_chain_lagged": false, "chain_lag_distance": 0}
"785bc168-9976731c-raw-transfer-58000"
{"tps": 58001.0, "gps": 1218044525.3953488, "is_chain_lagged": false, "chain_lag_distance": 1}
"785bc168-9976731c-raw-transfer-60000"
{"tps": 60001.0, "gps": 1260044514.0, "is_chain_lagged": true, "chain_lag_distance": 5}
"785bc168-9976731c-raw-transfer-64000"
{"tps": 64001.0, "gps": 1344044510.4796574, "is_chain_lagged": true, "chain_lag_distance": 26}
"785bc168-9976731c-uniswap-6330"
{"tps": 6331.0, "gps": 882450529.3786408, "is_chain_lagged": false, "chain_lag_distance": 0}
"785bc168-9976731c-uniswap-6400"
{"tps": 6401.0, "gps": 892212483.2733675, "is_chain_lagged": false, "chain_lag_distance": 0}
"785bc168-9976731c-uniswap-6480"
{"tps": 6481.0, "gps": 903381443.1144955, "is_chain_lagged": false, "chain_lag_distance": 1}
"785bc168-9976731c-uniswap-6500"
{"tps": 6501.0, "gps": 906146040.2489707, "is_chain_lagged": false, "chain_lag_distance": 0}
"785bc168-9976731c-uniswap-6540"
{"tps": 6541.0, "gps": 911705913.7027911, "is_chain_lagged": true, "chain_lag_distance": 2}
"785bc168-9976731c-uniswap-6580"
{"tps": 6581.0, "gps": 917291024.3644142, "is_chain_lagged": false, "chain_lag_distance": 1}

mattsse

I think I almost understood this, but I'm unequipped to review this in detail.

ptal @rkrasiuk

imo if feasible we should try to get this in, because it makes sense why this is significantly more performant

mattsse · 2025-01-24T15:05:54Z

crates/engine/tree/src/tree/trie_updates.rs

-    account_nodes: HashMap<Nibbles, EntryDiff<Option<BranchNodeCompact>>>,
-    removed_nodes: HashMap<Nibbles, EntryDiff<bool>>,


this refactor isn't straightforward to me, unclear how account/removed nodes translate to task/regular/database

I'd appreciate a few additional docs

mattsse · 2025-01-24T15:06:43Z

crates/engine/tree/src/tree/trie_updates.rs

-        .storage_nodes
-        .keys()
-        .chain(regular.storage_nodes.keys())
+    for key in Iterator::chain(task.changed_nodes.keys(), regular.changed_nodes.keys())


why do we need the fully qualified syntax here

mattsse · 2025-01-24T15:08:43Z

crates/engine/tree/src/tree/trie_updates.rs

+    task: &Option<Option<BranchNodeCompact>>,
+    regular: &Option<Option<BranchNodeCompact>>,
+    database: &Option<BranchNodeCompact>,


I see that task/regular/database is most likely coming from here, so perhaps @rkrasiuk needs to fill in the blanks

mattsse · 2025-01-24T15:09:23Z

crates/trie/common/src/updates.rs

-    /// Collection of removed intermediate account nodes indexed by full path.
-    #[cfg_attr(any(test, feature = "serde"), serde(with = "serde_nibbles_set"))]
-    pub removed_nodes: HashSet<Nibbles>,
+    pub changed_nodes: HashMap<Nibbles, Option<BranchNodeCompact>>,


also not immediately clear how removed translated to changed here

kien-rise · 2025-01-24T16:16:50Z

I am gonna convert this PR to Draft (to prevent any accidental merge) because #13976 is (potentially) a simpler version.

kien-rise requested review from rkrasiuk, Rjected, shekhirin, rakita, joshieDo, mattsse, fgimenez and onbjerg as code owners January 22, 2025 17:08

kien-rise mentioned this pull request Jan 22, 2025

[WIP] perf: remove TrieUpdates::removed_nodes #13872

Closed

1 task

kien-rise marked this pull request as draft January 22, 2025 17:09

emhane added the C-perf A change motivated by improving speed, memory usage or disk footprint label Jan 22, 2025

kien-rise force-pushed the 0007-removed-nodes branch from 9f5db66 to d0bee8b Compare January 23, 2025 19:25

kien-rise added 3 commits January 24, 2025 02:43

feat: add benches/trie_state.rs

5ff017f

feat: remove TrieUpdates::removed_nodes and StorageTrieUpdates::remov…

4beebd0

…ed_nodes

feat: update benches/trie_state.rs

0d0cc88

kien-rise force-pushed the 0007-removed-nodes branch from d0bee8b to 0d0cc88 Compare January 23, 2025 19:45

kien-rise marked this pull request as ready for review January 23, 2025 20:03

kien-rise requested a review from gakonst as a code owner January 23, 2025 20:03

kien-rise changed the title ~~[WIP] perf: remove TrieUpdates::removed_nodes and StorageTrieUpdates::removed_nodes (attempt 2)~~ perf: remove TrieUpdates::removed_nodes and StorageTrieUpdates::removed_nodes (attempt 2) Jan 23, 2025

mattsse requested changes Jan 24, 2025

kien-rise mentioned this pull request Jan 24, 2025

[WIP] perf: only perform insert_storage_updates if storage_updates is not empty #13976

Draft

2 tasks

kien-rise marked this pull request as draft January 24, 2025 16:17
