Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

trie-db: Fetch the closest merkle value #199

Merged
merged 35 commits into from
Sep 11, 2023
Merged

Conversation

lexnv
Copy link
Contributor

@lexnv lexnv commented Aug 21, 2023

This PR adds support for fetching the merkle value directly from the trie-db.

  • Adds get_closest_merkle_value on the Trie trait
  • TrieDb iterates over the nibbles of the provided key
    • At each iteration DB is queried for the containing node
    • last value encountered on the path from the root to the key is returned
  • During trie traversal, the internal recorder is updated with trie actions
    • This should make other trie queries to the key more efficient
  • Test with HashedValueNoExtThreshold<1> as trie layout and value update at given key

We could build on this PR to have support for recorded Merkel Value in the cache.

Would love to get your thoughts on this @cheme @arkpar 🙏
(For reference @tomaka to ensure we don't break anything wrt to compatibilities between smoldot and substrate)

Part of paritytech/substrate#14550.

// @paritytech/subxt-team

@lexnv lexnv self-assigned this Aug 21, 2023
@lexnv lexnv requested review from cheme and arkpar August 21, 2023 13:37
trie-db/src/lookup.rs Outdated Show resolved Hide resolved
trie-db/src/lookup.rs Outdated Show resolved Hide resolved
trie-db/src/lookup.rs Outdated Show resolved Hide resolved
@@ -644,6 +644,80 @@ fn test_recorder_with_cache_get_hash_internal<T: TrieLayout>() {
}
}

#[test]
fn test_merkle_value() {
Copy link
Member

@arkpar arkpar Aug 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test should also check that the functions works correctly for keys that aren't acuatlly in the tree.
E.g. for a tree with nodes ["AAA"] it should return a closest descendant for the key "AA" but not for "AB". For the tree with nodes ["AAAA", "AABA] closes descendant for the key "A" should be the branch node, etc.

Also there's no need to make changes to the test trie after it is created really.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, you don't need to

Argh, the suspense :D

@arkpar
Copy link
Member

arkpar commented Aug 22, 2023

After reading the spec more carefully, I think the logic here requires some changes. We need to return the descendant of the provided key, not just the lower bound. Basically this means traversing the tree as usual, but for Leaf, Extension and NibbledBranch nodes if you run out of key, just check if the current node key slice extends the remainder.

if slice.starts_with(&partial) {
  return Ok(Some(hash)))
else
  return Ok(None)
} 

@arkpar arkpar requested a review from bkchr August 22, 2023 15:59
Copy link
Contributor

@cheme cheme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lexnv , thanks for the work, I got a few question on corner cases, notably inline node (I am not 100% sure, depends of how the spec is understood).

trie-db/src/lookup.rs Outdated Show resolved Hide resolved
trie-db/test/src/triedb.rs Outdated Show resolved Hide resolved
trie-db/test/src/triedb.rs Outdated Show resolved Hide resolved
trie-db/test/src/triedb.rs Outdated Show resolved Hide resolved
trie-db/test/src/triedb.rs Show resolved Hide resolved
trie-db/src/lookup.rs Outdated Show resolved Hide resolved
trie-db/src/lookup.rs Outdated Show resolved Hide resolved
trie-db/src/lookup.rs Outdated Show resolved Hide resolved
trie-db/src/lookup.rs Outdated Show resolved Hide resolved
trie-db/src/lookup.rs Outdated Show resolved Hide resolved
trie-db/src/lookup.rs Outdated Show resolved Hide resolved
trie-db/src/lookup.rs Outdated Show resolved Hide resolved
trie-db/src/lookup.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@cheme cheme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.
Worth remembering that if at some point this get used in runtime (especially by a host function), not simply rpc, it would be worth fuzzing a bit.

Copy link
Member

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left mainly some nitpicks. However, the return value is wrong. The merkle value is either the node data or the hash, depending on if this is an inline node or not.

trie-db/src/fatdb.rs Outdated Show resolved Hide resolved
trie-db/src/fatdb.rs Outdated Show resolved Hide resolved
trie-db/src/lib.rs Outdated Show resolved Hide resolved
trie-db/src/sectriedb.rs Outdated Show resolved Hide resolved
trie-db/src/lookup.rs Outdated Show resolved Hide resolved
trie-db/src/lookup.rs Outdated Show resolved Hide resolved
Comment on lines 723 to 738
let mut node = if let Some(cache) = &mut cache {
let node = cache.get_or_insert_node(hash, &mut || get_owned_node(depth))?;

self.record(|| TrieAccess::NodeOwned { hash, node_owned: node });

node
} else {
_owned_node = get_owned_node(depth)?;

self.record(|| TrieAccess::EncodedNode {
hash,
encoded_node: node_data.as_slice().into(),
});

&_owned_node
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we always need to create an owned node, which is a little bit shitty. But fine for now I think.

mut self,
nibble_key: NibbleSlice,
full_key: &[u8],
) -> Result<Option<TrieHash<L>>, TrieHash<L>, CError<L>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type is wrong.

enum MerkleValue {
     /// Is returned if `v.len() < MAX_INLINE_VALUE`.
     Node(Vec<u8>),
     Hash(H),
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did propose to use hash for inline node (generally I would prefer if inline node is something users don't need to be aware off). But I did not double check the rpc proposal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe trie crate api could return this enum, but the rpc return hash of the data value for inline node (this way we can avoid some hashing if this api get use by runtime in the future).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the spec is speaking about the merkle value and not the hash. The point being that if the data is smaller, we save bandwidth etc.

(this way we can avoid some hashing if this api get use by runtime in the future)

This doesn't add any extra hashing. If you need hashing for the runtime, we can just do it where we use it. There would be no extra hashing involved as we currently also hash the inline node manually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense! I've added the MerkleValue enum to be explicit about what the trie-db returns.

Then in substrate's RPC layer we could further hash(MerkleValue::Node(..)) such that we return a consistent view of nodes and not care about inline ones. And if this function gets ever called from the runtime we save a Layout:hash() call 🙏

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then in substrate's RPC layer we could further hash(MerkleValue::Node(..)) such that we return a consistent view of nodes and not care about inline ones.

This still goes against the size argument I laid out above. I opened paritytech/json-rpc-interface-spec#88 to be sure that we use the correct value in the RPC.

trie-db/src/lib.rs Outdated Show resolved Hide resolved
trie-db/src/lib.rs Outdated Show resolved Hide resolved
lexnv and others added 3 commits September 11, 2023 12:54
Co-authored-by: Bastian Köcher <[email protected]>
Co-authored-by: Bastian Köcher <[email protected]>
Signed-off-by: Alexandru Vasile <[email protected]>
@lexnv lexnv merged commit 08a2305 into master Sep 11, 2023
4 checks passed
lexnv added a commit to paritytech/polkadot-sdk that referenced this pull request Sep 18, 2023
…1153)

This PR adds support for fetching the closest merkle value of some key.


Builds on top of
- paritytech/trie#199

Migrates paritytech/substrate#14818 to the
monorepo.
Closes: paritytech/substrate#14550
Closes: #1506

// @paritytech/subxt-team

---------

Signed-off-by: Alexandru Vasile <[email protected]>
Co-authored-by: Sebastian Kunert <[email protected]>
bgallois pushed a commit to duniter/duniter-polkadot-sdk that referenced this pull request Mar 25, 2024
…aritytech#1153)

This PR adds support for fetching the closest merkle value of some key.


Builds on top of
- paritytech/trie#199

Migrates paritytech/substrate#14818 to the
monorepo.
Closes: paritytech/substrate#14550
Closes: paritytech#1506

// @paritytech/subxt-team

---------

Signed-off-by: Alexandru Vasile <[email protected]>
Co-authored-by: Sebastian Kunert <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants