From a7cdc1c78fa9bc2f02bc4be269fa772cb6ff2750 Mon Sep 17 00:00:00 2001 From: Damir Vodenicarevic Date: Tue, 21 Nov 2023 11:24:18 +0100 Subject: [PATCH 1/2] 1st_pass --- documentation/specification.adoc | 157 +++++++++++++++---------------- 1 file changed, 78 insertions(+), 79 deletions(-) diff --git a/documentation/specification.adoc b/documentation/specification.adoc index a39ed51..1f666ab 100644 --- a/documentation/specification.adoc +++ b/documentation/specification.adoc @@ -10,31 +10,31 @@ === Scope of the document -This specification is the entrypoint for a developer to contribute to the project and it provides all the guidelines to organize the work efficiently. This document should be enough to work as a team with all the necessary explanations and overview of the whole project. This document will evolve over time. +This specification is the entrypoint for developers to contribute to the project and provides all the guidelines to organize the work efficiently. This document should be enough to work as a team with all the necessary explanations and overview of the whole project. This document will evolve over time. === Target of the document -This document should be a starting point and a reference for all the developers that contribute and a source of documentation about the high level of the project. +This document should be a starting point and a reference for all the developers that contribute and a high level source of documentation about the project. -A technical expertise in Rust is required to understand the interfaces. The rest of the document must be readable by someone with little knowledge of the technical principles we will work on as soon as he follows the vocabulary section and goes further in researches if he doesn't understand. +A technical expertise in Rust is required to understand the interfaces. The rest of the document is readable with limited knowledge of the technical principles involved. 
=== Concepts and Vocabulary ==== Besu Bonsai storage Besu Bonsai storage is an advanced storage management system developed by HyperLedger for their Ethereum client, Besu. -For a detailed specification, you can refer link:https://hackmd.io/@kt2am/BktBblIL3[here]. +For a detailed specification, follow this link:https://hackmd.io/@kt2am/BktBblIL3[link]. ==== Trie -Unlike a conventional tree, a trie is designed so that each node (except the root) represents a byte. Each path descending the tree can symbolize a key in the form of a byte array. +Unlike a conventional tree, a trie is designed so that each node (except the root) represents a byte. Each path descending the trie can symbolize a key in the form of a byte array. Each key is associated to an arbitrary value, making the trie a key-value data structure. To better understand, consider the following illustration: image:https://upload.wikimedia.org/wikipedia/commons/b/be/Trie_example.svg[Tree structure,200,role="center-image"] -In this trie, connected nodes form keys by appending their byte values. For instance, the trie contains keys like **A** - with a value of 15, **to** - 7, **tea** - 3, and so on. However, **t** or **te** aren't keys but merely prefixes. +In this trie, connected nodes form keys by appending the key byte they represent. For instance, the trie contains keys like **A** - with a value of 15, **to** - 7, **tea** - 3, and so on. However, **t** or **te** aren't keys but merely prefixes. -For a deeper dive into tries, you can explore link:https://en.wikipedia.org/wiki/Trie[here]. The advantages of the trie data structure are detailed link:https://www.geeksforgeeks.org/advantages-trie-data-structure/[here]. +For a deeper dive into tries, follow this link:https://en.wikipedia.org/wiki/Trie[link]. The advantages of the trie data structure are detailed link:https://www.geeksforgeeks.org/advantages-trie-data-structure/[here]. 
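The key/value behaviour described for the trie can be sketched in a few lines of Rust. This is a minimal illustration only, not the structure used by the library: the names `ByteTrieNode`, `ByteTrie` and `example_trie` are hypothetical, values are fixed to `u32` for brevity, and each node holds one child per key byte.

```rust
use std::collections::BTreeMap;

/// One trie node: an optional value plus children indexed by the next key byte.
/// (Hypothetical type, for illustration only.)
#[derive(Default)]
struct ByteTrieNode {
    value: Option<u32>,
    children: BTreeMap<u8, ByteTrieNode>,
}

/// Illustrative byte trie; assumption: values are plain `u32`.
#[derive(Default)]
struct ByteTrie {
    root: ByteTrieNode,
}

impl ByteTrie {
    /// Walk down the trie byte by byte, creating missing nodes, then store the value.
    fn insert(&mut self, key: &[u8], value: u32) {
        let mut node = &mut self.root;
        for byte in key {
            node = node.children.entry(*byte).or_default();
        }
        node.value = Some(value);
    }

    /// Follow the path spelled by `key`; a node reached without a value is only a prefix.
    fn get(&self, key: &[u8]) -> Option<u32> {
        let mut node = &self.root;
        for byte in key {
            node = node.children.get(byte)?;
        }
        node.value
    }
}

/// Builds a trie holding three of the keys shown in the illustration.
fn example_trie() -> ByteTrie {
    let mut trie = ByteTrie::default();
    trie.insert(b"A", 15);
    trie.insert(b"to", 7);
    trie.insert(b"tea", 3);
    trie
}
```

With this sketch, looking up **tea** yields its value, while **t** and **te** resolve to nodes without a value and are therefore reported as absent.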
==== Radix trie @@ -44,11 +44,11 @@ image:https://upload.wikimedia.org/wikipedia/commons/a/ae/Patricia_trie.svg[Radi In this trie, the nodes **om**, **ub**, **ulus** ... are merged into a single node, as they are the only children of their parent. Instead of needing 7 nodes to represent the key **romulus**, each letter having its node, this space-optimized representation only needs 3, **r**, **om** and **ulus**. -For a deeper dive into radix tries, you can explore link:https://en.wikipedia.org/wiki/Radix_tree[here]. +For a deeper dive into radix tries, follow this link:https://en.wikipedia.org/wiki/Radix_tree[link]. ==== PATRICIA tree -A PATRICIA tree, which stands for Practical Algorithm to Retrieve Information Coded in Alphanumeric, is a special variant of the radix trie where the radix is equal to 2. This implies that each node compare 1-bit portion of the key and have at most two children (0 or 1). +A PATRICIA tree, which stands for Practical Algorithm to Retrieve Information Coded in Alphanumeric, is a special variant of the radix trie where the radix is equal to 2. This implies that each node carries a 1-bit portion of the key and has at most two children (child 0 and child 1). For a deeper explanation of the difference between PATRICIA tree and radix trie, you can explore link:https://cs.stackexchange.com/a/63060[here]. @@ -64,8 +64,10 @@ In this depiction, hashes 0-0 and 0-1 represent the hash values of data blocks L For comprehensive information on the Merkle Tree, visit link:https://en.wikipedia.org/wiki/Merkle_tree[here]. The benefits of the Merkle Tree data structure are outlined link:https://www.geeksforgeeks.org/blockchain-merkle-trees/#Advantages%20of%20Merkle%20Tree:~:text=longest%2C%20valid%20blockchain.-,Advantages%20of%20Merkle%20Tree,-Efficient%20verification%3A[here]. +One important feature of Merkle Trees is that they allow exhibiting compact proofs of existence of an element in the tree. 
The proof size is logarithmic in the number of elements in the tree. + ==== Merkle-Patricia Trie -A Merkle-Patricia Trie (MPT) is a combination of a Merkle Tree and a Patricia Tree. This data structure is famous because it's being used by Ethereum to store the state of an Ethereum blockchain. Ethereum version of the MPT comprises of 3 types of nodes: +A Merkle-Patricia Trie (MPT) is a combination of a Merkle Tree and a Patricia Tree. This data structure is famous because it is being used by Ethereum to store the state of an Ethereum blockchain. The Ethereum version of the MPT is composed of 3 types of nodes: * **Branch:** A node with up to 16 child links, each corresponding to a hex character. * **Extension:** A node storing a key segment with a common prefix and a link to the next node. @@ -75,7 +77,7 @@ Here's a visual representation: image:https://i.stack.imgur.com/YZGxe.png[Patricia Merkle Trie in Ethereum,450,role="center-image"] -In this depiction, the key a77d397, having the value of 0.12 ETH, is stored using 5 nodes (a7 - extensible node, 7 - branch node, d3 extensible node, 9 - branch node, 7 - leaf node). +In this depiction, the key a77d397, having the value of 0.12 ETH, is stored using 5 nodes (a7 - extension node, 7 - branch node, d3 extension node, 9 - branch node, 7 - leaf node). === External Resources @@ -95,27 +97,27 @@ Substrate:: link:https://github.com/paritytech/polkadot-sdk/tree/master/substrat ==== Caller -The library is designed for callers who need efficient data management operations, such as retrieval, storage, or deletion. Callers can either use the library directly or through another intermediary library. The primary advantage is that callers can utilize the library without delving into its underlying implementation and can choose a database implementation that suits their needs. 
+The library is designed for callers who need a key-value data structure with efficient data management operations for retrieval, storage and deletion, while maintaining a global fingerprint (hash) of the whole structure that allows for compact proofs of element existence. Callers can either use the library directly or through another intermediary library. The primary advantage is that callers can utilize the library without delving into its underlying implementation and can choose a database implementation that suits their needs. ==== High-Level Interface -This serves as the library's main entry point and the primary interface for callers. It simplifies interactions with the library, ensuring that the caller only engages with this interface, abstracting away the complexities of the underlying processes. +This interface serves as the library's main entry point and the primary interface for callers. It simplifies interactions with the library, ensuring that the caller only engages with this interface, abstracting away the complexities of the underlying processes. ==== Accumulator -The accumulator plays a pivotal role in state management. It facilitates the addition of new states and retrieves states at specific point in time. The high-level interface leverages the accumulator for these tasks. +The accumulator plays a pivotal role in the management of the state of the data structure. It facilitates the addition of new states and retrieves states at specific point in time. The high-level interface leverages the accumulator for these tasks. ==== Trie Trie is the chosen data structure for data storage within the library. Both the accumulator and the high-level interface utilize the Trie for data operations. -==== Trie Log +==== Trie Logs -Trie log capture batches of modifications, detailing every change made during a block's processing. These log are invaluable to the accumulator when it needs to roll back or roll forward to a particular state. 
+Trie logs capture batches of modifications, detailing every change made during the processing of a "block". A block is an atomic batch of modifications that can, for example, represent the changes caused by the execution of a block in a blockchain. The accumulator requires these logs when it needs to roll back or roll forward to a particular state. ==== Database Interface -Serving as the persistent storage mechanism, the database ensures data longevity by saving it to the disk. While various library components rely on the database for low-level data management, its interface is implemented by the caller. This design choice maximizes abstraction and portability, allowing the library to be adaptable across different database implementations. +Serving as the underlying persistent storage mechanism, the database ensures data longevity by saving it to the disk. While various library components rely on the database for low-level data management, its interface is implemented by the caller. This design choice maximizes abstraction and portability, allowing the library to be adaptable across different database implementations. 
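The roll back and roll forward role of trie logs can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: `TrieLogSketch`, `FlatState`, `roll_forward` and `roll_back` are hypothetical names, the state is a plain in-memory map rather than a trie, and each log entry stores the value before and after the change, with `None` meaning the key is absent.

```rust
use std::collections::HashMap;

/// (old value, new value) for one key; `None` means the key is absent on that side.
type Change = (Option<Vec<u8>>, Option<Vec<u8>>);

/// Hypothetical trie log: all changes made by one atomic batch ("block").
#[derive(Default)]
struct TrieLogSketch {
    changes: HashMap<Vec<u8>, Change>,
}

/// Assumption: the state is modelled as a flat key-value map.
type FlatState = HashMap<Vec<u8>, Vec<u8>>;

/// Write `value` under `key`, or delete the key when `value` is `None`.
fn set_or_remove(state: &mut FlatState, key: &[u8], value: &Option<Vec<u8>>) {
    match value {
        Some(v) => {
            state.insert(key.to_vec(), v.clone());
        }
        None => {
            state.remove(key);
        }
    }
}

/// Move the state from "before the block" to "after the block".
fn roll_forward(state: &mut FlatState, log: &TrieLogSketch) {
    for (key, (_old, new)) in &log.changes {
        set_or_remove(state, key, new);
    }
}

/// Move the state from "after the block" back to "before the block".
fn roll_back(state: &mut FlatState, log: &TrieLogSketch) {
    for (key, (old, _new)) in &log.changes {
        set_or_remove(state, key, old);
    }
}
```

Because every entry keeps both the previous and the new value, the same log can be replayed in either direction, which is what makes navigation between states possible without storing a full copy of each state.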
=== Diagrams @@ -190,46 +192,46 @@ participant TrieLog as TrieLog participant Accumulator as Accumulator participant Database as Database -== New block == +== Processing a block of changes == autonumber 1 -Madara -> Interface : New batch of data to save (block processing) +Caller -> Interface : New batch of data to save Interface -> Trie : Modify the trie and save in the flat DB Trie -> Database : Fetch the nodes Database -> Trie : Return the nodes -Trie -> Database : Save all the modifications on the DB -Trie -> TrieLog : Save modifications in a trielog -TrieLog -> Database : Save the TrieLog in the database -TrieLog -> Accumulator : Save the TrieLog to the Accumulator -Accumulator -> Database : Each X block save a snapshot -Trie -> Madara : (Nothing or a signal to know it's finished or root hash TBD) +Trie -> Database : Save all modifications to DB +Trie -> TrieLog : Save modifications in the Trie Logs +TrieLog -> Database : Save the Trie Logs in the database +TrieLog -> Accumulator : Save the Trie Logs to the Accumulator +Accumulator -> Database : Save a snapshot every N batches +Trie -> Caller : Result of the modifications autonumber stop -== Asking for a specific state at X == +== Asking for a specific state X == autonumber 1 -Madara -> Interface : Ask for a state X +Caller -> Interface : Ask for a state X Interface -> Accumulator : Ask for the state Accumulator -> Accumulator : Is it my current state ? 
-Accumulator -[#green]> Madara : Return the state +Accumulator -[#green]> Caller : Return the state autonumber stop autonumber 4 1 -Accumulator -[#red]> Accumulator : Can I rollback/rollforward to this state \n using the trie logs I know -Accumulator -[#green]> Accumulator : If yes I apply them -Accumulator -> Madara : Return the state +Accumulator -[#red]> Accumulator : Can we rollback/rollforward to this state \n using the Trie Logs +Accumulator -[#green]> Accumulator : If yes, apply them +Accumulator -> Caller : Return the state autonumber stop autonumber 5 1 Accumulator -[#red]> Database : Ask for the closest snapshot -Database -> Accumulator : Return the clostest snapshot -Accumulator -> Accumulator : Is the closest snapshot the exact state ? -Accumulator -[#green]> Madara : Return the state +Database -> Accumulator : Return the closest snapshot +Accumulator -> Accumulator : Is the closest snapshot exactly at the asked state ? +Accumulator -[#green]> Caller : Return the snapshot state autonumber stop autonumber 8 1 -Accumulator -[#red]> Database : Ask for the trie logs to pass from the \n snapshot to asked state +Accumulator -[#red]> Database : Ask for the trie logs to derive the asked state \n from the snapshot Database -> Accumulator : Return the trie logs Accumulator -> Accumulator : Apply the trie logs -Accumulator -> Madara : Return the state +Accumulator -> Caller : Return the state (or an error) @enduml ---- @@ -237,26 +239,23 @@ Accumulator -> Madara : Return the state === Trie -The Trie is the first and central component of the Besu storage system. For this part, we will use the crate [Trie from paritytech](https://github.com/paritytech/trie), which provides a structure of a PMT. This choice has been made for multiple reasons: +The Trie is the central component of the Besu storage system. To avoid reimplementing a PMT, we use the crate [Trie from paritytech](https://github.com/paritytech/trie) which provides a standard PMT. 
This choice was made for multiple reasons: -- It allows us to re-use code to build PMTs. +- It avoids re-implementing a PMT. - It provides the flexibility to create Tries formatted for different blockchains. -- The code has a lot of generics and gives us the possibility to make some modifications to the PMT structure easily. - -The crate provides different sub-crates. We will be able to re-use some of them and take inspiration from others to create our implementation. +- The code has a lot of generics and gives us the possibility to make modifications to the PMT structure easily. -We will only use the sub-crate `trie-db` and re-implement the keys, database, and layout implementations. +We only use the sub-crate `trie-db` and override its keys, database, and layout implementations. +However, this sub-crate only solves part of the problem: +- In a Bonsai Trie, we store nodes directly by their location, while the Trie crate stores them by hash +- The Trie crate does not allow the implementation of trie logs, which require some modifications to the crate code -Using the sub-crate directly in the project and configuring it with the good traits implementations isn't enough for us because: +Given those constraints, we forked the Trie crate while minimizing the changes to the code. +Our modifications make the crate more generic and are being proposed to the maintainers as an upstream PR. -- In a Bonsai Trie, we store nodes directly by their location and not by hash, and in the Trie crate, they use a lot of hashes that we will not need. -- With the current understanding of the crate, it will be hard to implement the trie logs. We will need to make some modifications to the crate to be able to do it. +==== Attributes -We want to make the minimum modifications to our forked crate to be able to, maybe in the future, create pull requests to implement our needs in a generic way. - -==== Attribute - -No special attribute needs to be defined here. 
All the traits are detailed above. +No attributes need to be defined. All traits are detailed below. ==== Traits/Implementations @@ -270,26 +269,26 @@ pub trait Trie where where: ID: Id { - // Creates a new Trie + // Create a new Trie fn new() -> Result; // Insert a new key in the batch of changes to be applied to the trie fn insert(&mut self, key: &[u8], value &TrieValue) -> Result<(), TrieError>; - // Get a value saved in a key + // Get the value associated to a key fn get(&self, key: &[u8]) -> Result, TrieError>; - // Delete a key/value in the trie in the batch of changes to be applied to the trie + // Delete a key/value in the trie fn delete(&mut self, key: &[u8]) -> Result<(), TrieError>; - // Apply to the db the batch of changes + // Apply the batch of changes and save it to DB fn commit(&mut self) -> Result<(), TrieError>; // Get the root of the trie with the current state on the database fn root(&self) -> Result - // Go to a specific id in the past or future using trie logs + // Go to a specific state id in the past or future using trie logs fn goto(&mut self, id: ID) -> Result<(), TrieError>; } ---- ==== Database trait -The crate Trie from paritytech already provides a nice database trait described [here](https://github.com/paritytech/trie/blob/1645fddec8e5461d5aca7dd880303042b8527465/hash-db/src/lib.rs#L128). We will reuse the same trait for this part of the project and maybe modify in the project. +The Trie crate from paritytech already provides a nice database trait described [here](https://github.com/paritytech/trie/blob/1645fddec8e5461d5aca7dd880303042b8527465/hash-db/src/lib.rs#L128). We reuse the same trait for this part of the project. 
==== Child encoding @@ -297,20 +296,20 @@ In the sub-crate `trie-db`, the children of a branch node are either inline or r ==== Node encoding -The sub-crate reference-trie gives the implementations of `trait NodeCodec` (that manages the serialization/deserialization of nodes to store them in DB) with the same behavior as the one used on Substrate. We will re-use it as it corresponds to our needs and gives us more compatibility with the existing code in Substrate. +The sub-crate reference-trie gives the implementations of `trait NodeCodec` (that manages the serialization/deserialization of nodes to store them in DB) with the same behavior as the one used on Substrate. We re-use it as it corresponds to our needs and gives us more compatibility with existing Substrate code. === Accumulator -The accumulator is a key component of the crate. It will interact with nearly all the rest of the crate, and so it needs to be well defined. +The accumulator is a key component of the crate that interacts with nearly all the rest of the crate. 
-==== Attribute +==== Attributes - The trie - The FlatDB - An ID -- A list of all the snapshots he knows -- A list of all the changes to be applied along with their previous values (`HashMap>, Option>)>`) -- A list of TrieLogs he knows +- A list of known snapshots +- A list of changes to be applied along with their previous values (`HashMap>, Option>)>`) +- A list of known TrieLogs - A database object ==== Traits/Implementations @@ -327,13 +326,13 @@ where: // Increase the point in time to start a new batch of changes and clear the previous one fn new_block(&mut self, block_number: u64); - // Ask for the read of a key (should return also with modifications that are not applied to the database yet) + // Read a key, taking into account the modifications that are not applied to the database yet fn get(&self, key: &[u8]) -> Result; - // Add a modification to set a new value to a key + // Insert or replace an entry fn insert(&mut self, value: TrieValue) -> Result<(), AccumulatorError>; - // Get trie log of a specific block + // Get the trie log of a specific block fn get_trielog(&self) -> Result; // Get the current block number @@ -345,21 +344,21 @@ where: // Rollbackward (must have all the infos to go from the current block to the block of the trielog) fn rollbackward(&mut self, trielog: TrieLog) -> Result<(), AccumulatorError> - // Set the accumulator to a special point in time; he should have all the snapshots and trie logs to navigate to this point + // Set the accumulator to a special point in time; all required snapshots and trie logs to navigate to this point must be available fn goto(&mut self, id: ID) -> Result<(), AccumulatorError>; - // Apply all the changes to the database and create a trielog (and a snapshot if we are at X point in time) with all the changes and clear them. + // Apply all the changes to the database and create a trielog (and automatic snapshots) with all the changes, clear the changes. 
fn commit(&mut self); } ---- === Trie logs -Trie logs allow us to store a batch of modifications to the trie to be applied to a state. +Trie logs store a batch of modifications to the trie to be applied to a state. ==== Attribute -- In each trie logs, we should save all keys/values that are modified within `HashMap>, Option>)>`. +- In each trie log, we should save all keys/values that are modified within `HashMap>, Option>)>`. - An ID ==== Traits/Implementations @@ -379,11 +378,11 @@ where: === Database -The database implementation must be done in a generic way to be able to change the database as we want. We should provide a first implementation of the database using RocksDB. +The database implementation is generic on the underlying database. We provide a first implementation of the database using RocksDB. -All of the methods will take an optional transaction type that will allow making transactional modifications to the database if the database type allows it. If the transaction object is defined, we don't update each time directly in the db but place it in the TX and after commit it directly. +All methods take an optional transaction type that allows making transactional modifications to the database if the database type allows it. If the transaction object is provided, we don't update the DB directly but accumulate the changes into the provided TX, and commit it afterwards. 
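The optional-transaction behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions and not the `BonsaiDatabase` trait itself: `Tx` and `MemoryDb` are hypothetical names, the store is an in-memory map instead of RocksDB, and error handling is omitted.

```rust
use std::collections::HashMap;

/// Hypothetical transaction object: pending writes, where `None` marks a deletion.
#[derive(Default)]
struct Tx {
    pending: Vec<(Vec<u8>, Option<Vec<u8>>)>,
}

/// Hypothetical in-memory stand-in for the underlying database.
#[derive(Default)]
struct MemoryDb {
    data: HashMap<Vec<u8>, Vec<u8>>,
}

impl MemoryDb {
    /// With a transaction, only record the write; without one, apply it immediately.
    fn insert(&mut self, key: &[u8], value: &[u8], tx: Option<&mut Tx>) {
        match tx {
            Some(tx) => tx.pending.push((key.to_vec(), Some(value.to_vec()))),
            None => {
                self.data.insert(key.to_vec(), value.to_vec());
            }
        }
    }

    /// With a transaction, only record the deletion; without one, apply it immediately.
    fn remove(&mut self, key: &[u8], tx: Option<&mut Tx>) {
        match tx {
            Some(tx) => tx.pending.push((key.to_vec(), None)),
            None => {
                self.data.remove(key);
            }
        }
    }

    /// Apply every accumulated change in the order it was recorded.
    fn commit(&mut self, tx: Tx) {
        for (key, value) in tx.pending {
            match value {
                Some(v) => {
                    self.data.insert(key, v);
                }
                None => {
                    self.data.remove(&key);
                }
            }
        }
    }
}
```

In this sketch, a write made through a transaction is invisible in the store until `commit` is called, whereas a call without a transaction takes effect immediately. A real backend would typically map `Tx` onto its native batch or transaction type.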
-==== Attribute +==== Attributes - A connector to the database @@ -398,22 +397,22 @@ pub enum DatabaseError { pub trait BonsaiDatabase { fn new(path_to_database: &str) -> Self - // Insert key value in trie + // Insert an entry in the trie fn insert(&mut self, key: &[u8], value: &[u8]) -> Result<(), DatabaseError>; - // Remove key value in trie + // Remove an entry from the trie fn remove(&mut self, key: &[u8]) -> Result<(), DatabaseError>; - // Get key value in trie + // Get a value in trie fn get(&self, key: &[u8]) -> Result, DatabaseError>; - // Contains key in trie + // Check if the key is in trie fn contains(&self, key: &[u8]) -> Result; - // Put in TRIE_LOG column + // PUT operation in TRIE_LOG column fn put_trie_log(&mut self, key: &[u8], value: &[u8]) -> Result<(), DatabaseError>; - // Get in TRIE_LOG column + // GET operation in TRIE_LOG column fn get_trie_log(&self, key: &[u8]) -> Result, DatabaseError>; // Generate a snapshot @@ -434,7 +433,7 @@ For now, the interface is the same as the one of the accumulator. === E2E & Benchmarks -To ensure that our code is production-ready and matches the expectations of the client, we must prove it using tests and benchmarks. +To ensure that our code is production-ready, we must prove unit tests and benchmarks. ==== Tests cases From 701c257a2a83eca7e84854097c30110087075cc6 Mon Sep 17 00:00:00 2001 From: Damir Vodenicarevic Date: Tue, 21 Nov 2023 11:28:10 +0100 Subject: [PATCH 2/2] simplify tests section --- documentation/specification.adoc | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/documentation/specification.adoc b/documentation/specification.adoc index 1f666ab..6f25188 100644 --- a/documentation/specification.adoc +++ b/documentation/specification.adoc @@ -431,11 +431,9 @@ This interface should be the only interface used by the caller to have a compreh For now, the interface is the same as the one of the accumulator. 
-=== E2E & Benchmarks +=== Tests -To ensure that our code is production-ready, we must prove unit tests and benchmarks. - -==== Tests cases +We provide the following tests: - A test with a simple set of key/value - A test with a big set of key/value