
Blockstore: Migrate ShredIndex type to more efficient data structure #3900

Merged
merged 52 commits into from
Jan 13, 2025

Conversation


@cpubot cpubot commented Dec 3, 2024

Problem

The current blockstore index type, backed by a BTreeSet, suffers from poor serialization / deserialization performance due to its dynamically allocated, self-balancing nature. See #3570 for context.

Summary of Changes

Fixes #3570

This PR reimplements the ShredIndex behavior on top of a bit vector (Vec<u8>).
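The core idea can be sketched as follows. This is an illustrative stand-in only, not the actual ShredIndex from blockstore_meta.rs; constant values, names, and the backing array are assumptions for the sketch:

```rust
// Illustrative sketch of a fixed-size bit-vector shred index.
// Constant values and names here are assumptions, not the real code.
const MAX_DATA_SHREDS_PER_SLOT: usize = 32_768;
const MAX_U64S_PER_SLOT: usize = (MAX_DATA_SHREDS_PER_SLOT + 63) / 64;

struct ShredIndexV2 {
    // One bit per shred index. Because the layout is fixed-size and flat,
    // (de)serialization is a straight byte copy instead of rebuilding a tree.
    index: [u64; MAX_U64S_PER_SLOT],
    num_shreds: usize,
}

impl ShredIndexV2 {
    fn new() -> Self {
        Self {
            index: [0; MAX_U64S_PER_SLOT],
            num_shreds: 0,
        }
    }

    fn insert(&mut self, idx: u64) {
        let (word, bit) = ((idx / 64) as usize, idx % 64);
        if word < MAX_U64S_PER_SLOT && self.index[word] & (1 << bit) == 0 {
            self.index[word] |= 1 << bit;
            self.num_shreds += 1;
        }
    }

    fn contains(&self, idx: u64) -> bool {
        let (word, bit) = ((idx / 64) as usize, idx % 64);
        word < MAX_U64S_PER_SLOT && self.index[word] & (1 << bit) != 0
    }
}
```

Compared to a BTreeSet, membership updates become two integer ops and a mask, and the serialized size is constant regardless of how many shreds are present.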

Migration strategy

The general goal is to avoid the overhead of writing two separate columns while still supporting the downgrade path. Although this could be accomplished with two distinct columns, I am proposing that we instead support both formats in the same column. This avoids an additional RocksDB read, since we can attempt deserialization of both formats against the same RocksDB slice. To support the downgrade path, this initial PR only adds support for reading the new format.

See release steps

The idea here is to split the column migration across three releases such that:

  1. Initial release simply adds support for reading the new format as a fallback case, and does no writing of the new format.
    • This lays the foundation for a downgrade. For example, assume operators have upgraded to release 2/3 in the chain (bullet point 2), and as such have been solely writing to the new format. In the event of a downgrade, release 1/3 still understands how to read the new format, while continuing to read and write the legacy version.
    • This ensures release 1/3 doesn't incur the overhead of serializing and writing the new format, but can still understand and use it in the event of a downgrade.
  2. This release reads and writes the new format as its primary target, yet still understands the legacy column for fallback reads (i.e., we swap the deserialization attempt order). It does no writing of the legacy format.
    • This instantiates the migration. We can safely downgrade to release 1 because it understands how to read the new format that was written in release 2.
  3. Once the release is considered stable and we don't anticipate a downgrade, we can remove support for the legacy format and its associated fallback reads.
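The fallback ordering in steps 1 and 2 can be sketched with a toy pair of formats. These encodings and names are purely illustrative (a length-prefixed list standing in for the bincode legacy format, a fixed-size word standing in for the bit vector), not the actual blockstore code:

```rust
// Toy stand-ins for two on-disk encodings sharing one column.
#[derive(Debug, PartialEq)]
enum ShredIndexAny {
    Legacy(Vec<u64>),
    V2(u64),
}

// "Legacy": little-endian u64 length prefix followed by that many u64s.
fn read_legacy(bytes: &[u8]) -> Option<Vec<u64>> {
    let len = u64::from_le_bytes(bytes.get(..8)?.try_into().ok()?) as usize;
    if bytes.len() != 8usize.checked_add(len.checked_mul(8)?)? {
        return None;
    }
    Some(
        bytes[8..]
            .chunks_exact(8)
            .map(|c| u64::from_le_bytes(c.try_into().unwrap()))
            .collect(),
    )
}

// "V2": exactly one fixed-size little-endian word.
fn read_v2(bytes: &[u8]) -> Option<u64> {
    Some(u64::from_le_bytes(bytes.try_into().ok()?))
}

// Release 1/3: legacy is primary; the new format is only a fallback read.
fn read_release_1(bytes: &[u8]) -> Option<ShredIndexAny> {
    read_legacy(bytes)
        .map(ShredIndexAny::Legacy)
        .or_else(|| read_v2(bytes).map(ShredIndexAny::V2))
}

// Release 2/3: the deserialization attempt order is swapped.
fn read_release_2(bytes: &[u8]) -> Option<ShredIndexAny> {
    read_v2(bytes)
        .map(ShredIndexAny::V2)
        .or_else(|| read_legacy(bytes).map(ShredIndexAny::Legacy))
}
```

Both releases can read both formats from the same slice; only the attempt order (and which format gets written) changes between them, which is what makes the downgrade path safe.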

@cpubot cpubot changed the title Shred index next Blockstore: Migrate ShredIndex type to more efficient data structure Dec 3, 2024

@bw-solana bw-solana left a comment


Looks pretty good to me. I like the phased adoption strategy you've proposed.

Left a couple nits and potential perf optimizations.

My only real concern is how tied we are to the current shred limit. It should be easy to transition to a world where the shred limit is >32k.


cpubot commented Dec 6, 2024

> Looks pretty good to me. I like the phased adoption strategy you've proposed.
>
> Left a couple nits and potential perf optimizations.
>
> My only real concern is how tied we are to the current shred limit. It should be easy to transition to a world where the shred limit is >32k.

This is @steviez's strategy for the record! Just my slightly tweaked (for my own clarity) summary of it.

Almost all of the code should be portable across >32k, since MAX_U64S_PER_SLOT is a function of MAX_DATA_SHREDS_PER_SLOT, and everything is written against either MAX_U64S_PER_SLOT or MAX_DATA_SHREDS_PER_SLOT.
(Also just pushed a change to ensure we have an extra u64 slot available if MAX_DATA_SHREDS_PER_SLOT % 64 != 0).

I believe the only thing that would require change when we increase this limit is the deserialization function. Right now it's explicitly checking std::mem::size_of::<[u64; MAX_U64S_PER_SLOT]>, but this could easily be made more flexible when the limit is increased.
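For reference, the relationship between those constants (with an assumed 32k shred limit; the real constants live in the ledger crate) looks roughly like:

```rust
// Values assumed for illustration; the real constants live in the ledger crate.
const MAX_DATA_SHREDS_PER_SLOT: usize = 32_768;

// Reserve one extra u64 when the limit isn't a multiple of 64.
const MAX_U64S_PER_SLOT: usize =
    MAX_DATA_SHREDS_PER_SLOT / 64 + (MAX_DATA_SHREDS_PER_SLOT % 64 != 0) as usize;

// The deserializer's size check keys off this exact byte length, which is
// the one piece that would need to become flexible at a larger shred limit.
const EXPECTED_LEN: usize = std::mem::size_of::<[u64; MAX_U64S_PER_SLOT]>();
```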

@cpubot cpubot requested a review from bw-solana December 6, 2024 09:04
bw-solana
bw-solana previously approved these changes Dec 6, 2024

@bw-solana bw-solana left a comment


LGTM


@steviez steviez left a comment


Will review this later today - just throwing this in here so I get a chance to do so before merging


@steviez steviez left a comment


Overall, things are looking good! I have a bunch of minor comments, and two more major items.

Endianness - Our current situation is kind of all over the place:

  • We use big endian for keys
  • We use little endian for things that are serialized (little is default for bincode)

If we use native endianness to write these, that might break some existing guarantees/workflows. One such case is downloading RocksDB backups from our warehouse node uploads. Our nodes push up compressed RocksDB archives, so things would obviously break if the native endianness differed between the warehouse node and the downloading node. I need to think about this aspect a bit more.
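The portability concern can be seen with a minimal fixed-endianness round trip (illustrative helper functions, not the blockstore API):

```rust
// Writing with an explicit byte order (little-endian here, matching
// bincode's default) produces identical bytes on any host, so archives
// copied between machines deserialize the same way. Native-endian writes
// would not have that property.
fn serialize_words_le(words: &[u64]) -> Vec<u8> {
    words.iter().flat_map(|w| w.to_le_bytes()).collect()
}

fn deserialize_words_le(bytes: &[u8]) -> Option<Vec<u64>> {
    if bytes.len() % 8 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(8)
            .map(|c| u64::from_le_bytes(c.try_into().unwrap()))
            .collect(),
    )
}
```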

Migration strategy - In the steps I originally wrote up, I mentioned targeting v2.1 as the version to land the write-legacy-format-but-can-read-new-format change. Given that the tip of master is currently v2.2, this means we'd have to BP. If we want to BP, I think we need to slim this PR down as much as possible given where v2.1 is in its release cycle. That would mean including only what is necessary to deserialize into the new type + convert it into the old type (plus a compatibility test).

That being said, we should probably ensure we have some amount of buy-in for the v2.1 BP before you go restructure stuff. @bw-solana, what are your thoughts on this?


@vadorovsky vadorovsky left a comment


Overall looks good


@steviez steviez left a comment


Just a few minor things; this is looking good in general tho, and I think this is pretty close to being ready.

@alessandrod alessandrod self-requested a review January 10, 2025 01:57

@alessandrod alessandrod left a comment


looks great, just a couple of nits

@cpubot cpubot requested review from steviez and alessandrod January 11, 2025 17:25

@steviez steviez left a comment


LGTM

Thought through the conversion failure cases independently, but the comments you added above each test were a good addition too.

Lastly, I see you added the v2.1 BP label. This would be a nice item to include so we can realize the gains sooner, but we are also kind of late in the v2.1 release cycle. We can have that discussion in the v2.1 PR tho; let's get this merged.

Comment on lines +371 to +372

```rust
#[allow(unused)]
pub(crate) fn contains(&self, idx: u64) -> bool {
```


This could have been #[cfg(test)] but I'm fine with it as-is since this will become used very shortly

@cpubot cpubot merged commit f8e5b16 into anza-xyz:master Jan 13, 2025
58 checks passed
@cpubot cpubot deleted the shred-index-next branch January 13, 2025 14:04
mergify bot pushed a commit that referenced this pull request Jan 13, 2025
…3900)

(cherry picked from commit f8e5b16)

# Conflicts:
#	ledger/src/blockstore_db.rs
#	svm/examples/Cargo.lock
cpubot added a commit that referenced this pull request Jan 17, 2025
…3900)

(cherry picked from commit f8e5b16)

# Conflicts:
#	ledger/src/blockstore_db.rs
#	svm/examples/Cargo.lock
cpubot added a commit that referenced this pull request Jan 22, 2025
…cture (backport of #3900) (#4428)

* Blockstore: Migrate ShredIndex type to more efficient data structure (#3900)

(cherry picked from commit f8e5b16)

# Conflicts:
#	ledger/src/blockstore_db.rs
#	svm/examples/Cargo.lock

* fixup merge conflicts

* Delete svm/examples/Cargo.lock

* update lockfiles

---------

Co-authored-by: Zach Brown <[email protected]>
@cpubot cpubot mentioned this pull request Jan 22, 2025
Labels
v2.1 Backport to v2.1 branch

Successfully merging this pull request may close these issues.

Blockstore: Migrate ShredIndex type to more efficient data structure
8 participants