Refactor piece and segment downloading into subspace-data-retrieval #3062

Merged: 9 commits into main from piece-download-refactor, Oct 1, 2024

Conversation

teor2345 (Contributor) commented Sep 25, 2024

Purpose

This PR implements piece and segment fetching and reconstruction in the subspace-data-retrieval crate, so we can use that code to fetch objects.

Change List

This PR refactors piece and segment downloading into the subspace-data-retrieval crate.

It also:

  • adds a segment piece downloading function
  • splits out segment reconstruction code into its own function
  • combines the DsnSyncPieceGetter and ObjectPieceGetter traits
  • implements segment and piece downloading for ObjectFetcher
  • fixes some bugs in piece and segment index calculations

And makes some minor cleanups:

  • removes some debugging code that would cause a dependency on subspace-networking
  • fixes some error handling bugs
  • adds tracing logging
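
As a rough illustration of the kind of piece and segment index arithmetic mentioned in the change list, here is a minimal sketch. It is not the code from this PR: the constant of 256 pieces per archived segment and all function names are assumptions made for the example.

```rust
/// Assumed number of pieces in one archived segment.
/// Hypothetical constant for this sketch, not taken from the PR.
const PIECES_PER_SEGMENT: u64 = 256;

/// Index of the segment that contains the piece with the given global index.
fn segment_index(piece_index: u64) -> u64 {
    piece_index / PIECES_PER_SEGMENT
}

/// Position of the piece within its segment (0-based).
fn piece_position(piece_index: u64) -> u64 {
    piece_index % PIECES_PER_SEGMENT
}

/// Global index of the first piece in the given segment.
fn first_piece_index(segment_index: u64) -> u64 {
    segment_index * PIECES_PER_SEGMENT
}
```

Off-by-one mistakes tend to show up at segment boundaries, so the last piece of one segment and the first piece of the next are the useful cases to check.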

Out of Scope

It doesn't implement piece retries yet; I think that can go in another PR.

Testing

I've manually tested this PR with a proof of concept object fetching RPC on a local node in dev mode. The objects come back correctly and pass a hash check. (I haven't tested anything large or over multiple segments yet.)

Code contributor checklist:

teor2345 added the enhancement label Sep 25, 2024
teor2345 self-assigned this Sep 25, 2024
teor2345 (Contributor Author) left a comment:

Notes from review with Nazar:

  • don't change subspace-service
  • don't refactor, just copy the code (and note where it came from)

crates/subspace-service/src/sync_from_dsn.rs (outdated, resolved)
shared/subspace-data-retrieval/src/segment_fetcher.rs (outdated, resolved)
teor2345 force-pushed the piece-download-refactor branch 3 times, most recently from 3dc1ad0 to ee6eb5a, September 27, 2024 01:07

teor2345 (Contributor Author) commented:

During my manual testing, I found some bugs in piece and segment index calculations, and added some trace logging.

We can remove the logging later, once we've finished automated testing.

shared/subspace-data-retrieval/src/object_fetcher.rs (outdated, resolved)
Comment on lines +192 to +197
tracing::debug!(
    %piece_index,
    piece_offset,
    RawRecord_SIZE = RawRecord::SIZE,
    "Invalid piece offset for object: must be less than the size of a raw record",
);
Member

I'd say we don't need these logs if we already return an error explicitly. They balloon the code size and bring little to no benefit.

Contributor Author

I agree the logs don’t need to be in there long-term, but I’d like to keep them temporarily while I’m testing and debugging this code.
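
For context, the check under discussion can be sketched with the failure carried entirely by an explicit error value and no logging. This is a hedged sketch, not the PR's code: the `PieceOffsetError` type, the `check_piece_offset` function, and the `RAW_RECORD_SIZE` value are hypothetical stand-ins (the real code compares against `RawRecord::SIZE`).

```rust
use std::fmt;

/// Placeholder for `RawRecord::SIZE`; the value is an assumption for this sketch.
const RAW_RECORD_SIZE: u32 = 1_015_808;

/// Hypothetical error type that carries the same fields as the debug log.
#[derive(Debug, PartialEq, Eq)]
struct PieceOffsetError {
    piece_index: u64,
    piece_offset: u32,
    raw_record_size: u32,
}

impl fmt::Display for PieceOffsetError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "invalid piece offset {} for piece {}: must be less than {}",
            self.piece_offset, self.piece_index, self.raw_record_size
        )
    }
}

impl std::error::Error for PieceOffsetError {}

/// Rejects offsets that are not strictly less than the raw record size.
fn check_piece_offset(piece_index: u64, piece_offset: u32) -> Result<(), PieceOffsetError> {
    if piece_offset >= RAW_RECORD_SIZE {
        return Err(PieceOffsetError {
            piece_index,
            piece_offset,
            raw_record_size: RAW_RECORD_SIZE,
        });
    }
    Ok(())
}
```

Because the error already records the piece index, the offset, and the limit, a caller can log it once at the top level, which is the trade-off both comments above are weighing.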

shared/subspace-data-retrieval/src/piece_fetcher.rs (outdated, resolved)
shared/subspace-data-retrieval/src/piece_fetcher.rs (outdated, resolved)
shared/subspace-data-retrieval/src/piece_getter.rs (outdated, resolved)
shared/subspace-data-retrieval/src/piece_getter.rs (outdated, resolved)
shared/subspace-data-retrieval/src/segment_fetcher.rs (outdated, resolved)
teor2345 (Contributor Author) left a comment:

Thanks for the review, just some follow-up questions.

shared/subspace-data-retrieval/src/piece_getter.rs (outdated, resolved)
Comment on lines +77 to +78
// We want exact pieces, so any errors are fatal.
let received_pieces: Vec<Piece> = received_pieces.try_collect().await?;
Member

This will work for now, but long-term this generic function isn't very useful IMO, because you will want to reconstruct missing pieces instead of failing altogether here. We already do this in the farmer, for example.

Contributor Author

Thanks, I'll put that on the list.
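
The difference between the two behaviours in this thread can be sketched with plain iterators. This is an illustration under assumptions, not the PR's code: the real code collects an async stream with `try_collect`, while this sketch uses a synchronous `Vec` of results, a placeholder `Piece` type, and hypothetical function names.

```rust
/// Placeholder piece type for this sketch.
type Piece = Vec<u8>;

/// All-or-nothing collection: the first error aborts the whole download,
/// which mirrors the "any errors are fatal" behaviour of `try_collect`.
fn collect_exact(results: Vec<Result<Piece, String>>) -> Result<Vec<Piece>, String> {
    results.into_iter().collect()
}

/// Tolerant collection: failed downloads become `None`, so a later
/// reconstruction step could fill the gaps instead of failing outright.
fn collect_with_gaps(results: Vec<Result<Piece, String>>) -> Vec<Option<Piece>> {
    results.into_iter().map(Result::ok).collect()
}
```

The second form keeps each piece at its original position, which is what a reconstruction step would need in order to know which pieces are missing.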

teor2345 added this pull request to the merge queue Oct 1, 2024
Merged via the queue into main with commit fe64ec3 Oct 1, 2024
11 checks passed
teor2345 deleted the piece-download-refactor branch October 1, 2024 13:21