Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[no-release-notes] go/store/nbs: During a GC process, take dependencies on chunks that are read through the ChunkStore. #8760

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

reltuk
Copy link
Contributor

@reltuk reltuk commented Jan 16, 2025

Previously GC was constructed to walk the transitive closure of reachable chunks from when it started, and to take additional dependencies on any chunks that were written to the ChunkStore during the GC process. This change makes it so that we can additionally take dependencies on any chunks that are read from the ChunkStore during the GC process.

For the nbs layer to signal a new dependency during a GC, it makes use of a keeperFunc, which is passed into the BeginGC call. The contract is that, if keeperFunc is non-nil and it returns false, the chunk address which was given to the function will make its way back to a call to SaveHashes on some MarkAndSweeper whose contents will end up in the final store once the GC is completed. keeperFunc, however, is allowed to return true, which signals to the block store that the in-progress GC has passed a certain point in time and can no longer make that guarantee. As a result, anytime keeperFunc returns true, the block store implementation needs to block until the GC process is over, and then it should resume the operation from the beginning.

That machinery already exists for the Put() and Commit() use case. This PR adds machinery to do all of that on various read paths that explicitly touch chunks, such as Has, HasMany, Get, GetMany and GetManyCompressed. In order to add dependency tracking on the read path, it's important for the GC process itself to not cause the chunks it has read to form additional dependencies. This PR makes a slight change to markAndSweeper so that it can fetch chunks from the source store without recording unnecessary duplicate dependencies and potentially becoming mired in deadlocks.

…equests so that a GC does not end while they are in progress.
…chunks which were in the memtable but get filtered out because they are already in the store.
…h for chunks that are written but which are already present in the store.
…g, make it so that reading chunks as part of the GC process does not take further dependencies on them and never blocks on waitForGC.
@coffeegoddd
Copy link
Contributor

@reltuk DOLT

comparing_percentages
100.000000 to 100.000000
version result total
e423a5c ok 5937457
version total_tests
e423a5c 5937457
correctness_percentage
100.0

@coffeegoddd
Copy link
Contributor

@reltuk DOLT

comparing_percentages
100.000000 to 100.000000
version result total
6ad9b37 ok 5937457
version total_tests
6ad9b37 5937457
correctness_percentage
100.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants