Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change how snapshot handles flushing of write buffer #392

Closed
dcoutts opened this issue Sep 13, 2024 · 3 comments · Fixed by #478
Closed

Change how snapshot handles flushing of write buffer #392

dcoutts opened this issue Sep 13, 2024 · 3 comments · Fixed by #478
Assignees

Comments

@dcoutts
Copy link
Collaborator

dcoutts commented Sep 13, 2024

Currently, snapshot will mutate the given handle by flushing the write buffer, before trying to take a snapshot of all the runs.

This has the effect of invalidating all blob references, which is surprising and does not correspond to the spec/model.

A plausible solution would be to do it like this: flush the write buffer to a new run, but do not modify the table handle to use that new run. Leave the table handle as-is (thus not invalidating blob refs). Then save all the runs (including the one flushed from the write buffer) into the snapshot.

Difficulties: flushing one more run may necessitate kicking off more merges, otherwise it would leave an over-full level 1. Perhaps this is ok to do for a snapshot, and the merging can be performed upon restoring the snapshot. In other words don't enforce (or explicitly relax) the levels shape invariant (which checks the number of runs in a level) for a snapshot, but enforce it upon restoration by starting the appropriate merges.

In addition, the current snapshot code modifies the table handle state before it checks if the snapshot name is already taken or not, so even if snapshot fails then there are state changes that invalidate the blob references.

The code should be adjusted to open the snapshot file first (thus doing the duplicate name check) and then do the rest of the work.

When this is done, the state machine model can be simplified. See Database.LSMTree.Model.Normal.Session which references this issue.

dcoutts added a commit that referenced this issue Sep 17, 2024
Only one fix in the code is needed, for the NoThunks test.

The state machine test found two non-trivial differences between the
spec and the model in relation to blobs:
1. blob retrieval after snapshot fails
2. blob retrieval after failed-snapshot (due to duplicate name) fails

The reason for both of these is that the implementation currently works
by modifying flushing the write buffer to disk, and modifing the table
handle to use the updated set of runs and now-empty write buffer. This
invalidates all blob references from the write buffer itself.

For the moment I have "fixed" this difference by changing the model to
behave like the implementation. For the TODO task to fix this, see
#392
dcoutts added a commit that referenced this issue Sep 17, 2024
Only one fix in the code is needed, for the NoThunks test.

The state machine test found two non-trivial differences between the
spec and the model in relation to blobs:
1. blob retrieval after snapshot fails
2. blob retrieval after failed-snapshot (due to duplicate name) fails

The reason for both of these is that the implementation currently works
by modifying flushing the write buffer to disk, and modifing the table
handle to use the updated set of runs and now-empty write buffer. This
invalidates all blob references from the write buffer itself.

For the moment I have "fixed" this difference by changing the model to
behave like the implementation. For the TODO task to fix this, see
#392
dcoutts added a commit that referenced this issue Sep 18, 2024
Only one fix in the code is needed, for the NoThunks test.

The state machine test found two non-trivial differences between the
spec and the model in relation to blobs:
1. blob retrieval after snapshot fails
2. blob retrieval after failed-snapshot (due to duplicate name) fails

The reason for both of these is that the implementation currently works
by modifying flushing the write buffer to disk, and modifing the table
handle to use the updated set of runs and now-empty write buffer. This
invalidates all blob references from the write buffer itself.

For the moment I have "fixed" this difference by changing the model to
behave like the implementation. For the TODO task to fix this, see
#392
dcoutts added a commit that referenced this issue Sep 18, 2024
Only one fix in the code is needed, for the NoThunks test.

The state machine test found two non-trivial differences between the
spec and the model in relation to blobs:
1. blob retrieval after snapshot fails
2. blob retrieval after failed-snapshot (due to duplicate name) fails

The reason for both of these is that the implementation currently works
by modifying flushing the write buffer to disk, and modifing the table
handle to use the updated set of runs and now-empty write buffer. This
invalidates all blob references from the write buffer itself.

For the moment I have "fixed" this difference by changing the model to
behave like the implementation. For the TODO task to fix this, see
#392
dcoutts added a commit that referenced this issue Sep 18, 2024
Only one fix in the code is needed, for the NoThunks test.

The state machine test found two non-trivial differences between the
spec and the model in relation to blobs:
1. blob retrieval after snapshot fails
2. blob retrieval after failed-snapshot (due to duplicate name) fails

The reason for both of these is that the implementation currently works
by modifying flushing the write buffer to disk, and modifing the table
handle to use the updated set of runs and now-empty write buffer. This
invalidates all blob references from the write buffer itself.

For the moment I have "fixed" this difference by changing the model to
behave like the implementation. For the TODO task to fix this, see
#392
@jorisdral jorisdral changed the title TODO: change how snapshot handles flushing of write buffer Change how snapshot handles flushing of write buffer Sep 23, 2024
@jorisdral
Copy link
Collaborator

...

In addition, the current snapshot code modifies the table handle state before it checks if the snapshot name is already taken or not, so even if snapshot fails then there are state changes that invalidate the blob references.

The code should be adjusted to open the snapshot file first (thus doing the duplicate name check) and then do the rest of the work.

...

This part is already done

wenkokke added a commit that referenced this issue Dec 10, 2024
@wenkokke
Copy link
Collaborator

wenkokke commented Jan 3, 2025

Currently, snapshot will mutate the given handle by flushing the write buffer, before trying to take a snapshot of all the runs. This has the effect of invalidating all blob references, which is surprising and does not correspond to the spec/model.

PR #478 changed the behaviour of snapshots to serialise/deserialise the write buffer to/from disk.

@wenkokke
Copy link
Collaborator

wenkokke commented Jan 3, 2025

When this is done, the state machine model can be simplified. See Database.LSMTree.Model.Normal.Session which references this issue.

And a018443 simplified the model.

@jorisdral jorisdral linked a pull request Jan 3, 2025 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants