sstable: flush value blocks if 8MB are buffered #3188
This will break EstimateDiskUsage, which assumes that the data blocks are contiguous and interpolates the size contribution of the value blocks. There is a TODO to improve the situation, but it will need a format change:
Lines 1106 to 1137 in ccb96e5:

includeInterpolatedValueBlocksSize := func(dataBlockSize uint64) uint64 {
	// INVARIANT: r.Properties.DataSize > 0 since startIdxIter is not nil.
	// Linearly interpolate what is stored in value blocks.
	//
	// TODO(sumeer): if we need more accuracy, without loading any data blocks
	// (which contain the value handles, and which may also be insufficient if
	// the values are in separate files), we will need to accumulate the
	// logical size of the key-value pairs and store the cumulative value for
	// each data block in the index block entry. This increases the size of
	// the BlockHandle, so wait until this becomes necessary.
	return dataBlockSize +
		uint64((float64(dataBlockSize)/float64(r.Properties.DataSize))*
			float64(r.Properties.ValueBlocksSize))
}
if endIdxIter == nil {
	// The range spans beyond this file. Include data blocks through the last.
	return includeInterpolatedValueBlocksSize(r.Properties.DataSize - startBH.Offset), nil
}
key, val = endIdxIter.SeekGE(end, base.SeekGEFlagsNone)
if key == nil {
	if err := endIdxIter.Error(); err != nil {
		return 0, err
	}
	// The range spans beyond this file. Include data blocks through the last.
	return includeInterpolatedValueBlocksSize(r.Properties.DataSize - startBH.Offset), nil
}
endBH, err := decodeBlockHandleWithProperties(val.InPlaceValue())
if err != nil {
	return 0, errCorruptIndexEntry
}
return includeInterpolatedValueBlocksSize(
	endBH.Offset + endBH.Length + blockTrailerLen - startBH.Offset), nil
Is cockroachdb/cockroach#117122 an adequate workaround in the absence of online restore? For backups in online restore, is there any benefit to making file sizes smaller so that they often don't span range boundaries (I realize we will virtualize and can share files across ranges)? With online restore, we could benefit from value blocks until the file is materialized locally, but we could also live without that optimization.
Reviewed 1 of 6 files at r1.
Reviewable status: 1 of 6 files reviewed, 2 unresolved discussions (waiting on @dt)
sstable/options.go
line 277 at r1 (raw file):
// presence of prefetching/read-ahead, page caching, etc.
//
// A value of 0 implies the default of max(8MB/BlockSize, 16) while a value of
I am curious about the 8MB default, versus something like 64MB, which would allow most Pebble-written files to be unaffected by this change.
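For reference, the documented default of max(8MB/BlockSize, 16) works out as below; the function name is mine, not Pebble's:

```go
package main

import "fmt"

// defaultValueBlockBufferLimit sketches the documented default,
// max(8MB/BlockSize, 16), interpreted as a count of buffered blocks.
func defaultValueBlockBufferLimit(blockSize int) int {
	n := (8 << 20) / blockSize
	if n < 16 {
		n = 16
	}
	return n
}

func main() {
	fmt.Println(defaultValueBlockBufferLimit(32 << 10)) // 32 KiB blocks: 256
	fmt.Println(defaultValueBlockBufferLimit(1 << 20))  // 1 MiB blocks: floor of 16
}
```

With a 64MB budget instead of 8MB, the first case would allow 2048 buffered blocks, which is why most Pebble-written files would then never hit the limit.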
sstable/options.go
line 282 at r1 (raw file):
}

func (o WriterOptions) ensureDefaults() WriterOptions {
sstable.WriterOptions are usually generated using pebble.MakeWriterOptions (including in CRDB's sst_writer.go), so we need to add this ValueBlockBufferLimit to the pebble.Options so that it is actually possible to configure it.
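A minimal sketch of the plumbing being requested; the struct shapes and field names here are illustrative stand-ins, not Pebble's actual API:

```go
package main

import "fmt"

// WriterOptions stands in for sstable.WriterOptions.
type WriterOptions struct {
	ValueBlockBufferLimit int // 0 means "use the default"
}

// Options stands in for pebble.Options: the top-level knob callers
// (e.g. CRDB's sst_writer.go) would actually set.
type Options struct {
	ValueBlockBufferLimit int
}

// MakeWriterOptions copies the per-writer setting out of the top-level
// options, mirroring how pebble.Options.MakeWriterOptions populates
// sstable.WriterOptions. Without this copy, the sstable-level option is
// unreachable from the public configuration surface.
func (o *Options) MakeWriterOptions() WriterOptions {
	return WriterOptions{ValueBlockBufferLimit: o.ValueBlockBufferLimit}
}

func main() {
	o := &Options{ValueBlockBufferLimit: 256}
	fmt.Println(o.MakeWriterOptions().ValueBlockBufferLimit) // 256
}
```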