forked from cockroachdb/pebble
Commit
This change adds a "range annotation" feature to Annotators, which are computations that aggregate some value over a specific key range within a level. Level-wide annotations are now computed internally as a range annotation with a key range spanning the whole level. Range annotations use the same B-tree caching behavior as regular annotations, so queries remain fast even with thousands of tables because they avoid a sequential iteration over a level's files.

This PR only sets up range annotations without changing any existing behavior. However, there are a number of potential use cases for range annotations which could be added next:

- Calculating the number of keys shadowed by a tombstone-dense key range, for use in the heuristic proposed at cockroachdb#3719.
- Computing the total file size that a read compaction overlaps with, which is used to prevent read compactions that are too wide [here](https://github.com/cockroachdb/pebble/blob/9a4ea4dfc5a8129937e3fdc811ea87543d88565b/compaction_picker.go#L1930).
- Estimating disk usage for a key range without having to iterate over files, which is done [here](https://github.com/jbowens/pebble/blob/master/db.go#L2249).
- Calculating average value size and compression ratio for a key range, which we [currently use when estimating the potential space that compacting point tombstones would reclaim](https://github.com/jbowens/pebble/blob/646c6bab1af3c72dc7db59a0dcc38b5955fc15cc/table_stats.go#L350). Range annotations could also be used to implement the TODO from @jbowens.
- Estimating the reclaimed space from compacting range deletions, for which we also [currently use sequential iteration](https://github.com/jbowens/pebble/blob/master/table_stats.go#L557).
  - Because annotations are in-memory, if we can find a way to refactor those last two without using I/O at all, then this would eliminate the need to defer table stats collection to a separate goroutine for newly written tables.
- Refactoring [`db.ScanStatistics`](https://github.com/jbowens/pebble/blob/646c6bab1af3c72dc7db59a0dcc38b5955fc15cc/db.go#L2823), which could increase performance significantly over the current implementation where we scan every key in the DB.
- Expanding the LSM visualizer tool (cockroachdb#508) or the LSM viewer (cockroachdb#3339) to show aggregate statistics about key ranges. Range annotations would allow us to efficiently compute statistics including the # of sstables, # of keys, etc. in chunks of the keyspace and visualize this on a graph showing overlapping ranges from each level.

`BenchmarkNumFilesRangeAnnotation` shows that range annotations are significantly faster than using `version.Overlaps` to aggregate over a key range:

```
pkg: github.com/cockroachdb/pebble/internal/manifest
BenchmarkNumFilesRangeAnnotation/annotator-10    232282      4716 ns/op    112 B/op    7 allocs/op
BenchmarkNumFilesRangeAnnotation/overlaps-10       2110    545482 ns/op    400 B/op    9 allocs/op
```
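To illustrate why a range annotation can avoid iterating over every file, here is a minimal sketch of the general idea, not the code in this commit. It assumes a balanced binary tree in place of Pebble's B-tree, and all names (`file`, `node`, `build`, `rangeSize`) are hypothetical. Each node caches the aggregate for its subtree; a query reuses the cached value for any subtree fully inside the range and descends only into subtrees that straddle a range boundary. The sketch also assumes that a file partially overlapping the range is counted in full.

```go
package main

import "fmt"

// file is a hypothetical stand-in for an sstable's metadata: an inclusive
// key range plus a size. Files in a level are sorted and non-overlapping.
type file struct {
	smallest, largest string
	size              uint64
}

// node is one node of a balanced binary tree over the sorted files.
// (Assumption: a binary tree is used here only to keep the sketch short.)
type node struct {
	left, right *node
	f           *file  // non-nil only for leaves
	lo, hi      string // key bounds covered by this subtree
	cachedSize  uint64 // aggregate (total size) of the whole subtree
}

// build constructs the tree bottom-up, caching each subtree's aggregate.
func build(files []file) *node {
	if len(files) == 0 {
		return nil
	}
	if len(files) == 1 {
		f := &files[0]
		return &node{f: f, lo: f.smallest, hi: f.largest, cachedSize: f.size}
	}
	mid := len(files) / 2
	l, r := build(files[:mid]), build(files[mid:])
	return &node{left: l, right: r, lo: l.lo, hi: r.hi,
		cachedSize: l.cachedSize + r.cachedSize}
}

// rangeSize returns the total size of files overlapping [start, end].
func (n *node) rangeSize(start, end string) uint64 {
	if n == nil || n.hi < start || n.lo > end {
		return 0 // subtree is entirely outside the range
	}
	if start <= n.lo && n.hi <= end {
		return n.cachedSize // fully contained: reuse the cached aggregate
	}
	if n.f != nil {
		return n.f.size // leaf that partially overlaps: counted in full
	}
	return n.left.rangeSize(start, end) + n.right.rangeSize(start, end)
}

func main() {
	root := build([]file{
		{"a", "c", 10}, {"d", "f", 20}, {"g", "i", 30}, {"j", "l", 40},
	})
	// A range annotation over part of the level.
	fmt.Println(root.rangeSize("e", "h"))
	// A level-wide annotation is the same query over the whole key span.
	fmt.Println(root.rangeSize(root.lo, root.hi))
}
```

In this sketch only the subtrees that straddle one of the two range boundaries are descended into, so a query touches a number of nodes proportional to the tree height rather than to the number of files.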
1 parent 7920d96, commit 79bdbc1
Showing 13 changed files with 648 additions and 584 deletions.