manifest: add range annotations #3759

Merged 1 commit on Aug 14, 2024

@anish-shanbhag anish-shanbhag commented Jul 15, 2024

manifest: add range annotations

This change adds a "range annotation" feature to Annotators. A range annotation is a computation that aggregates some value over a specific key range within a level. Range annotations use the same B-tree caching behavior as regular annotations, so queries avoid a sequential iteration over a level's files and remain fast even with thousands of tables.

This PR only sets up range annotations without changing any existing behavior. See #3793 for some potential use cases.

BenchmarkNumFilesRangeAnnotation shows that range annotations are roughly 128x faster than using version.Overlaps to aggregate over a key range (4015 ns/op versus 513519 ns/op):

pkg: github.com/cockroachdb/pebble/internal/manifest
BenchmarkNumFilesRangeAnnotation/annotator-10         	  306010	      4015 ns/op	      48 B/op	       6 allocs/op
BenchmarkNumFilesRangeAnnotation/overlaps-10          	    2223	    513519 ns/op	     336 B/op	       8 allocs/op
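
To illustrate why a cached range annotation can beat a sequential scan, here is a minimal sketch in Go. It is not Pebble's implementation: the `node` type, `build`, `annotation`, `rangeAnnotation`, `linearCount`, and `makeKeys` are all hypothetical names, and a plain binary tree over integer keys stands in for the real B-tree of file metadata. The idea it shows is that a subtree lying entirely inside the queried range contributes its cached value, so only the nodes along the two range boundaries are visited.

```go
package main

import "fmt"

// node is a hypothetical, simplified stand-in for a B-tree node. It is a
// binary tree over sorted integer keys, with one "file" per leaf.
type node struct {
	min, max    int   // smallest and largest key in this subtree
	left, right *node // both nil for a leaf
	valid       bool  // whether count holds a cached annotation
	count       int   // cached number of files in this subtree
}

// build constructs a balanced tree from a sorted, non-empty slice of keys.
func build(keys []int) *node {
	n := &node{min: keys[0], max: keys[len(keys)-1]}
	if len(keys) > 1 {
		mid := len(keys) / 2
		n.left, n.right = build(keys[:mid]), build(keys[mid:])
	}
	return n
}

// annotation returns the file count for the whole subtree, computing it at
// most once and caching the result on the node.
func (n *node) annotation() int {
	if !n.valid {
		if n.left == nil {
			n.count = 1
		} else {
			n.count = n.left.annotation() + n.right.annotation()
		}
		n.valid = true
	}
	return n.count
}

// rangeAnnotation counts the files whose keys fall in [lo, hi).
func (n *node) rangeAnnotation(lo, hi int) int {
	if n.max < lo || n.min >= hi {
		return 0 // subtree is disjoint from the range
	}
	if lo <= n.min && n.max < hi {
		return n.annotation() // fully contained: reuse the cached value
	}
	// Partial overlap. A leaf is always disjoint or contained, so n has children.
	return n.left.rangeAnnotation(lo, hi) + n.right.rangeAnnotation(lo, hi)
}

// linearCount is the sequential baseline: it inspects every key.
func linearCount(keys []int, lo, hi int) int {
	total := 0
	for _, k := range keys {
		if lo <= k && k < hi {
			total++
		}
	}
	return total
}

// makeKeys returns the sorted keys 0, 10, 20, ... for n files.
func makeKeys(n int) []int {
	keys := make([]int, n)
	for i := range keys {
		keys[i] = i * 10
	}
	return keys
}

func main() {
	keys := makeKeys(1000)
	root := build(keys)
	// Both approaches agree; the tree version touches far fewer nodes.
	fmt.Println(root.rangeAnnotation(250, 7500), linearCount(keys, 250, 7500)) // prints "725 725"
}
```

In this sketch the cached counts never need invalidation because the tree is immutable after `build`; a real implementation would also have to decide when cached values may be stored and reused.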

@anish-shanbhag anish-shanbhag force-pushed the range-annotator branch 3 times, most recently from d1f0f5b to 35aa22e Compare July 15, 2024 18:15
@anish-shanbhag anish-shanbhag marked this pull request as ready for review July 15, 2024 18:23
@anish-shanbhag anish-shanbhag requested a review from a team as a code owner July 15, 2024 18:23

@jbowens jbowens left a comment

Do you mind extracting the conversion to generics into its own commit/PR so that it can be reviewed independently?

@anish-shanbhag anish-shanbhag left a comment

Definitely - that PR is now at #3760. I'll rebase this one to just include the range annotation changes once that one is merged.

@anish-shanbhag anish-shanbhag marked this pull request as draft July 15, 2024 20:34
@anish-shanbhag anish-shanbhag changed the title from "manifest: refactor Annotator with generics and add range annotations" to "manifest: add range annotations" Jul 15, 2024
@anish-shanbhag anish-shanbhag force-pushed the range-annotator branch 3 times, most recently from b046563 to 282a3b1 Compare July 26, 2024 18:57
@anish-shanbhag anish-shanbhag marked this pull request as ready for review July 26, 2024 19:05
anish-shanbhag added a commit to anish-shanbhag/pebble that referenced this pull request Jul 30, 2024
cockroachdb#3760 contained a bug that caused Annotator
values to be aggregated incorrectly when pointer values
should be overwritten. This happened because the `vtyped`
variable was on the stack and so was not being modified.

This change fixes the bug and adds a unit test for
`PickFileAggregator` to catch the issue in the future.

cockroachdb#3759 should not be affected, because it
handles aggregation differently.
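
The code from #3760 is not reproduced in this conversation, so the following is only a generic Go illustration of the class of bug the commit message describes, not the actual Pebble code. The `annotation` type and both `pickSmaller` functions are hypothetical. The point is that a type assertion yields a local copy of the stored pointer, and assigning a new pointer to that local variable leaves the stored value untouched.

```go
package main

import "fmt"

// annotation is a hypothetical holder for a cached value of any type.
type annotation struct {
	v any
}

// pickSmallerBuggy tries to overwrite the cached *int with a smaller
// candidate. The assignment only changes the local variable vtyped, which is
// a copy of the pointer stored in a.v, so the cached value never changes.
func pickSmallerBuggy(a *annotation, candidate *int) {
	vtyped, _ := a.v.(*int)
	if vtyped == nil || *candidate < *vtyped {
		vtyped = candidate // lost: a.v still holds the old value
	}
}

// pickSmallerFixed writes the chosen pointer back into the annotation.
func pickSmallerFixed(a *annotation, candidate *int) {
	vtyped, _ := a.v.(*int)
	if vtyped == nil || *candidate < *vtyped {
		a.v = candidate
	}
}

func main() {
	five, three := 5, 3

	buggy := &annotation{}
	pickSmallerBuggy(buggy, &five)
	pickSmallerBuggy(buggy, &three)
	fmt.Println(buggy.v == nil) // prints "true": nothing was ever stored

	fixed := &annotation{}
	pickSmallerFixed(fixed, &five)
	pickSmallerFixed(fixed, &three)
	fmt.Println(*fixed.v.(*int)) // prints "3"
}
```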

@jbowens jbowens left a comment

Cool!

internal/manifest/annotator.go line 104 at r3 (raw file):

	lowerBound []byte,
	// upperBound is a UserKeyBoundary that may be inclusive or exclusive.
	upperBound *base.UserKeyBoundary,

What impact does this have on the existing annotation benchmarks without any ranges?

I expect there is some overhead to combining these two routines, and we might be better off duplicating the function (using mutual recursion when an entire subtree is contained within the bounds). We're also less likely to accidentally begin caching an annotation that does not apply across the node's width.

I think we could also avoid adding a new scratch field to every annotation, and have the caller pass in a *T into which they want the value accumulated. Callers can avoid an allocation by allocating the T with their own data types.

@anish-shanbhag anish-shanbhag left a comment

internal/manifest/annotator.go line 104 at r3 (raw file):

You're right, there was some extra overhead introduced:

goos: darwin
goarch: arm64
pkg: github.com/cockroachdb/pebble/internal/manifest
                     │     old     │                new                 │
                     │   sec/op    │   sec/op     vs base               │
NumFilesAnnotator-10   1.536µ ± 1%   1.664µ ± 1%  +8.33% (p=0.000 n=10)

                     │    old     │                new                │
                     │    B/op    │    B/op     vs base               │
NumFilesAnnotator-10   536.0 ± 0%   537.0 ± 0%  +0.19% (p=0.000 n=10)

                     │    old     │              new               │
                     │ allocs/op  │ allocs/op   vs base            │
NumFilesAnnotator-10   7.000 ± 0%   7.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

Agreed that it's better to separate the two functions - I've updated with that change, and the logic is definitely looking cleaner.

I realized that we can just use a single scratch field for the entire Annotator, rather than one for each annotation. My updated implementation takes that route, but let me know if you think there's still a benefit to allowing callers to pass a custom *T.
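
For readers comparing the two designs discussed here, the following is a rough Go sketch of both: accumulating into a caller-supplied `*T`, and keeping a single scratch value on the Annotator. None of these names come from Pebble; `file`, `stats`, `aggregator`, `statsAgg`, `rangeInto`, and `annotator` are assumptions made for illustration, and a flat slice of files replaces the B-tree traversal.

```go
package main

import "fmt"

// file is a hypothetical stand-in for per-table metadata.
type file struct{ size uint64 }

// stats is the aggregated value type T used in this example.
type stats struct {
	numFiles  int
	totalSize uint64
}

// aggregator is a hypothetical interface, not Pebble's actual API.
type aggregator[T any] interface {
	Zero(dst *T)               // reset dst to the empty aggregate
	Accumulate(f file, dst *T) // fold one file into dst
}

// statsAgg implements aggregator[stats].
type statsAgg struct{}

func (statsAgg) Zero(dst *stats) { *dst = stats{} }

func (statsAgg) Accumulate(f file, dst *stats) {
	dst.numFiles++
	dst.totalSize += f.size
}

// rangeInto is option 1: the caller owns the destination, so the result can
// live on the caller's stack or inside one of its own structs.
func rangeInto[T any](agg aggregator[T], files []file, dst *T) {
	agg.Zero(dst)
	for _, f := range files {
		agg.Accumulate(f, dst)
	}
}

// annotator is option 2: one scratch value is shared by every range query
// made through this annotator.
type annotator[T any] struct {
	agg     aggregator[T]
	scratch T
}

// rangeAnnotation returns a pointer to the scratch value. The next call
// overwrites it, so callers must copy the result if they need to keep it.
func (a *annotator[T]) rangeAnnotation(files []file) *T {
	rangeInto[T](a.agg, files, &a.scratch)
	return &a.scratch
}

func main() {
	files := []file{{size: 10}, {size: 20}, {size: 30}}

	var s stats
	rangeInto[stats](statsAgg{}, files, &s)
	fmt.Println(s.numFiles, s.totalSize) // prints "3 60"

	a := &annotator[stats]{agg: statsAgg{}}
	r := a.rangeAnnotation(files[:2])
	fmt.Println(r.numFiles, r.totalSize) // prints "2 30"
}
```

In this sketch the scratch approach keeps call sites shorter, at the cost that the returned pointer is only valid until the next query and is not safe to share across goroutines without extra synchronization.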

@jbowens jbowens left a comment

:lgtm_strong:

Nice!

Reviewed 1 of 13 files at r2, 1 of 2 files at r4, 1 of 1 files at r5, all commit messages.

@anish-shanbhag anish-shanbhag left a comment

TFTR!

@anish-shanbhag anish-shanbhag merged commit a6d2952 into cockroachdb:master Aug 14, 2024
11 checks passed