Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

db: verify ingest sst bounds are suffixless or adjacent #4126

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

itsbilal
Copy link
Member

We now allow sstables to have range key bounds that have suffixes in them, during ingestion. This is to allow for Cockroach-side defragmentation of range keys. This change ensures that such suffixed range keys are adjacent and defrag cleanly with rangekeys in other ingested sstables, to not cause panics later on in the read path.

Fixes #3829.

@itsbilal itsbilal requested a review from a team as a code owner October 31, 2024 20:15
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@itsbilal itsbilal force-pushed the ingest-verify-defrag-bounds branch 2 times, most recently from 9d48d7e to c00702e Compare October 31, 2024 21:11
@itsbilal
Copy link
Member Author

itsbilal commented Nov 1, 2024

We might need to figure out a way to exclude disaggregated storage sstables from these assertions. Rangekeydels don't play well with the current logic, and are tripping up the metamorphic tests.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 5 files reviewed, 2 unresolved discussions (waiting on @itsbilal and @sumeerbhola)


ingest.go line 245 at r1 (raw file):

// ingestLoad1 creates the FileMetadata for one file. This file will be owned
// by this store.
//

We should move the checking logic to a separate struct

type rangeKeyCheck struct {
  lastRangeKeySpan keyspan.Span
}

func (c *rangeKeyCheck) Check(span *keyspan.Span) error { .. }

We would have a checkNoSuffixGap constructor, or use rangeKeyCheck{} when we don't want any checks.

But I wonder if there is a better way than doing this here. Couldn't we run this check on the metas after we do ingestLoad1? I'm assuming the SmallestRangeKey and LargestRangeKey in each meta would be enough to check for gaps?


ingest.go line 258 at r1 (raw file):

	lastRangeKey keyspan.Span,
	disableRangeKeyChecks bool,
) (*fileMetadata, keyspan.Span, error) {

The Span return looks very random, we should at least name it lastRangeKeySpan

We now allow sstables to have range key bounds that have suffixes
in them, during ingestion. This is to allow for Cockroach-side
defragmentation of range keys. This change ensures that such
suffixed range keys are adjacent and defrag cleanly with rangekeys
in other ingested sstables, to not cause panics later on in the read
path.

Fixes cockroachdb#3829.
Copy link
Member Author

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: 0 of 5 files reviewed, all discussions resolved (waiting on @sumeerbhola)


ingest.go line 245 at r1 (raw file):

Previously, RaduBerinde wrote…

We should move the checking logic to a separate struct

type rangeKeyCheck struct {
  lastRangeKeySpan keyspan.Span
}

func (c *rangeKeyCheck) Check(span *keyspan.Span) error { .. }

We would have a checkNoSuffixGap constructor, or use rangeKeyCheck{} when we don't want any checks.

But I wonder if there is a better way than doing this here. Couldn't we run this check on the metas after we do ingestLoad1? I'm assuming the SmallestRangeKey and LargestRangeKey in each meta would be enough to check for gaps?

We can't look at just the metas, unfortunately. We need to ensure the RangeKeySets and Unsets in the []Key inside the bound spans are equivalent, otherwise we'd have to surface two different fragmented spans and one of those could have a suffix.

Did the struct change though


ingest.go line 258 at r1 (raw file):

Previously, RaduBerinde wrote…

The Span return looks very random, we should at least name it lastRangeKeySpan

Done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ingest: verify ingestion sstables have legal boundary range keys
3 participants