-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conditionally use sequential stored field reader in LuceneSyntheticSourceChangesSnapshot #121636
Changes from 3 commits
6eb67b2
41d4e68
8e9436c
99623e7
161a429
1d0a210
c6cc1e7
78f0b7e
0f8dca1
9a7a3e7
447de83
31b81c0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -83,7 +83,15 @@ public LuceneSyntheticSourceChangesSnapshot( | |
this.maxMemorySizeInBytes = maxMemorySizeInBytes > 0 ? maxMemorySizeInBytes : 1; | ||
this.sourceLoader = mapperService.mappingLookup().newSourceLoader(null, SourceFieldMetrics.NOOP); | ||
Set<String> storedFields = sourceLoader.requiredStoredFields(); | ||
this.storedFieldLoader = StoredFieldLoader.create(false, storedFields); | ||
|
||
// If more than a few ops are requested let's enforce a sequential reader. Typically, thousands of ops may be requested. | ||
// Given how LuceneSyntheticSourceChangesSnapshot accesses stored field, it should always benefit from using sequential reader. | ||
// | ||
// A sequential reader decompresses a block eagerly, so that increasing adjacent doc ids can access stored fields without | ||
// compressing on each StoredFields#document(docId) invocation. The only downside is the last few operations in the request | ||
// seq_no range are at the beginning of a block, which means stored fields for many docs are being decompressed that isn't used. | ||
boolean shouldForceSequentialReader = toSeqNo - fromSeqNo > 10; | ||
this.storedFieldLoader = StoredFieldLoader.create(false, storedFields, shouldForceSequentialReader); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe we should use always force a sequential reader. I think no matter the stored field format, it has benefits given how There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am not entirely sure. Does this not depend upon how jumpy the data is, which with index sorting could be very jumpy - and whether there is more than one stored field per doc (not sure that is always true?)? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right, this is why then changed to logic not look at index.codec setting anymore. |
||
this.lastSeenSeqNo = fromSeqNo - 1; | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this true also for zstd? Or rather would that always decompress the entire thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it both true for zstd or deflate, since both de-compress an entire block.