
Let writers specify the chunk size they want #26

Open
wants to merge 5 commits into base: pkhuong/podman

Conversation

@pkhuong (Collaborator) commented Jan 22, 2024

The first two commits are small cleanups.

The next commit adds an optional base_chunk_size field in the ManifestV1 proto. When this new field is populated, it holds the base chunk size for the chunks referenced by that manifest: all but the last chunk must have that size (the final chunk may be smaller). This new proto field uses the previously unused protobuf field index 10.
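
To make that invariant concrete, here is a minimal reader-side sketch; the function and field names are illustrative, not the actual Verneuil API:

```rust
/// Sketch (hypothetical names) of the check the new field implies: when
/// `base_chunk_size` is populated, every chunk except the last must be
/// exactly that size, and the last chunk may be shorter.
fn chunk_sizes_match_manifest(chunk_sizes: &[u64], base_chunk_size: Option<u64>) -> bool {
    let base = match base_chunk_size {
        // Old manifests leave the field unset; readers keep assuming the
        // historical 64 KiB chunk size.
        None => return true,
        Some(base) => base,
    };

    match chunk_sizes.split_last() {
        None => true, // An empty manifest is trivially consistent.
        Some((last, rest)) => rest.iter().all(|&size| size == base) && *last <= base,
    }
}
```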

Given this infrastructure, it's easy to make the writer's chunk size configurable (with an environment variable for now), and to pipe that to the manifest.

Tested with t/test.sh (i.e., large chunks of the sqlite test suite with validation of the replicated data).

@pkhuong force-pushed the pkhuong/podman branch 2 times, most recently from 16d1f9a to 5dea95c on January 23, 2024 00:54
…nitial

Manifest has grown pretty big, as flagged by clippy.

TESTED=it builds.
The code so far assumes that readers recognize the fingerprint for
a full-sized chunk that's all zero-filled.  The chunk size is currently
constant, 64 KiB.  Decouple that constant from the read and write side
logic, since it's really become part of the replication protocol.

TESTED=partial test.sh
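
A rough sketch of what the decoupling amounts to; the read-side constant name is invented for the example, and the actual code layout may differ:

```rust
// Illustrative only: the single shared 64 KiB value becomes one constant per
// side of the protocol, so the write side can later be made configurable
// without changing what readers assume about already-replicated data.

/// Write side: granularity at which the tracker snapshots the database.
pub const SNAPSHOT_GRANULARITY: u64 = 64 * 1024;

/// Read side: chunk size assumed for manifests that don't state one;
/// this is effectively part of the replication protocol.
pub const DEFAULT_READ_CHUNK_SIZE: u64 = 64 * 1024;
```
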
…buf index 10)

Still with a default of 64 KiB, except with a differently named constant.

The (write) tracker's SNAPSHOT_GRANULARITY constant is now WRITE_SNAPSHOT_GRANULARITY,
and the patch shows that we only had to update write-side code or
constants used for performance heuristics.

On the write side, we stick in the default `None`: this lets us
(temporarily) test what would happen if a new reader accessed old
manifests without this new field.

TESTED=partial test.sh
Hide WRITE_SNAPSHOT_GRANULARITY behind a function, and rename the
constant to `DEFAULT_WRITE_SNAPSHOT_GRANULARITY`, to enforce using
the function.

The Tracker object now keeps track of its snapshot granularity (i.e.,
chunk size, except for the last chunk which may be shorter) as a field...
and pipes it all the way to the serialised ManifestV1.

Manual inspection shows that the remaining references (many of which
also transitively became functions) are performance heuristics
(copier.rs, unzstd.rs); those also apply the current default as a
minimum value, to avoid behaving oddly when the actual chunk size is
eventually configured much smaller than expected.

TESTED=it builds
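
A sketch of the shape this takes; the constant, env-var default, and manifest field follow the commit text, while the function, field, and method names are hypothetical:

```rust
/// Default write-side chunk size; callers go through the accessor below
/// rather than reading the constant directly.
const DEFAULT_WRITE_SNAPSHOT_GRANULARITY: u64 = 64 * 1024;

/// Hypothetical accessor: later commits make this consult configuration
/// instead of always returning the default.
fn write_snapshot_granularity() -> u64 {
    DEFAULT_WRITE_SNAPSHOT_GRANULARITY
}

/// Sketch of the Tracker carrying its own granularity and piping it into
/// the manifest it serialises.
struct Tracker {
    snapshot_granularity: u64,
}

impl Tracker {
    fn new() -> Tracker {
        Tracker { snapshot_granularity: write_snapshot_granularity() }
    }

    fn manifest_base_chunk_size(&self) -> Option<u64> {
        Some(self.snapshot_granularity)
    }
}

/// Performance heuristics (e.g., buffer sizing in copier.rs / unzstd.rs)
/// use at least the default, so a much smaller configured chunk size
/// doesn't shrink them pathologically.
fn heuristic_buffer_size(granularity: u64) -> u64 {
    granularity.max(DEFAULT_WRITE_SNAPSHOT_GRANULARITY)
}
```
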
with the VERNEUIL_WRITE_SNAPSHOT_GRANULARITY environment variable.

The value must be strictly positive, and ideally not too small;
the code currently imposes a min value of 64 bytes.

This is currently an environment variable because there's no clear need to
make it configurable via JSON, but that shouldn't be hard to change:
the Tracker object keeps the current value in an instance field.

TESTED=test.sh, w/o size override and w/ `export VERNEUIL_WRITE_SNAPSHOT_GRANULARITY={16384,196608}`
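
A minimal sketch of the environment-variable handling described above, assuming values below the floor are clamped; the crate's real parsing and error handling may differ:

```rust
use std::env;

/// Per the commit text: the configured value must be positive and is
/// currently floored at 64 bytes.
const MIN_WRITE_SNAPSHOT_GRANULARITY: u64 = 64;
const DEFAULT_WRITE_SNAPSHOT_GRANULARITY: u64 = 64 * 1024;

/// Hypothetical lookup: read VERNEUIL_WRITE_SNAPSHOT_GRANULARITY, fall back
/// to the default when unset or invalid, and clamp small values to the floor.
fn write_snapshot_granularity() -> u64 {
    env::var("VERNEUIL_WRITE_SNAPSHOT_GRANULARITY")
        .ok()
        .and_then(|raw| raw.parse::<u64>().ok())
        .filter(|&size| size > 0)
        .map(|size| size.max(MIN_WRITE_SNAPSHOT_GRANULARITY))
        .unwrap_or(DEFAULT_WRITE_SNAPSHOT_GRANULARITY)
}
```

For example, `export VERNEUIL_WRITE_SNAPSHOT_GRANULARITY=196608` before running test.sh corresponds to the larger of the two values in the TESTED line (192 KiB chunks).
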
@ahicks92 (Contributor) commented Feb 6, 2024

I'll try to get back to this if @jpihlaja-bt doesn't beat me to it. What is the purpose, though? I don't understand what we gain unless it's that you want to change the SQLite page size and are hoping that chunks match it.

@pkhuong (Collaborator, Author) commented Feb 8, 2024

What is the purpose, though?

There's already only a weak relationship between SQLite's page size and Verneuil's chunk size: we must do the right thing for SQLite page sizes smaller than 64 KiB (e.g., the default value). Making the chunk size tweakable (usually larger, I expect) lets us improve the aggregate cost of operations when IOPS are more expensive than storage; that's particularly interesting for read-heavy workloads on mostly static data, where we really want megabyte-sized chunks.
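
As a back-of-the-envelope illustration (the numbers are made up for the example, not measured): fetching a 1 GiB snapshot takes 16,384 object fetches with 64 KiB chunks but only 1,024 with 1 MiB chunks, so when per-request cost dominates storage cost, larger chunks win.

```rust
// Illustration only: request count versus chunk size for a fixed snapshot.
fn fetches_needed(snapshot_bytes: u64, chunk_size: u64) -> u64 {
    snapshot_bytes.div_ceil(chunk_size)
}

fn main() {
    let snapshot = 1u64 << 30; // a 1 GiB, mostly static database
    for chunk_size in [64u64 << 10, 1 << 20] {
        println!(
            "{:>7}-byte chunks -> {:>5} fetches",
            chunk_size,
            fetches_needed(snapshot, chunk_size)
        );
    }
}
```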

I think the final state might look like a special case for the root page, and smaller pages for B-tree leaves (e.g., after the first ~ sqrt(n) bytes of data), but a single tunable is already useful.
