CStore File Layout
Hadi Moshayedi edited this page Aug 5, 2014
·
3 revisions
Each table has two files:
- Table data file. Contains the table data and the table skip lists, which are used to skip blocks that don't match the query’s WHERE clause. If the "filename" option is specified for the foreign table, the data file is created at the specified path. Otherwise, it is created automatically at $PGDATA/$dboid/$relfilenode.
- Table footer file. Contains the file offset and length of each stripe in the data file. The footer filename is formed by appending the ".footer" suffix to the data filename.
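The naming rules above can be sketched in a few lines of Python. This is only an illustration of the rules as stated on this page, not code from cstore_fdw; the function name `cstore_file_paths` and its parameters are hypothetical.

```python
import os


def cstore_file_paths(pgdata, dboid, relfilenode, filename=None):
    """Return (data_path, footer_path) for a cstore table.

    Hypothetical helper: it mirrors the rules described on this page and
    is not part of cstore_fdw itself.
    """
    if filename is not None:
        # The "filename" foreign table option overrides the default location.
        data_path = filename
    else:
        # Default location as described above: $PGDATA/$dboid/$relfilenode
        data_path = os.path.join(pgdata, str(dboid), str(relfilenode))
    # The footer file always sits next to the data file with a ".footer" suffix.
    return data_path, data_path + ".footer"
```

For example, calling `cstore_file_paths(pgdata, 16384, 24576, filename="/tmp/t.cstore")` yields `/tmp/t.cstore` and `/tmp/t.cstore.footer`.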
The layout of the data file can be summarized in the following diagram:
Row Stripes. Data in the data file is divided into row stripes. The number of rows per stripe is configurable through the stripe_row_count fdw option. Each stripe has three sections:
- Stripe Skip List. Contains statistics (min/max values and positions; see the structs in cstore_fdw.h) for each column block in the stripe, so that blocks refuted by the WHERE clause can be skipped without being read.
- Stripe Data. For each column block, two streams are stored: an "exists" stream and a "value" stream. The "exists" stream is a boolean stream that marks which values are not NULL. The "value" stream contains only the non-NULL values. If compression is enabled, the "value" stream is compressed. To enable compression, set the "compression" fdw option to "pglz". cstore_fdw uses PostgreSQL's "Datum" representation to store values on disk.
- Stripe Footer. Contains the length of each stream in the "Stripe Skip List" and "Stripe Data" sections.
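To make the stripe structure more concrete, here is a minimal Python sketch of the ideas above: splitting rows into stripes, separating a column block into "exists" and "value" streams, and using min/max statistics to decide whether a block can be skipped. It is a simplified model, not the on-disk format: the real streams hold Datum bytes (optionally pglz-compressed), and all function names here are hypothetical.

```python
def split_into_stripes(rows, stripe_row_count):
    """Group rows into stripes of at most stripe_row_count rows each."""
    return [rows[i:i + stripe_row_count]
            for i in range(0, len(rows), stripe_row_count)]


def encode_column_block(values):
    """Split one column block into an "exists" stream and a "value" stream.

    Returns (exists, present, skip_entry), where skip_entry holds the
    min/max statistics that a skip list would record for this block.
    """
    exists = [v is not None for v in values]       # True where value is not NULL
    present = [v for v in values if v is not None]  # non-NULL values only
    skip_entry = {"min": min(present), "max": max(present)} if present else None
    return exists, present, skip_entry


def block_may_match_equals(skip_entry, constant):
    """Return False when "column = constant" is refuted by the statistics."""
    if skip_entry is None:
        # Block contains only NULLs, so an equality test can never be true.
        return False
    return skip_entry["min"] <= constant <= skip_entry["max"]
```

With a block of `[3, None, 7]`, the "exists" stream is `[True, False, True]`, the "value" stream is `[3, 7]`, and a filter such as `column = 10` lets the reader skip the block because 10 lies outside the recorded range 3 to 7.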
The footer file has three sections:
- Table Footer. Contains the file offset and the lengths of the different parts of each stripe. This information is used to locate and start reading a stripe.
- Postscript. Contains the length of the "Table Footer", the file signature, and the version.
- Postscript Size. The last byte of the file, which gives the size of the Postscript and is used to start reading it.
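Reading the footer file therefore proceeds backwards from the end. The Python sketch below illustrates that order under two assumptions: the three sections are stored contiguously in the order listed above, and the caller supplies a decoder for the Postscript. The real Postscript is a serialized message defined in cstore.proto, so `read_table_footer_length` is a hypothetical stand-in for that decoding step, and `split_footer_file` is not a cstore_fdw function.

```python
def split_footer_file(footer_bytes, read_table_footer_length):
    """Split the footer file contents into (table_footer, postscript) bytes.

    read_table_footer_length is a caller-supplied (hypothetical) function
    that extracts the Table Footer length from the raw Postscript bytes.
    """
    if not footer_bytes:
        raise ValueError("footer file is empty")

    # 1. The last byte of the file is the Postscript size.
    postscript_size = footer_bytes[-1]
    postscript_end = len(footer_bytes) - 1
    postscript_start = postscript_end - postscript_size
    if postscript_start < 0:
        raise ValueError("postscript size exceeds file length")
    postscript = footer_bytes[postscript_start:postscript_end]

    # 2. The Postscript tells us how long the Table Footer is.
    table_footer_length = read_table_footer_length(postscript)
    table_footer_start = postscript_start - table_footer_length
    if table_footer_start < 0:
        raise ValueError("table footer length exceeds file length")

    # 3. The Table Footer is assumed to sit directly before the Postscript.
    table_footer = footer_bytes[table_footer_start:postscript_start]
    return table_footer, postscript
```

Because the Postscript Size is a single byte, a reader can always find the Postscript without knowing anything else about the file, and the Postscript then leads to the Table Footer.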
Please refer to cstore_fdw.h and cstore.proto to get a better idea about the layout of the file.