
Column Store Adaptivity

Anil Shanbhag edited this page Nov 10, 2015 · 6 revisions

Many companies, such as Facebook and Bing, use block-based storage systems for their data warehouses. Many other companies use column stores as their data warehouses. The goal is to figure out what it means to have an adaptive column store.

Parquet is the columnar storage format used. It is marketed as a "column store for Hadoop".

Q. Figure out how a Parquet file ends up stored on HDFS. It seems like we still need to organize it as blocks.
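One way to start reasoning about this, as a rough sketch rather than an answer: HDFS splits any file into fixed-size blocks by byte offset, regardless of format, and a Parquet file is laid out as a sequence of row groups, each holding one column chunk per column. The snippet below computes which HDFS blocks each row group would touch. The row group sizes, the 128 MB block size, and the function names are all assumptions made up for illustration, and the footer is ignored.

```python
def blocks_spanned(offset, length, block_size):
    """Return the indices of the fixed-size blocks that a byte range touches."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))


def row_group_layout(row_group_sizes, block_size, start_offset=4):
    """Map consecutive row groups to the blocks they fall in.

    start_offset=4 skips the 4-byte "PAR1" magic at the start of a Parquet file.
    """
    layout = []
    offset = start_offset
    for size in row_group_sizes:
        layout.append(blocks_spanned(offset, size, block_size))
        offset += size
    return layout


MB = 1024 * 1024
# Hypothetical file: three row groups of 100 MB each, stored in 128 MB blocks.
print(row_group_layout([100 * MB, 100 * MB, 100 * MB], 128 * MB))
# → [[0], [0, 1], [1, 2]]
```

In this made-up layout the second and third row groups straddle a block boundary, so reading either one touches two blocks. This is why the Parquet documentation suggests sizing row groups so that a whole row group fits in one HDFS block.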

Presentation that describes the storage format of Parquet: link
