Releases: mjakubowski84/parquet4s
v0.8.0
This release brings you two things:
- IncrementalParquetWriter, contributed by @alexklibisz, lets you keep writing data to a single file until you decide to close it.
- A fix for a bug found by @darmbrus: HadoopConfiguration was not processed by ParquetReader, so options passed to the reader programmatically had virtually no effect. We are sorry about that, but it is fixed now!
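With the fix in place, passing a programmatically built Hadoop configuration to the reader should behave as expected. A minimal sketch, assuming the 0.8.0-era API (the `ParquetReader.Options` parameter names and the configuration keys shown are illustrative and may differ in later versions):

```scala
import org.apache.hadoop.conf.Configuration
import com.github.mjakubowski84.parquet4s.ParquetReader

case class User(id: Long, name: String)

// Build Hadoop configuration in code, e.g. filesystem settings
val conf = new Configuration()
conf.set("fs.defaultFS", "file:///")

// The options are now honoured by the reader
val users = ParquetReader.read[User](
  "/tmp/users.parquet",
  ParquetReader.Options(hadoopConf = conf)
)
try users.foreach(println)
finally users.close()
```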
v0.7.0
This release focuses on giving developers better tooling for writing Parquet files with data coming from indefinite streams, that is, streams whose source reads data continuously, e.g. from a message broker such as Kafka. A new handy Sink with plenty of configuration parameters has been added to the akka module.
An example application is included in the project sources; please have a read!
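A rough sketch of how the new sink might be wired into an indefinite stream. This is an illustration only: the function name `toParquetIndefinite` and its parameters follow my recollection of the 0.7.0-era API and may not match exactly; please consult the example application in the project sources for the authoritative usage.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Source
import com.github.mjakubowski84.parquet4s.ParquetStreams
import scala.concurrent.duration._

case class Event(id: Long, payload: String)

implicit val system: ActorSystem = ActorSystem()
implicit val materializer: ActorMaterializer = ActorMaterializer()

// A stand-in for an indefinite source, e.g. a Kafka consumer
val events: Source[Event, _] = Source.repeat(Event(1L, "ping"))

// Sink that rotates Parquet files by record count and time window;
// parameter names are illustrative, not guaranteed to match the release
events.runWith(
  ParquetStreams.toParquetIndefinite(
    path = "/tmp/events",
    maxChunkSize = 1024,
    chunkWriteTimeWindow = 30.seconds
  )
)
```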
v0.6.0
This small release contains just a single improvement, though some may find it useful: it allows developers to define Hadoop configuration programmatically. Thanks go to @aravindr18 and @mac01021, who proposed and implemented the change.
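In practice this means a Hadoop `Configuration` can be built in code instead of relying on XML files on the classpath. A minimal sketch, assuming the 0.6.0-era writer API (the `ParquetWriter.Options` field name `hadoopConf` is an assumption and may differ):

```scala
import org.apache.hadoop.conf.Configuration
import com.github.mjakubowski84.parquet4s.ParquetWriter

case class Metric(name: String, value: Double)

// Hadoop configuration defined programmatically
val conf = new Configuration()
conf.set("fs.defaultFS", "file:///")

// Pass it through the writer options; option name is illustrative
ParquetWriter.write(
  "/tmp/metrics.parquet",
  Seq(Metric("cpu", 0.42)),
  ParquetWriter.Options(hadoopConf = conf)
)
```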
v0.5.0
Version 0.5.0 brings many changes; the most notable makes the library more lightweight in terms of dependencies and allows (and requires) developers to define their own version of Hadoop.
Full list of changes:
Bug fixes:
- Thanks to @mac01021 we found and fixed a bug that in some cases prevented resolving schemas with nested case classes, making it impossible to write them to Parquet.
Improvements:
- In answer to @rtoomey-coatue's request, `slf4j` dependencies were cleaned up and `hadoop-client` was marked as `provided`. This allows developers to define their own version of Hadoop and avoid solving dependency conflicts. The library is more lightweight now.
- Support for the following types was added:
- Byte
- Short
- Char
- BigDecimal
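Since `hadoop-client` is no longer pulled in transitively, applications must now declare their own Hadoop dependency. A build.sbt sketch; the version numbers are illustrative, pick the ones matching your environment:

```scala
// build.sbt
libraryDependencies ++= Seq(
  "com.github.mjakubowski84" %% "parquet4s-core" % "0.5.0",
  // choose your own Hadoop version now that it is no longer transitive
  "org.apache.hadoop" % "hadoop-client" % "2.9.2"
)
```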
Many thanks to all contributors!
v0.4.0
v0.3.0
This is a big release that, apart from introducing new major features, aims at stability and improved compatibility with Apache Spark:
- Most important feature: ability to write Parquet files both with core library and Akka Streams
- Improved compatibility with Apache Spark Parquet: focus mostly on handling null values and scala.Option
- Better support for complex trees of Scala collections
- Custom implementation of handling java.sql.Date and java.sql.Timestamp: still compatible with Apache Spark but written from scratch, so there is no longer a need to include the Apache License notice
- Improved type class definitions so that users do not have to import implicits
- ParquetReader type class introduced
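The new writing capability and the Spark-compatible handling of Option can be sketched as follows, assuming the 0.3.0-era core API (exact signatures may have evolved in later releases):

```scala
import com.github.mjakubowski84.parquet4s.{ParquetReader, ParquetWriter}

case class User(id: Long, name: Option[String])

// Option fields map to optional Parquet columns; None becomes null,
// which keeps the files readable by Apache Spark
val users = Seq(User(1L, Some("Ann")), User(2L, None))

// Write with the core library, no Akka required
ParquetWriter.write("/tmp/users.parquet", users)

// Read back; since 0.3.0 no extra implicit imports are needed
val readBack = ParquetReader.read[User]("/tmp/users.parquet")
try readBack.foreach(println)
finally readBack.close()
```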
Integration with Akka Streams
The library is now divided into two modules: the core module contains the original Parquet reader, while the new akka module provides the integration with Akka Streams.