Releases: mjakubowski84/parquet4s
v0.8.0
This release brings you two things:
- IncrementalParquetWriter, contributed by @alexklibisz, lets you keep writing data to a single file until you decide to close it.
- A fix for a bug found by @darmbrus: HadoopConfiguration was not processed by ParquetReader, so options passed to the reader programmatically had virtually no effect. We are sorry about that, but it is fixed now!
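With the fix in place, passing a programmatically built Hadoop configuration to the reader should behave as expected. A minimal sketch, assuming the 0.8.0-era API (the `ParquetReader.Options` parameter names and the configuration keys shown are illustrative and may differ in later versions):

```scala
import org.apache.hadoop.conf.Configuration
import com.github.mjakubowski84.parquet4s.ParquetReader

case class User(id: Long, name: String)

// Build Hadoop configuration in code, e.g. filesystem settings
val conf = new Configuration()
conf.set("fs.defaultFS", "file:///")

// The options are now honoured by the reader
val users = ParquetReader.read[User](
  "/tmp/users.parquet",
  ParquetReader.Options(hadoopConf = conf)
)
try users.foreach(println)
finally users.close()
```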
v0.7.0
This release focuses on giving developers better tooling for writing Parquet files with data coming from indefinite streams, that is, streams whose source reads data continuously, e.g. from a message broker such as Kafka. A new handy Sink with plenty of configuration parameters has been added to the akka module.
An example application is included in the project sources; please have a read!
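A rough sketch of how the new sink might be wired into an indefinite stream. This is an illustration only: the function name `toParquetIndefinite` and its parameters follow my recollection of the 0.7.0-era API and may not match exactly; please consult the example application in the project sources for the authoritative usage.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Source
import com.github.mjakubowski84.parquet4s.ParquetStreams
import scala.concurrent.duration._

case class Event(id: Long, payload: String)

implicit val system: ActorSystem = ActorSystem()
implicit val materializer: ActorMaterializer = ActorMaterializer()

// A stand-in for an indefinite source, e.g. a Kafka consumer
val events: Source[Event, _] = Source.repeat(Event(1L, "ping"))

// Sink that rotates Parquet files by record count and time window;
// parameter names are illustrative, not guaranteed to match the release
events.runWith(
  ParquetStreams.toParquetIndefinite(
    path = "/tmp/events",
    maxChunkSize = 1024,
    chunkWriteTimeWindow = 30.seconds
  )
)
```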
v0.6.0
This small release contains just a single improvement, though some may find it useful: it allows developers to define Hadoop configuration programmatically. Thanks go to @aravindr18 and @mac01021, who proposed and implemented the change.
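In practice this means a Hadoop `Configuration` can be built in code instead of relying on XML files on the classpath. A minimal sketch, assuming the 0.6.0-era writer API (the `ParquetWriter.Options` field name `hadoopConf` is an assumption and may differ):

```scala
import org.apache.hadoop.conf.Configuration
import com.github.mjakubowski84.parquet4s.ParquetWriter

case class Metric(name: String, value: Double)

// Hadoop configuration defined programmatically
val conf = new Configuration()
conf.set("fs.defaultFS", "file:///")

// Pass it through the writer options; option name is illustrative
ParquetWriter.write(
  "/tmp/metrics.parquet",
  Seq(Metric("cpu", 0.42)),
  ParquetWriter.Options(hadoopConf = conf)
)
```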
v0.5.0
Version 0.5.0 brings many changes; the most notable makes the library more lightweight in terms of dependencies and allows (and requires) developers to define their own version of Hadoop.
Full list of changes:
Bug fixes:
- Thanks to @mac01021 we found and fixed a bug that in some cases prevented resolving schemas with nested case classes, making it impossible to write them to Parquet.
Improvements:
- In answer to @rtoomey-coatue's request, `slf4j` dependencies were cleaned up and `hadoop-client` was marked as `provided`. This allows developers to define their own version of Hadoop and avoid solving dependency conflicts. The library is more lightweight now.
- Support for the following types was added:
- Byte
- Short
- Char
- BigDecimal
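Since `hadoop-client` is no longer pulled in transitively, applications must now declare their own Hadoop dependency. A build.sbt sketch; the version numbers are illustrative, pick the ones matching your environment:

```scala
// build.sbt
libraryDependencies ++= Seq(
  "com.github.mjakubowski84" %% "parquet4s-core" % "0.5.0",
  // choose your own Hadoop version now that it is no longer transitive
  "org.apache.hadoop" % "hadoop-client" % "2.9.2"
)
```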
Many thanks to all contributors!
v0.4.0
v0.3.0
This is a big release that, apart from introducing new major features, aims at stability and improved compatibility with Apache Spark:
- Most important feature: ability to write Parquet files both with core library and Akka Streams
- Improved compatibility with Apache Spark Parquet: focus mostly on handling null values and scala.Option
- Better support for complex trees of Scala collections
- Custom implementation of handling java.sql.Date and java.sql.Timestamp: still compatible with Apache Spark but written from scratch, so there is no longer a need to include the Apache License notice
- Improved type class definitions so that users do not have to import implicits
- ParquetReader type class introduced
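The new writing capability and the Spark-compatible handling of Option can be sketched as follows, assuming the 0.3.0-era core API (exact signatures may have evolved in later releases):

```scala
import com.github.mjakubowski84.parquet4s.{ParquetReader, ParquetWriter}

case class User(id: Long, name: Option[String])

// Option fields map to optional Parquet columns; None becomes null,
// which keeps the files readable by Apache Spark
val users = Seq(User(1L, Some("Ann")), User(2L, None))

// Write with the core library, no Akka required
ParquetWriter.write("/tmp/users.parquet", users)

// Read back; since 0.3.0 no extra implicit imports are needed
val readBack = ParquetReader.read[User]("/tmp/users.parquet")
try readBack.foreach(println)
finally readBack.close()
```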
Integration with Akka Streams
The library is now divided into two modules: the core module contains the original Parquet reader, while the new akka module provides the integration with Akka Streams.