refactor of initial plugin #1

Merged 4 commits on Sep 2, 2024

jagedn
Contributor

@jagedn jagedn commented Aug 30, 2024

  • upgrade to nextflow 24
  • use carpet lib
  • refactor splitParquet
  • some validation use cases

This is an initial upgrade of the plugin. It uses an open source library (carpet) to read and write parquet files, avoiding the complexity of working with a low-level library.

Carpet can read parquet files as a Map, but you can also provide a Java Record class as a projection so that the library reads only the fields present in the Record.
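
To illustrate the projection idea, here is a minimal, self-contained sketch in plain Java. It does not use Carpet or read a parquet file: the `ProjectionSketch` class, the `Person` record, the `project` helper, and the sample row are all hypothetical, and the row is an in-memory Map standing in for a parquet row. It only shows how a Record's components can decide which fields are kept.

```java
import java.lang.reflect.RecordComponent;
import java.util.LinkedHashMap;
import java.util.Map;

public class ProjectionSketch {

    // Hypothetical projection: only these two columns are of interest.
    public record Person(String name, Integer age) {}

    // Builds a record from a full row, copying only the entries whose keys
    // match a record component. This is an illustration of the concept,
    // not Carpet's implementation.
    static <T extends Record> T project(Map<String, Object> row, Class<T> type) {
        RecordComponent[] components = type.getRecordComponents();
        Class<?>[] paramTypes = new Class<?>[components.length];
        Object[] args = new Object[components.length];
        for (int i = 0; i < components.length; i++) {
            paramTypes[i] = components[i].getType();
            // Columns not declared in the record are never looked up.
            args[i] = row.get(components[i].getName());
        }
        try {
            // The canonical constructor takes the components in declaration order.
            return type.getDeclaredConstructor(paramTypes).newInstance(args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot build " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for one row of a file with three columns.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", "Ada");
        row.put("age", 36);
        row.put("email", "ada@example.org");

        Person person = project(row, Person.class);
        System.out.println(person); // prints "Person[name=Ada, age=36]"
    }
}
```

The `email` entry is ignored because `Person` does not declare it. A real parquet reader can go further and skip the unwanted columns on disk, which is where a projection saves I/O.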

This PR contains a validation folder with a few small parquet files to validate the plugin (tests will be added in upcoming PRs). You can see it in action by following these steps:

./gradlew copyPluginZip
cd validation
NXF_PLUGINS_DIR=$(pwd)/../build/plugins nextflow run read_raw.nf 
NXF_PLUGINS_DIR=$(pwd)/../build/plugins nextflow run read.nf 
NXF_PLUGINS_DIR=$(pwd)/../build/plugins nextflow run read_complex.nf 

@pditommaso
Member

Great Jorge, the README is still valid?

@jagedn
Contributor Author

jagedn commented Sep 2, 2024

> Great Jorge, the README is still valid?

Yes, the method names are the same (although the write method is empty for the moment).

@pditommaso pditommaso merged commit c87b67b into nextflow-io:master Sep 2, 2024
1 check passed
@pditommaso
Member

Awesome!

@jerolba

jerolba commented Sep 3, 2024

@jagedn FYI: just yesterday I released version 0.2.0:
https://mvnrepository.com/artifact/com.jerolba/carpet-record/0.2.0
Relevant changes:

  • upgrade to Parquet 1.14.2
  • review the transitive dependency exclusions if you are not using Hadoop (saves 2.5 MB)
