Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split optional dependencies in pyproject.toml #309

Merged
merged 18 commits into from
Feb 3, 2025
Merged

Conversation

TomNicholas
Copy link
Member

@TomNicholas TomNicholas commented Nov 19, 2024

Takes the splitting of optional dependencies idea started in #87 to its logical conclusion.

  • Changes are documented in docs/releases.rst
  • New functionality has documentation

pyproject.toml Outdated
Comment on lines 72 to 74
all = [
"virtualizarr[all_readers]",
"virtualizarr[icechunk]",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now I don't think this can possibly work, because icechunk requires zarr-python v3 but kerchunk requires <3.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure we should change the public pip install to use some random branch...

Aside: Is there a reason you are using virtualizarr main there? Would it be helpful to issue another release?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure we should change the public pip install to use some random branch...

sorry, I wasn't suggesting you should use that approach, but just was sharing for information about what is/isn't possible. I should've been more clear.

Aside: Is there a reason you are using virtualizarr main there? Would it be helpful to issue another release?

I wanted strict type hinting (#306)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now I don't think this can possibly work, because icechunk requires zarr-python v3 but kerchunk requires <3.

Now that #412 is in this can work again, so we should push this PR forward too. But I'm not sure whether we should do this one before or after your Pixi PR #407 @maxrjones ?

Copy link
Member Author

@TomNicholas TomNicholas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of this is currently tested because the CI uses conda envs rather than pip to install deps. We could change that though (FYI @maxrjones).

However we don't really want one env and corresponding CI job for every single reader, it's fine to test them all at once, but that means we still wouldn't really be testing the ability to do pip install virtualizarr[netcdf3] specifically for example.

Also @norlandrhagen fastparquet is in the CI but doesn't seem to be used anywhere?

@maxrjones
Copy link
Member

maxrjones commented Nov 19, 2024

I could change the CI to use pip. I have been using uv a bit recently and it's been super promising for managing deps, but would be a bigger change.

@cisaacstern
Copy link
Collaborator

without much context, obligatory pixi endorsement comment

@TomNicholas
Copy link
Member Author

I think the CI failure is because the mypy CI job is apparently the only one that uses pip to install currently

@norlandrhagen
Copy link
Collaborator

norlandrhagen commented Nov 19, 2024

Also @norlandrhagen fastparquet is in the CI but doesn't seem to be used anywhere?

I think it's a hidden requirement of using the Kerchunk parquet format.

@TomNicholas
Copy link
Member Author

I think it's a hidden requirement of using the Kerchunk parquet format.

So does that mean we should have another writer split out in these dependencies?

@maxrjones
Copy link
Member

maxrjones commented Nov 19, 2024

without much context, obligatory pixi endorsement comment

emerging from a rabbit hole (thanks @cisaacstern 😂) I'm floating around this decision framework for feedback:

  • If you're starting a new Python package, use uv
  • If you need to manage multiple Python virtual environments at a time, also use hatch
  • If you need Conda-Forge in addition to PyPI distributions, also use pixi

More on topic, feel welcome to ping me if you'd like to assign any CI / packaging updates.

@norlandrhagen
Copy link
Collaborator

So does that mean we should have another writer split out in these dependencies?

Probably! Also, maybe for reading existing Kerchunk parquet refs.

@TomNicholas
Copy link
Member Author

Probably! Also, maybe for reading existing Kerchunk parquet refs.

Does reading existing Kerchunk parquet refs require the same additional dependencies? If so we could have kerchunk_format = ["fastparquet"] as another group?

@norlandrhagen
Copy link
Collaborator

Does reading existing Kerchunk parquet refs require the same additional dependencies? If so we could have kerchunk_format = ["fastparquet"] as another group?

I just checked, both reading the writing of Kerchunk Parquet refs require fastparquet.

@TomNicholas TomNicholas mentioned this pull request Nov 20, 2024
@TomNicholas
Copy link
Member Author

In a633a0c I embedded the pyproject.toml into the installation docs page so that the list of pip-installable optional dependencies always stays up-to-date. See

https://virtualizarr--309.org.readthedocs.build/en/309/installation.html

@TomNicholas
Copy link
Member Author

note to self: add various parallel options to the splitting here once #349 is in, because that adds optional dependencies on lithops and dask

@maxrjones
Copy link
Member

FYI there's some redundancy between this PR and #407

@TomNicholas
Copy link
Member Author

FYI there's some redundancy between this PR and #407

Yep - which one shall we try to merge first? I think this one can be done very quickly, but I don't know if it will fully work without what you're doing in #407?

@maxrjones
Copy link
Member

FYI there's some redundancy between this PR and #407

Yep - which one shall we try to merge first? I think this one can be done very quickly, but I don't know if it will fully work without what you're doing in #407?

I recommend merging this first. It should not depend on anything in #407.

Copy link

codecov bot commented Feb 3, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.01%. Comparing base (14f58ba) to head (d79e348).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #309      +/-   ##
==========================================
- Coverage   84.17%   84.01%   -0.17%     
==========================================
  Files          31       31              
  Lines        1839     1839              
==========================================
- Hits         1548     1545       -3     
- Misses        291      294       +3     

see 1 file with indirect coverage changes

pyproject.toml Outdated Show resolved Hide resolved
Co-authored-by: Max Jones <[email protected]>
pyproject.toml Outdated
Comment on lines 62 to 71
# un-implemented readers
# tiff = [
# "virtualizarr[remote]",
# "kerchunk>=0.2.8",
# "tifffile",
# ]
# grib = [
# "virtualizarr[remote]",
# "kerchunk>=0.2.8",
# ]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These commented-out lines are going to appear in the docs - https://virtualizarr--309.org.readthedocs.build/en/309/installation.html#optional-dependencies.

Should I (a) remove them or (b) add issue links to encourage people to contribute those readers?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove them because I don't think it's a certainty that those readers will be built based on kerchunk and its dependencies. e.g., I have it on my roadmap for the next 2 months to implement a reader based on https://github.com/developmentseed/aiocogeo-rs/.

@TomNicholas TomNicholas merged commit 24a4582 into main Feb 3, 2025
11 of 12 checks passed
@TomNicholas TomNicholas deleted the split_optional_deps branch February 3, 2025 22:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Updates a dependency Packaging
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants