Vendor kerchunk readers? #377
Comments
@martindurant offered to work on kerchunk <-> zarr-python v3 compatibility over in fsspec/kerchunk#516 (comment). While I understand there may be times in which vendoring is necessary, I'm a -1 on this approach unless we first try to resolve any communication gaps about the urgency of kerchunk adopting zarr-python v3 for downstream packages.
@TomNicholas understanding that the ship has sailed for kerchunk compatibility prior to the zarr-python v3 release, can you provide a timeline on which you'd wait on Martin finishing the work started in fsspec/kerchunk#516? e.g., a day, one week, two weeks? This would help guide the conversation of whether an extreme solution like vendoring most of kerchunk's reader code is necessary. Thanks for your continuous work on making sure this ecosystem moves as fast as possible!
@martindurant, would you be able to offer some insight into your potential timeline for working on fsspec/kerchunk#516? Thank you for offering to take that up - it will be really impactful for people wanting to use Kerchunk and VirtualiZarr with Icechunk! |
I am assessing now. The combine module will take some time for sure, but that's not necessary for Vzarr. |
@maxrjones that's totally fair, and thank you for doing the work of reaching back out again here. I had not seen your comment on @mpiannucci's branch (now subscribed) when I raised this issue - the last activity I had seen was in November. Obviously if @martindurant is able to get kerchunk updated to work with zarr-python v3 then that would be perfect! The suggestion in this issue is that by vendoring (a relatively small amount of) code we could remove any time pressure on Martin, and move forwards ourselves.
Exactly - in VirtualiZarr we're currently blocked by components we don't actually need.
I'm unlikely to have time to actually write code for this library for the next 3 weeks, but after that I want to prioritize pushing it forwards, so that people can start getting their data into Icechunk without unnecessary friction. So let's say a month? If it doesn't work by then I would prefer to just vendor things so we can move forwards. Obviously if kerchunk compatibility with v3 is achieved sometime after that we can still easily un-vendor and re-introduce the optional dependency (which is the ideal situation). |
Closing with the plan to wait on a kerchunk release rather than vendoring the readers |
We still have dependency issues with kerchunk not supporting zarr-python v3, but icechunk requiring it (see #321). It unfortunately doesn't look like these will be resolved in kerchunk any time soon.
Right now we only actually use kerchunk code directly within VirtualiZarr for:
HDF
reader for)
We really want to be able to pin zarr-python>=3.0.0, for so many reasons. But right now doing so would break those readers in VirtualiZarr because of the kerchunk incompatibility.
I realised yesterday that the most expedient thing to do here might be to just vendor (i.e. copy-paste) the code for the FITS and netCDF3 readers into VirtualiZarr.
The implementations for these are:
That would allow us to go full steam ahead with all the other things we need to do to be able to pin zarr-python>=3.0.0 (e.g. #374, #182, #175).
FYI @bamford - I see you're the last person that committed to the FITS reader (fsspec/kerchunk#531), so you will want to be aware of us potentially forking it!
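For illustration only, here is a rough sketch of how a vendored reader could coexist with the optional kerchunk dependency, so that un-vendoring later stays cheap. The vendored module path (virtualizarr.vendor.kerchunk.fits) and the helper name are hypothetical and not taken from either codebase. It also assumes that an unusable kerchunk shows up as an ImportError at import time, which may not be how the zarr-python v3 incompatibility actually manifests.

```python
import importlib
from types import ModuleType


def load_reader_backend(
    upstream: str = "kerchunk.fits",
    # Hypothetical location of a copy-pasted reader; not a real VirtualiZarr module.
    vendored: str = "virtualizarr.vendor.kerchunk.fits",
) -> tuple[ModuleType, str]:
    """Import the upstream reader module if possible, else the vendored copy.

    Returns the imported module together with a label saying which one was used.
    """
    try:
        return importlib.import_module(upstream), "upstream"
    except ImportError:
        # Covers ModuleNotFoundError too (it is a subclass of ImportError).
        return importlib.import_module(vendored), "vendored"
```

With something like this, dropping the vendored copy later would amount to deleting the fallback branch and the copied module, leaving the optional dependency as the only code path.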