change: is xarray the right data back-end? #1

Open
jmp75 opened this issue Aug 26, 2024 · 0 comments

Member

jmp75 commented Aug 26, 2024

Is your change request related to a problem? Please describe.

It is relatively painless to read the EFTS format from netCDF and wrangle it into an xarray dataset that is pleasant to use. However, creating a new xarray dataset proves rather awkward, at least (but not only) when following the typical creation workflow used with the netCDF4 bindings in R and MATLAB.

All values of the data dimensions need to be known upfront and created before the dataset. This is not entirely a problem, but it is a straitjacket.
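A minimal sketch of that constraint, assuming a two-dimensional layout with hypothetical names (`rain_obs`, `station_a`, and so on are illustrative, not the EFTS schema). Every coordinate value is built first, and the data array is sized from them before the dataset exists:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical dimension values; all of them must exist before the Dataset is built.
times = pd.date_range("2020-01-01", periods=4, freq="D")
stations = ["station_a", "station_b", "station_c"]

# Pre-allocate the data with NaN, sized from the coordinates above.
rain = np.full((len(times), len(stations)), np.nan)

ds = xr.Dataset(
    data_vars={"rain_obs": (("time", "station"), rain)},
    coords={"time": times, "station": stations},
)

# Once the container exists, values can be filled in by label.
ds["rain_obs"].loc[{"station": "station_b"}] = 1.5
```

Filling values afterwards is straightforward; it is the dimension values themselves that cannot be deferred in this workflow.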

To write coordinate values after data creation you cannot set the numpy values directly; you have to use assign_coords, which creates a whole new dataset in memory, as far as I can see. This is already potentially a problem when loading from disk and then overriding the station name dimension and the date/time indices, but at least in that case the data could be lazily loaded. When doing so on in-memory datasets, I foresee a fairly large penalty, depending on how it is done.
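A small sketch of how this could be checked, again with hypothetical variable and station names. It shows that `assign_coords` returns a new `Dataset` object and leaves the original untouched, and it uses `np.shares_memory` to test whether the data buffer is actually duplicated or only referenced. The outcome of that last check is worth confirming on the xarray version in use before drawing conclusions about the memory penalty:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    data_vars={"rain_obs": (("time", "station"), np.zeros((4, 3)))},
    coords={"time": np.arange(4), "station": ["s1", "s2", "s3"]},
)

# assign_coords does not modify ds in place; it returns a new Dataset object.
renamed = ds.assign_coords(station=["a", "b", "c"])

is_new_object = renamed is not ds
original_labels = ds["station"].values.tolist()
new_labels = renamed["station"].values.tolist()

# Check whether both datasets point at the same underlying data buffer.
shares_buffer = np.shares_memory(ds["rain_obs"].values, renamed["rain_obs"].values)
print(is_new_object, original_labels, new_labels, shares_buffer)
```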

The big question mark is whether xarray is a good idea for cases down the track when we need to write to disk while working with large datasets. This may or may not need to be in scope these days, given RAM availability. Still, science has a way of filling available memory.

Describe the solution you'd like / Describe alternatives you've considered

Possibly use the netCDF4 bindings directly to create a new file on disk, then load it into xarray.
