hdf5 chunk defaults and handling #781

Open
bnlawrence opened this issue Jun 5, 2024 · 0 comments
Labels
API review (4.0.0), enhancement (New feature or request)

Comments

@bnlawrence

The current behaviour when reading a chunked file is somewhat surprising (to me). If one reads this variable:

float UM_m01s02i205_vn1106(time, latitude, longitude) ;
		// uninteresting attributes omitted for this issue
		UM_m01s02i205_vn1106:_Storage = "chunked" ;
		UM_m01s02i205_vn1106:_ChunkSizes = 1, 1920, 2560 ;

I see the following unexpected result:

In [30]: g = cf.read('double-chunking-testc.nc')[0]
In [31]: g.data.nc_hdf5_chunksizes()
Out[31]: ()

This is not a bug, insofar as it is the expected behaviour of the code: by construction, cf-python currently does not remember HDF5 chunk sizes from the read.

Should it? If so, it could be done, possibly with caveats about when that is a sensible thing to do. It may well need to forget them when certain operations are applied (e.g. when aggregating files with different HDF5 chunks, when subspacing, or when adding, removing or transposing dimensions).
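
To make the "remember, but forget when unsafe" idea concrete, here is a minimal sketch of one possible policy. It is not cf-python code: `TrackedArray`, `hdf5_chunksizes` and `concatenate` are hypothetical names invented for illustration, and forgetting on every transpose or subspace is just one cautious choice among several.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TrackedArray:
    """Hypothetical stand-in for an array that remembers HDF5 chunk sizes.

    Not cf-python's Data class: the names here are illustrative only.
    """

    shape: Tuple[int, ...]
    hdf5_chunksizes: Optional[Tuple[int, ...]] = None  # None means "forgotten"

    def transpose(self, axes: Tuple[int, ...]) -> "TrackedArray":
        # Cautious choice: forget the chunk sizes rather than permute them.
        return TrackedArray(tuple(self.shape[i] for i in axes), None)

    def subspace(self, new_shape: Tuple[int, ...]) -> "TrackedArray":
        # A smaller array may no longer suit the original chunking.
        if tuple(new_shape) == self.shape:
            return self
        return TrackedArray(tuple(new_shape), None)

    def insert_dimension(self, position: int) -> "TrackedArray":
        # The stored chunk sizes no longer match the number of dimensions.
        shape = list(self.shape)
        shape.insert(position, 1)
        return TrackedArray(tuple(shape), None)


def concatenate(a: TrackedArray, b: TrackedArray, axis: int = 0) -> TrackedArray:
    """Join two arrays along `axis`, keeping chunk sizes only if they agree."""
    shape = list(a.shape)
    shape[axis] += b.shape[axis]
    chunks = a.hdf5_chunksizes if a.hdf5_chunksizes == b.hdf5_chunksizes else None
    return TrackedArray(tuple(shape), chunks)


# Chunk sizes from the variable above; the time length of 12 is made up.
a = TrackedArray((12, 1920, 2560), (1, 1920, 2560))
b = TrackedArray((12, 1920, 2560), (1, 1920, 2560))
c = TrackedArray((12, 1920, 2560), (12, 192, 256))

print(concatenate(a, b).hdf5_chunksizes)  # same chunking: kept
print(concatenate(a, c).hdf5_chunksizes)  # different chunking: forgotten
print(a.transpose((2, 1, 0)).hdf5_chunksizes)  # transposed: forgotten
```

A less conservative policy could permute the chunk sizes on transpose, or clip them to the new shape on subspacing. Which of those is sensible is exactly the open question here.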

Another V4.0 issue!

bnlawrence added the enhancement and API review (4.0.0) labels on Jun 5, 2024