-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Manifest arrays use arrayv3metadata #429
base: zarr-python-3.0
Are you sure you want to change the base?
Changes from 16 commits
2a01bfa
17fd547
e5666ab
5a8cc4c
4c0b616
ac2f787
5503c60
1272051
1f36755
7098803
ce2284c
c9853d5
209dae3
e7205ef
d65e457
190c20f
47f5ddd
5d15608
908bc52
4a8bfdd
d05cec3
ff23eeb
8560f2d
97d0a71
669ce52
b794dab
825142d
6684125
5e82de4
f57b48d
b811959
8c5139b
eb2a86c
15ac7a7
5359762
c808351
bd50167
ed97704
a3c190e
95886b9
a0f72b2
aad511f
b357b04
08e877a
f040459
495d660
a262f0b
bcd68a0
f0ce778
0518488
0712979
c40915d
cdaca53
2415e07
9366d69
d590cfc
6394207
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,7 +12,6 @@ | |
from virtualizarr.readers.common import VirtualBackend | ||
from virtualizarr.types import ChunkKey | ||
from virtualizarr.utils import _FsspecFSFromFilepath, check_for_collisions | ||
from virtualizarr.zarr import ZArray | ||
|
||
|
||
class DMRPPVirtualBackend(VirtualBackend): | ||
|
@@ -378,6 +377,10 @@ def _parse_variable(self, var_tag: ET.Element) -> Variable: | |
------- | ||
xr.Variable | ||
""" | ||
from zarr.core.metadata.v3 import ArrayV3Metadata | ||
|
||
from virtualizarr.zarr import convert_to_codec_pipeline | ||
|
||
# Dimension info | ||
dims: dict[str, int] = {} | ||
dimension_tags = self._find_dimension_tags(var_tag) | ||
|
@@ -414,16 +417,27 @@ def _parse_variable(self, var_tag: ET.Element) -> Variable: | |
# Fill value is placed in zarr array's fill_value and variable encoding and removed from attributes | ||
encoding = {k: attrs.get(k) for k in self._ENCODING_KEYS if k in attrs} | ||
fill_value = attrs.pop("_FillValue", None) | ||
# create ManifestArray and ZArray | ||
zarray = ZArray( | ||
chunks=chunks_shape, | ||
dtype=dtype, | ||
fill_value=fill_value, | ||
filters=filters, | ||
order="C", | ||
# create ManifestArray | ||
metadata = ArrayV3Metadata( | ||
Comment on lines
+419
to
+420
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This call seems to have quite a lot of boilerplate that authors of virtualizarr readers will never need, so as it's generally useful perhaps we should provide our own constructor function in |
||
shape=shape, | ||
data_type=dtype, | ||
chunk_grid={ | ||
"name": "regular", | ||
"configuration": {"chunk_shape": chunks_shape}, | ||
}, | ||
chunk_key_encoding={"name": "default"}, | ||
fill_value=fill_value, | ||
codecs=convert_to_codec_pipeline( | ||
compressors=filters, | ||
dtype=dtype, | ||
filters=None, | ||
serializer="auto", | ||
), | ||
attributes=attrs, | ||
dimension_names=None, | ||
Comment on lines
+433
to
+434
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we want to specify attrs here? I feel like we want neither |
||
storage_transformers=None, | ||
) | ||
marr = ManifestArray(zarray=zarray, chunkmanifest=chunkmanifest) | ||
marr = ManifestArray(metadata=metadata, chunkmanifest=chunkmanifest) | ||
return Variable(dims=dims.keys(), data=marr, attrs=attrs, encoding=encoding) | ||
|
||
def _parse_attribute(self, attr_tag: ET.Element) -> dict[str, Any]: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we reimplement this fixture to internally just call the
array_v3_metadata_dict
fixture below?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did this in 0712979 - I do like just having one method to create array v3 metadata for tests however it does mean tests have to be a bit more verbose as every codecs argument must include an
ArrayBytesCodec
(which is always{"name": "bytes", "configuration": {"endian": "little"}}
).But I'll think on ways to streamline this more...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have a look at the signature of this function, which has a lot of sane defaults, and which works for v2 and v3 metadata: https://github.com/zarr-developers/zarr-python/blob/99621ecf0b81400e323828111363fe21cf0c7592/src/zarr/core/array.py#L4008-L4030. I think we could consider adding an
ArrayV3Metadata.build
method that has a signature like this, which should make creating metadata documents a lot easier.