OME Zarr chunking #572
ome.zarr itself does not have a default chunking, so this probably depends on the library that writes this. Did you use the java/mobie one or python for this dataset? What are the chunks? |
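For illustration: with zarr-python the chunk shape is an explicit choice at creation time, and a Java writer such as n5-zarr/mobie-io makes its own (possibly different) choice. A minimal sketch (the file name and shapes here are made up):

```python
import numpy as np
import zarr

data = np.random.randint(0, 255, size=(64, 512, 512), dtype="uint8")

# the chunk shape is picked by the writer, not mandated by the spec
arr = zarr.open(
    "example.ome.zarr/s0",
    mode="w",
    shape=data.shape,
    chunks=(32, 128, 128),
    dtype=data.dtype,
)
arr[:] = data
print(arr.chunks)  # -> (32, 128, 128)
```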
@tischi Do you have this dataset somewhere I could access? So I can figure out what the chunks etc are? |
running... |
@K-Meech @constantinpape @KateMoreva I am pretty sure now that the issue is that, for the lower resolutions, the data at the image borders is corrupt. Maybe some issue with the gzip reader or writer? Example data is here: |
Strange. I'll take a look. |
@tischi - if I recall correctly, the ome-zarr writing stuff (from within fiji) is using the same libraries as the n5 writing. Could you try writing the same dataset as n5? Does it have the same problem? |
Actually, if you just put the raw data in the same folder, I can play around with it myself :) |
should be there |
Ok - so I looked into this a bit more. It doesn't happen with n5, so it's a specific problem with the ome-zarr writing. I suspect it's something going wrong with how chunks are padded at the edges of the dataset in the lower resolution levels - probably around here: https://github.com/mobie/mobie-io/blob/main/src/main/java/org/embl/mobie/io/ome/zarr/writers/N5OMEZarrWriter.java#L227. I'll keep looking. |
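For reference, the arithmetic a writer has to get right for border chunks, as a small Python sketch (illustrative only, not the mobie-io code):

```python
# Interior chunks are full; chunks on the upper border of the dataset
# only cover the remainder and must be padded (e.g. with the fill value)
# up to the full chunk shape when written.
def valid_chunk_extent(shape, chunk_shape, chunk_id):
    """Return the number of valid voxels per axis for a given chunk."""
    return tuple(
        min(c, s - i * c)  # remainder on the border, full chunk inside
        for s, c, i in zip(shape, chunk_shape, chunk_id)
    )

# A 100x100 dataset with 64x64 chunks: the last chunk per axis holds
# only 36 valid rows/columns, the rest is padding.
print(valid_chunk_extent((100, 100), (64, 64), (0, 0)))  # (64, 64)
print(valid_chunk_extent((100, 100), (64, 64), (1, 1)))  # (36, 36)
```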
@constantinpape Is there an easy way to open v0.3 ome-zarr in python to inspect individual chunks? e.g. is it supported by z5py? |
Yes, there is some functionality to access chunks directly: https://github.com/constantinpape/z5/blob/master/src/python/module/z5py/dataset.py#L477-L529. As an example, you could use it like this to check whether all the chunks in a scale level of an ome.zarr file exist:

```python
import z5py

with z5py.File("my-file.ome.zarr", "r") as f:
    ds = f["s0"]  # the name of scale level zero
    # assuming a 2d dataset here, extension to 3d is trivial
    for i in range(ds.chunks_per_dimension[0]):
        for j in range(ds.chunks_per_dimension[1]):
            chunk_id = (i, j)
            print("Have chunk", chunk_id, ":", ds.chunks_exists(chunk_id))
```

Hope this helps / let me know if you run into any issues. |
I think that some chunks at the image boundary are corrupt (probably for all resolutions, but one sees it best at the low resolutions because there are fewer chunks and the effect in the rendering is more evident). |
Thanks for the links @constantinpape! I'm having a few issues though. There are some slight differences in how Java writes the metadata vs. Python, which are causing problems. E.g. in each dataset's .zarray file, Java puts these lines:
These cause errors when trying to open the dataset in z5py. Deleting the "filters" line, and changing the fill value to 0 (rather than "0"), fixes this. Should I change how this is written from the Java code? Or could you make z5py accept these options too? After fixing this, I can run the code you put above. But it returns False for every chunk, which seems unlikely! So perhaps there are some other metadata differences... |
@K-Meech what exactly are you using for writing the data? Is it based on https://github.com/saalfeldlab/n5-zarr or on something else?
The fill value
Ok, I know why. This is due to some recent change with the dimension separator that I don't support yet. |
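For context, `dimension_separator` is an optional field in the zarr v2 array metadata that switches the chunk keys from flat ("0.0.0") to nested ("0/0/0"). A sketch of writing nested chunks with zarr-python, assuming a version recent enough to support the `dimension_separator` argument:

```python
import zarr

# nested chunk keys ("0/0/0") instead of the classic flat ones ("0.0.0");
# readers that predate the dimension_separator field will not find
# chunks written this way
arr = zarr.open(
    "sep-example.zarr",
    mode="w",
    shape=(64, 64),
    chunks=(32, 32),
    dtype="uint8",
    dimension_separator="/",
)
arr[:] = 1
```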
@constantinpape yes - it's based on https://github.com/saalfeldlab/n5-zarr, with very slight differences. |
@K-Meech something with the filepath is not right. |
@K-Meech I have updated z5py so that it can deal with The string |
Thanks @constantinpape! I'll try again. |
Ok - now I get a new error:
I copied the file into my folder: /g/schwab/Kimberly/temp/SXAA03648.ome.zarr |
Ok, I can access it; will check it out later. |
I have already saved quite a few big files with this bug. |
You can just fix it in the metadata files, so you don't need to re-write the voxel data. You would need to adapt each '.zarray' file (inside each dataset) and change the fill_value from "0" to 0. The filters line is fine - you can leave that as is. |
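Since every resolution level carries its own `.zarray`, the fix above is easy to script. A minimal Python sketch (the path is a placeholder; back the files up first):

```python
import json
import pathlib

root = pathlib.Path("SXAA03648.ome.zarr")  # placeholder path

for zarray in root.rglob(".zarray"):
    meta = json.loads(zarray.read_text())
    if meta.get("fill_value") == "0":
        meta["fill_value"] = 0  # string -> number, as described above
        zarray.write_text(json.dumps(meta, indent=2))
        print("patched", zarray)
```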
Ok, there are quite a few of them because of all the resolution levels, but I will figure out some linux |
Took me some time but that did it 😓 |
But keep in mind that this is not only a metadata issue. I also think that there's an issue with the border chunks. |
@K-Meech I can reproduce the error you see. I am investigating it now, gonna ping you when I know more. |
Ok, the dataset can't be opened in z5py yet. @K-Meech, in the meantime you can just use the zarr python library to read the data. It does not have a convenience function to read individual chunks, but you can just view the data in napari (see the code snippet below). I did this for your data and I can't find any issues. So maybe the issue with the boundary chunks is not in writing but in reading them?

```python
import zarr
import napari

# zarr groups are not context managers, so open the file directly
f = zarr.open("./SXAA03648.ome.zarr", mode="r")
data = f["s0"][:]

v = napari.Viewer()
v.add_image(data)
napari.run()
```
|
Thanks @constantinpape! So - turns out I see the same issues opening with Python and the zarr library. Datasets s0 and s1 look fine - but s2 and s3 show issues at the edges. E.g. for dataset s3 in napari, you see weird bands at the right side. |
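One way to localize the corruption is to recompute a level from the one above it and look at where the two disagree. A rough sketch, assuming a 3d dataset and 2x mean downsampling between levels (which may not match the actual pyramid), and requiring scikit-image:

```python
import numpy as np
import zarr
from skimage.measure import block_reduce

f = zarr.open("./SXAA03648.ome.zarr", mode="r")
s2, s3 = f["s2"][:], f["s3"][:]

# crop s2 to an even shape, then downsample by 2x mean (assumed factor)
s2 = s2[tuple(slice(0, (n // 2) * 2) for n in s2.shape)]
expected = block_reduce(s2, block_size=(2, 2, 2), func=np.mean)

# compare on the common region; corrupt border chunks should show up
# as large differences clustered at the dataset edges
common = tuple(slice(0, min(a, b)) for a, b in zip(expected.shape, s3.shape))
diff = np.abs(expected[common] - s3[common].astype("float64"))
print("max diff:", diff.max(), "at", np.unravel_index(np.argmax(diff), diff.shape))
```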
I see @K-Meech. Then it looks like an issue with writing the data that only occurs for the higher scales. |
I'll look into this some more, but I imagine it's an issue coming from upstream in the n5-zarr library. I didn't change anything in the downsampling etc code for the version in mobie-io. |
Yeah, I also have the feeling that we can't fully trust n5-zarr in writing the data yet. It should be added to zarr-implementations to ensure that it really conforms to the zarr standard: zarr-developers/zarr_implementations#54. |
Alright - I think I've got it now! This was actually a problem with the modifications I made to code from BigDataViewer for writing the different scale levels. Here: https://github.com/mobie/mobie-io/blob/develop/src/main/java/org/embl/mobie/io/n5/util/ExportScalePyramid.java#L152 there's a 'loopBack' where previously written levels are accessed (but only when writing very downsampled levels!). I hadn't updated the reading code here so this was misbehaving. I'll check this tomorrow, but I think it should be an easy fix. |
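Roughly, the loop-back pattern described above, sketched in Python (not the actual BigDataViewer/mobie-io logic, just its shape): most levels are computed from the previous in-memory result, but once the data is small enough the source is re-read from a previously written level, so a stale reader corrupts only the strongly downsampled levels:

```python
def write_pyramid(full_res, n_levels, write_level, read_level_back, loop_back_from=2):
    """Illustrative sketch of a scale pyramid with a 'loop back'.

    write_level(level, array) persists a level; read_level_back(level)
    re-reads it from disk. If read_level_back mishandles border chunks,
    only the levels computed from re-read data inherit the corruption,
    while the first levels stay fine.
    """
    current = full_res
    for level in range(n_levels):
        # crude 2x nearest-neighbor downsampling, for illustration only
        current = current[tuple(slice(None, None, 2) for _ in range(current.ndim))]
        write_level(level, current)
        if level >= loop_back_from:
            # loop back: use the data just written as the next source
            current = read_level_back(level)
```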
This is fixed now. |
@K-Meech @constantinpape
I have a feeling that the default chunking for OME.Zarr is not ideal.
It takes a long time to load with intermediates like this:
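To check what chunk shapes a given file actually ended up with, they can be read straight from the array metadata; a small zarr-python sketch (the path is a placeholder):

```python
import zarr

f = zarr.open("SXAA03648.ome.zarr", mode="r")
for name, ds in f.arrays():
    print(name, "shape:", ds.shape, "chunks:", ds.chunks)
```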