
Issue with bigtiff exported image stacks #72

Closed
unidesigner opened this issue Nov 21, 2019 · 12 comments

Comments

@unidesigner

If I export large 2D TIFF images with the bigtiff option set to True using imageio in Python, I get FormatError("TIFF signature invalid.") in n5gest when importing. Is this an issue with the upstream TIFF library, or should bigtiff be supported and it's something else?
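For context on the error: classic TIFF and BigTIFF share the same byte-order mark but differ in the version field that follows it (42 for classic TIFF, 43 for BigTIFF, which then switches to 8-byte offsets). A reader that only accepts version 42 will reject a BigTIFF file as having an invalid signature. A minimal sketch of the distinction, using only the standard library (the function name is illustrative):

```python
import struct

def tiff_flavor(header: bytes) -> str:
    """Classify a file by its first 4 TIFF header bytes.

    Classic TIFF puts version 42 after the byte-order mark;
    BigTIFF uses version 43 (and 8-byte offsets thereafter).
    """
    if len(header) < 4:
        return "not a TIFF"
    byte_order = header[:2]
    if byte_order == b"II":      # little-endian
        fmt = "<H"
    elif byte_order == b"MM":    # big-endian
        fmt = ">H"
    else:
        return "not a TIFF"
    (version,) = struct.unpack(fmt, header[2:4])
    if version == 42:
        return "classic TIFF"
    if version == 43:
        return "BigTIFF"
    return "not a TIFF"
```

Running this on the first four bytes of the exported file (e.g. `tiff_flavor(open("out.tif", "rb").read(4))`) should show whether imageio actually wrote a BigTIFF, which a classic-TIFF-only parser would then reject.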

@aschampion
Owner

The upstream TIFF library does not yet support bigtiff (image-rs/image-tiff#3). TIFF is a minefield of a million formats masquerading as one.

Since PNG doesn't have the 32-bit offset/4GB issue, have you tried importing that way, since it should support images up to 2^31 × 2^31? I know there are a few issues with PNG right now (decoding can get scrambled, no 16-bpc support), but I have PRs for all of these upstream and will make a new n5gest release when they land.
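The 2^31 × 2^31 figure comes from the PNG format itself: width and height are stored as 4-byte big-endian unsigned integers in the IHDR chunk, and the spec caps each at 2^31 - 1. A small sketch reading them back, assuming the IHDR chunk comes first as the spec requires:

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_dimensions(data: bytes) -> tuple[int, int]:
    """Read (width, height) from a PNG's IHDR chunk.

    Layout: 8-byte signature, then the IHDR chunk:
    4-byte length, b"IHDR", 4-byte width, 4-byte height, ...
    Each dimension is big-endian and capped at 2**31 - 1 by the spec.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG")
    if data[12:16] != b"IHDR":
        raise ValueError("IHDR chunk not first")
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

So even images far wider than the TIFF 4GB offset limit allows have unambiguous dimensions in PNG, as long as the decoder handles them.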

@unidesigner
Author

Yes, I can use PNGs for now and this works, thanks for the hint.

Off-topic question: Is there a way to skip/discard empty (all 0) 2D regions/3D blocks when writing? Would this be difficult to add as an option? This feature would be quite helpful when dealing with data that has a lot of black area around the actual content.

@aschampion
Owner

It wouldn't be difficult to add. It's a bit ambiguous with N5 because there's no spec-blessed fill value with which to detect "empty" regions. Since we're currently planning to switch to the Zarr interface anyway and expose that over N5 datasets, I could go ahead and add it as a special case, since it should be better supported later.
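The elision idea discussed here can be sketched in a few lines. This is a hypothetical illustration, not n5gest's implementation: `store` stands in for any chunk store (key → block), and `fill_value` is the user-supplied value a missing block is assumed to represent:

```python
def write_block(store: dict, key: str, block: list, fill_value=0) -> bool:
    """Write a block only if it contains data other than fill_value.

    Returns True if the block was written, False if it was elided.
    With numpy arrays the emptiness check would be
    `np.all(block == fill_value)` instead of the pure-Python scan.
    """
    if all(v == fill_value for v in block):
        return False  # elide: readers treat a missing block as all fill
    store[key] = block
    return True
```

The ambiguity mentioned above is exactly the `fill_value` parameter: N5 has no canonical fill value, so the caller must declare what "empty" means.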

@unidesigner
Author

It would be useful to me to have this as an option.

Do you plan to change the interface of pyn5 to be more zarr-like, or supporting N5 datasets in zarr? Where may I follow these developments? Thanks.

@aschampion
Owner

Do you plan to change the interface of pyn5 to be more zarr-like, or supporting N5 datasets in zarr?

Changing pyn5 to be more zarr-like seems likely because pyn5 is mostly concerned with presenting a numpy-like interface, for which the block-header model of n5 is already superfluous. But it's a decision more likely to be made by @clbarnes or @pattonw before I get that deep in the stack. I'd like pyn5 (and the rust library behind it) to be as seamless as cloud-volume but with zarr, N5, neuroglancer chunk (volume/compressed seg), and KLB (read only) backends, and that's where my roadmap goes.

Where may I follow these developments?

The planning has been private to our lab so far, but not out of any secrecy. Things will probably start moving when I find weekend time to prototype the zarr implementation (which I project at 1.5 days), which means whenever I get done with my current weekend projects improving 16-bit image support in Rust. I'll ping you when the zarr repository goes up.

@clbarnes
Contributor

clbarnes commented Nov 23, 2019

Certainly my contributions to pyn5 have had the goal of API compatibility with h5py. In encouraging the adoption of block formats (and switching over my own stuff), having it as a drop-in replacement was valuable. Z5py has a similar design goal, although it's a little further from API parity. Zarr-python has no such goal, which of course gives it much greater flexibility and extensibility.

Pyn5 is basically one short Rust file which is pretty specific to N5 (i.e. most of it would change anyway if we were to add different backends), and a bunch of Python code to make it behave like h5py. If we were going to change both of those things, we'd practically be writing a new library, so it might not be a bad idea to actually write a new library and let pyn5 stick to its current small job for N5.

@clbarnes
Contributor

@unidesigner zarr-python has (experimental) N5 support; it's a third Python implementation...

I'd need to read a bit more of zarr-python's codebase to see if this is feasible but if we were to pursue an API like that library, it would be nice if it were possible to make drop-in classes which could actually be used with the existing zarr-python. The way its stores work would allow this, although I'm not sure the data IO code would allow easy delegation to an underlying library.
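On why drop-in store classes are plausible: zarr-python's store abstraction is essentially a MutableMapping from string keys to bytes, so any class implementing that interface can be handed to it as a store. A minimal dict-backed sketch (the class name is illustrative; a real adapter would delegate to an N5 container instead of a dict):

```python
from collections.abc import MutableMapping

class MemoryStore(MutableMapping):
    """Minimal zarr-style store: a MutableMapping of key -> bytes.

    zarr-python accepts any MutableMapping as a store, so a class
    shaped like this could, in principle, route reads and writes
    to an underlying N5 library instead of an in-memory dict.
    """

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = bytes(value)

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)
```

The open question raised above remains: the store interface is easy to satisfy, but the chunk encoding/decoding code paths may be harder to delegate.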

@aschampion
Owner

This is implemented (suboptimally) in the dev branch now, so one can do

cargo run -- import my.n5 dataset test3 '{"type": "gzip"}' 64,64,64 data/*.tif --elide_fill_value 0

and 0 blocks will be omitted. Open to suggestions for a better name.

@unidesigner
Author

Thanks, @aschampion - this will be very useful to me.

@unidesigner
Author

@aschampion - I just noticed that the n5-spark tools also do not seem to support bigtiff images as input for the conversion to n5 (see issue). I wonder how they are converting large XY images to n5...

@aschampion
Owner

I'm adding BigTIFF support to the image-rs/image-tiff library, but it's a very low priority thing for me, so it will probably take a while: I have a few other PRs to land first, then have to wait for releases to trickle down to the image crate.

If this is something that's blocking you let me know and we can talk about how to get a patched version running sooner.

@unidesigner
Author

No worries, this is not urgent/blocking at all. I'm currently using the variant with compressed PNGs, which works for my purposes, just maybe a bit slow in the compression/decompression step.
