Appending ndpyramid data along a time dimension #56
Comments
I second this motion. The way it handles the data now takes away the advantage of Zarr for the larger-than-RAM datasets that are so common on the processing end in climate and weather. I just started poking at this, but perhaps this could be as simple as regenerating the metadata after every write?
Any updates on this functionality?
Hey there @keltonhalbert, thanks for opening the issue. It's possible something is going wrong in the generation phase or maybe in the maps library. When you are appending to the existing Zarr store, are you reconsolidating the metadata? I think a good start would be to create an MRE that builds both a standard pyramid and a pyramid created from appending, and see if there are any obvious differences.
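One way to check the "are you reconsolidating the metadata?" question is to compare the consolidated metadata with the per-array metadata files on disk. The following is a minimal sketch, assuming a local directory store in the Zarr v2 layout (a root `.zmetadata` file whose `"metadata"` key maps paths such as `0/air/.zarray` to their JSON contents). The function name `stale_consolidated_entries` is hypothetical and not part of zarr or ndpyramid.

```python
import json
from pathlib import Path


def stale_consolidated_entries(store):
    """Return consolidated entries that differ from the files on disk.

    Assumes a local Zarr v2 directory store with a root `.zmetadata` file.
    Hypothetical helper for debugging; not part of zarr or ndpyramid.
    """
    root = Path(store)
    consolidated = json.loads((root / ".zmetadata").read_text())["metadata"]

    stale = {}
    for key, cached in consolidated.items():
        # Each key is a path relative to the store root, e.g. "0/air/.zarray".
        path = root / key
        on_disk = json.loads(path.read_text()) if path.exists() else None
        if on_disk != cached:
            stale[key] = {"consolidated": cached, "on_disk": on_disk}
    return stale
```

An empty result means the consolidated copy matches the on-disk metadata. A differing `shape` under a `.zarray` key after an append would suggest the store was not reconsolidated.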
I think @keltonhalbert has a scenario similar to mine, which would be something like this:
import xarray as xr
import rioxarray # here so that we can use rioxarray functions
from ndpyramid import pyramid_reproject
# get the sample dataset, let's pretend this is a timestamp of a 1 hour dataset such as MRMS
ds = xr.tutorial.open_dataset('air_temperature').isel(time=slice(1))
ds = ds.rio.write_crs("EPSG:4326")
dt = pyramid_reproject(ds, levels=2, clear_attrs=False, pixels_per_tile=64)
dt.to_zarr("fake_mrms/air_temperature_t1.zarr", mode="w")
# one hour later, we get a new timestamp and want to append to the existing store
ds = xr.tutorial.open_dataset('air_temperature').isel(time=slice(2))
ds = ds.rio.write_crs("EPSG:4326")
dt = pyramid_reproject(ds, levels=2, clear_attrs=False, pixels_per_tile=64)
# what do I do now to append to the existing along the time dimension?
# this works, but reading the data afterwards is not that trivial
dt.to_zarr("fake_mrms/air_temperature_t2.zarr", mode="w") # ???
Hello,
I'm interested in using ndpyramid and the CarbonPlan Maps visualization platform for displaying data, with the unique quirk that I would like to be able to append to an existing Zarr store as new data becomes available. I am working with 2D images that vary in time, with all other grid attributes effectively static. Perhaps this is the wrong repository for this question, since this may be a quirk of the mapping framework or the Zarr JavaScript library, but I figured it would be worth a try.
If I read all time steps into memory, tile, and then write, the data plays nicely with the CarbonPlan Maps viewer. This is effectively following the "3d, one variable, multiple time points" demo. However, when I try to tile one time-step at a time and append to an existing Zarr store, something about the data structure or metadata structure does not behave with the map viewer.
Reading a dataset's entire temporal range into memory before tiling is very memory-inefficient, especially for high-fidelity datasets or datasets covering a long time range. Is there a better or preferred way to achieve temporal appends to ndpyramid stores?