Replies: 3 comments 5 replies
-
I've managed to reduce the leak by setting …
-
Hi @mdespriee and welcome! That sounds like a frustrating problem. We'd love to help you debug. What you're doing should be possible and stable, but there are several things that can go wrong and cause the sort of behavior you're seeing. The most important factor is the chunking of the array and the shape of the updates. Can you share more details about the configuration of the underlying Zarr groups and arrays (shape, dtype, and chunk shape), and also the shape of the data you are appending? Some code examples for how you are creating the initial dataset and how you are doing the append would be helpful as well.
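For example, something along these lines (the bucket/path is hypothetical) would print the shape, dtype, and chunk shape of every array in the group:

```python
# Hypothetical example: report shape, dtype, and chunk shape for each array
# in the target group (the S3 path below is made up).
import fsspec
import zarr

store = fsspec.get_mapper("s3://your-bucket/your-dataset.zarr")
group = zarr.open_group(store, mode="r")

for name, arr in group.arrays():
    print(name, arr.shape, arr.dtype, arr.chunks)
```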
-
I'm probably having other problems in that app, in addition to the OOM. The relaunch of some failed processing might have led to concurrent writes, explaining the strange behavior. Thanks @rabernat for taking the time.
-
Hello,
I'm exploring the use of xarray + zarr to store raster data that arrives every 5 minutes.
I'm storing that data by continuously appending to a Zarr store in S3.
I have a processing app that receives each file, reads the raster, prepares an xr.Dataset from it, and appends it to the Zarr store in S3.
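The append workflow looks roughly like the sketch below (the S3 path, variable names, and data layout are illustrative only, not the actual code):

```python
# Rough, illustrative sketch of the append workflow (paths and names made up):
# each incoming raster becomes a one-timestep Dataset appended along "time".
import fsspec
import numpy as np
import pandas as pd
import xarray as xr

store = fsspec.get_mapper("s3://my-bucket/precip.zarr")  # hypothetical location

def append_raster(raster: np.ndarray, timestamp: pd.Timestamp) -> None:
    ds = xr.Dataset(
        {"precipitation_type": (("time", "y", "x"), raster[np.newaxis, :, :])},
        coords={"time": [timestamp]},
    )
    # The very first write would use mode="w" without append_dim;
    # every later file is appended along the time dimension.
    ds.to_zarr(store, mode="a", append_dim="time")
```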
I'm facing two problems here:

The first is a memory leak. The memory usage of this processing app keeps growing until the system sends a SIGKILL (`Worker (pid:96) was sent SIGKILL! Perhaps out of memory?`), and then it starts over. So far I don't have any clue about which object is holding all that data. Is it the Zarr store object? Is it at the fsspec/s3fs level? How can I pinpoint the cause and work around it? Note that I tried recycling the top-level objects (Zarr store, etc.) but with no success; some things are kept in a cache behind the scenes.

The second is a robustness/resilience problem. One of the OOM crashes occurred during a write to S3, and now the Zarr store is corrupted:

`... in merge_data_and_coords ... ValueError: conflicting sizes for dimension 'time': length 69 on 'time' and length 68 on {'time': 'precipitation_type', 'y': 'precipitation_type', 'x': 'precipitation_type'}`

Is there a way to recover / repair the dataset? Even if I fix the OOM above, a crash during a write will happen again. Can I force xarray to ignore that kind of error (e.g. force NaN on missing data) and allow future writes?

Thanks for your insights, folks!
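To illustrate what a diagnosis might look like, here is only a sketch with a hypothetical path; the commented-out resize at the end is an untested idea, not a confirmed fix:

```python
# Sketch only: inspect the inconsistent store directly with zarr to see which
# arrays disagree along the time axis (69 vs. 68 entries in the error above).
import fsspec
import zarr

store = fsspec.get_mapper("s3://my-bucket/precip.zarr")  # hypothetical location
group = zarr.open_group(store, mode="r+")

for name, arr in group.arrays():
    print(name, arr.shape)

# One *possible* repair (untested; back up the store first) would be to
# resize the array that got one step ahead back to the shared length, e.g.:
# group["time"].resize((68,))
```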