Overwriting files that were lazily loaded results in permission error… #241
base: main
Conversation
…s. Loading the data first solves that problem.
Somehow, the tests that I implemented to guarantee this would work (which it perfectly did locally) fail online with the same permission errors as before. However, since everything works on my side, it's very hard to debug. @DirkEilander or @LuukBlom, any suggestions are welcome.
try:
    ds.to_netcdf(nc_copy)
except PermissionError:
    pass
to_netcdf() can succeed or fail here, so it doesn't affect the test at all? Use:

with pytest.raises(PermissionError) as e:
    ds.to_netcdf(nc_copy)
assert .... in str(e.value)
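The assert-that-it-raises pattern suggested above can be sketched with a stdlib-only stand-in for pytest.raises (illustrative only; real tests should use pytest itself, and the error message here is made up):

```python
from contextlib import contextmanager

@contextmanager
def raises(exc_type):
    """Tiny stand-in for pytest.raises: fail unless exc_type is raised."""
    caught = {}
    try:
        yield caught
    except exc_type as exc:
        caught["value"] = exc  # swallow the exception, keep it for inspection
    else:
        raise AssertionError(f"{exc_type.__name__} was not raised")

with raises(PermissionError) as e:
    raise PermissionError("file is locked by a lazy reader")

assert "locked" in str(e["value"])
```

The key difference from the try/except-pass version: if to_netcdf() unexpectedly succeeds, this pattern fails the test instead of silently passing.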
Good one, I was looking for something like that, but then forgot
ds = utils.xu_open_dataset(nc_copy)

# Convert to dataset
ds = ds.ugrid.to_dataset()
Here you are overwriting the ds variable with another object, which could trigger the garbage collector and release the files. I'm not sure if this breaks anything, but I would stay away from using the same variable name for two different objects.
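The rebinding hazard described here can be shown with a small stdlib-only sketch (the Handle class and file names are stand-ins, not xarray; in CPython an object is finalized as soon as its last reference disappears):

```python
import gc

finalized = []

class Handle:
    """Stand-in for an open dataset that owns a file handle."""
    def __init__(self, name):
        self.name = name

    def __del__(self):
        # Runs once the last reference is gone; a real dataset
        # would release its underlying file here.
        finalized.append(self.name)

ds = Handle("subgrid.nc")       # first object bound to `ds`
ds = Handle("subgrid_copy.nc")  # rebinding drops the only reference to the first
gc.collect()                    # immediate in CPython; collect() covers other runtimes

assert finalized == ["subgrid.nc"]
```

So whether the first object's file is still open after the rebinding depends on finalization timing, which is exactly why reusing the name is fragile.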
Pretty sure this happens in more places in the code, I could change that.
def check_exists_and_lazy(ds, file_name):
    """If a netcdf file is read lazily, the file cannot be overwritten.
    This function checks whether the file already exists; if so, it checks
    if the data is lazily loaded. If so, the data should be loaded before writing.

    Parameters
    ----------
    ds : xarray.Dataset, xu.UgridDataset
        The dataset to be written to a netcdf file.
    file_name : str
        The path to the netcdf file.
    """
    if not os.path.exists(file_name):
        return

    # Check for lazy loading
    lazy_vars = [not data_array._in_memory for data_array in ds.data_vars.values()]

    # if all(lazy_vars):
    #     return  # All variables are lazy-loaded, skip writing?

    if any(lazy_vars):
        ds.load()  # Some variables are lazy-loaded, load them into memory
    return
Make sure to call ds.close() on the file after loading the contents.
We discussed earlier that ds.close() does not work for xugrid.UgridDatasets right?
Ah yes, I keep forgetting.
Unfortunately, for the same reason ds.load() will also not work then :s
Issue addressed
When reading in a SFINCS model with netcdf files (as grid (future), subgrid, or forcing), these datasets are lazily loaded. In theory, that increases performance, because data is only loaded when necessary. However, due to lazy references to the data in the original files, the files cannot be overwritten when you write the model.
Of course, you could choose to only write the files that you changed, but often SfincsModel.write() is used.
Explanation
Everywhere we would like to write a netcdf file, we first check whether the file already exists. If it does, we load the data into memory first, so that we can overwrite the file afterwards (after loading, the lazy references are gone).
Additional Notes (optional)
The question is whether we even want to load and overwrite datasets that are still lazy, because in most cases that means they are unchanged, and loading/writing would result in unnecessary computations.
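This trade-off matches the commented-out all(lazy_vars) branch in the helper. The decision logic could be sketched as follows (FakeVar is a stand-in and the strategy names are made up, not a final design):

```python
class FakeVar:
    def __init__(self, in_memory):
        self._in_memory = in_memory

def write_strategy(data_vars):
    """Return what a write routine could do for a set of variables."""
    lazy = [not v._in_memory for v in data_vars]
    if lazy and all(lazy):
        return "skip"             # fully lazy: on-disk data is untouched, rewriting is redundant
    if any(lazy):
        return "load_then_write"  # mixed: pull lazy parts into memory, then overwrite
    return "write"                # everything already in memory

print(write_strategy([FakeVar(False), FakeVar(False)]))  # skip
print(write_strategy([FakeVar(False), FakeVar(True)]))   # load_then_write
print(write_strategy([FakeVar(True), FakeVar(True)]))    # write
```

The "skip" branch is the open question: it avoids the unnecessary load/write round-trip, at the cost of assuming a fully lazy dataset was never modified.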