An Anomaly With NetCDF Bundling #42

Closed
AFg6K7h4fhy2 opened this issue Nov 19, 2024 · 1 comment · Fixed by #48
Labels: bug (Something does not work as expected.), Low Priority (A task that is of lower relative priority.)

Comments

@AFg6K7h4fhy2 (Collaborator)

This issue describes a previously undetected error introduced by the author's update to __init__.py.

Presently, pyproject.toml contains the following (which the author believes is correct):

packages = [{include = "forecasttools"}]
include = [
    { path = "forecasttools/location_table.parquet", format = "sdist" },
    { path = "forecasttools/location_table.parquet", format = "wheel" },
    { path = "forecasttools/example_flusight_submission.parquet", format = "sdist" },
    { path = "forecasttools/example_flusight_submission.parquet", format = "wheel" },
    { path = "forecasttools/example_flu_forecast_wo_dates.nc", format = "sdist" },
    { path = "forecasttools/example_flu_forecast_wo_dates.nc", format = "wheel" },
    { path = "forecasttools/example_flu_forecast_w_dates.nc", format = "sdist" },
    { path = "forecasttools/example_flu_forecast_w_dates.nc", format = "wheel" },
    { path = "forecasttools/nhsn_hosp_COVID.parquet", format = "sdist" },
    { path = "forecasttools/nhsn_hosp_COVID.parquet", format = "wheel" },
    { path = "forecasttools/nhsn_hosp_flu.parquet", format = "sdist" },
    { path = "forecasttools/nhsn_hosp_flu.parquet", format = "wheel" },
]

However, the following:

# location table (from Census data)
with importlib.resources.files(__package__).joinpath(
    "location_table.parquet"
).open("rb") as f:
    location_table = pl.read_parquet(f)

# load example flusight submission
with importlib.resources.files(__package__).joinpath(
    "example_flusight_submission.parquet"
).open("rb") as f:
    example_flusight_submission = pl.read_parquet(f)

# load example fitting data for COVID (NHSN, as of 2024-09-26)
with importlib.resources.files(__package__).joinpath(
    "nhsn_hosp_COVID.parquet"
).open("rb") as f:
    nhsn_hosp_COVID = pl.read_parquet(f)

# load example fitting data for influenza (NHSN, as of 2024-09-26)
with importlib.resources.files(__package__).joinpath(
    "nhsn_hosp_flu.parquet"
).open("rb") as f:
    nhsn_hosp_flu = pl.read_parquet(f)

# load light idata NHSN influenza forecast wo dates (NHSN, as of 2024-09-26)
with importlib.resources.files(__package__).joinpath(
    "example_flu_forecast_wo_dates.nc"
).open("rb") as f:
    nhsn_flu_forecast_wo_dates = az.from_netcdf(f)

# load light idata NHSN influenza forecast w dates (NHSN, as of 2024-09-26)
with importlib.resources.files(__package__).joinpath(
    "example_flu_forecast_w_dates.nc"
).open("rb") as f:
    nhsn_flu_forecast_w_dates = az.from_netcdf(f)

when converted back to this:

# location table (from Census data)
with importlib.resources.path(
    __package__, "location_table.parquet"
) as data_path:
    location_table = pl.read_parquet(data_path)

# load example flusight submission
with importlib.resources.path(
    __package__, "example_flusight_submission.parquet"
) as data_path:
    dtypes_d = {"location": pl.Utf8}
    example_flusight_submission = pl.read_parquet(data_path)

# load example fitting data for COVID (NHSN, as of 2024-09-26)
with importlib.resources.path(
    __package__, "nhsn_hosp_COVID.parquet"
) as data_path:
    nhsn_hosp_COVID = pl.read_parquet(data_path)

# load example fitting data for influenza (NHSN, as of 2024-09-26)
with importlib.resources.path(
    __package__, "nhsn_hosp_flu.parquet"
) as data_path:
    nhsn_hosp_flu = pl.read_parquet(data_path)

# load light idata NHSN influenza forecast wo dates (NHSN, as of 2024-09-26)
with importlib.resources.path(
    __package__, "example_flu_forecast_wo_dates.nc"
) as data_path:
    nhsn_flu_forecast_wo_dates = az.from_netcdf(data_path)

# load light idata NHSN influenza forecast w dates (NHSN, as of 2024-09-26)
with importlib.resources.path(
    __package__, "example_flu_forecast_w_dates.nc"
) as data_path:
    nhsn_flu_forecast_w_dates = az.from_netcdf(data_path)

solved the following error:

File ~/Library/Caches/pypoetry/virtualenvs/forecasttools-06IzlalN-py3.12/lib/python3.12/site-packages/xarray/backends/file_manager.py:211, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
    210 try:
--> 211     file = self._cache[self._key]
    212 except KeyError:

File ~/Library/Caches/pypoetry/virtualenvs/forecasttools-06IzlalN-py3.12/lib/python3.12/site-packages/xarray/backends/lru_cache.py:56, in LRUCache.__getitem__(self, key)
     55 with self._lock:
---> 56     value = self._cache[key]
     57     self._cache.move_to_end(key)

KeyError: [<class 'h5netcdf.core.File'>, (<_io.BufferedReader name='/Users/trevormartin/Documents/GitHub/CDC-CFA-Invested/forecasttools-py/forecasttools/example_flu_forecast_w_dates.nc'>,), 'r', (('decode_vlen_strings', True), ('driver', None), ('invalid_netcdf', None)), '72e69dc5-b04e-41e9-bab1-159720327227']

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[2], line 12
      6 # xr.set_options(display_expand_data=False, display_expand_attrs=False)
      7 
      8 # load example forecast(s)
      9 idata = forecasttools.nhsn_flu_forecast_w_dates
---> 12 forecast_df = forecasttools.idata_forecast_w_dates_to_df(
     13     idata_w_dates=idata,
     14     location="TX",
...
File h5py/h5fd.pyx:154, in h5py.h5fd.H5FD_fileobj_get_eof()

File h5py/h5fd.pyx:154, in h5py.h5fd.H5FD_fileobj_get_eof()

ValueError: seek of closed file
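
The final `ValueError: seek of closed file`, together with the `_io.BufferedReader` in the `KeyError`, suggests that the data is read lazily from a file handle that has already been closed by its `with` block. The following is a minimal stdlib-only sketch of that suspected mechanism, not a reproduction of what xarray or ArviZ actually do internally. `LazyReader` is a hypothetical stand-in for a lazily loaded dataset, and the temporary file stands in for the bundled `.nc` file.

```python
import os
import tempfile


class LazyReader:
    """Hypothetical stand-in for a lazily loaded dataset: it keeps a
    reference to its source and only reads bytes when asked."""

    def __init__(self, source):
        self.source = source  # an open file object OR a filesystem path

    def values(self):
        if isinstance(self.source, (str, os.PathLike)):
            # Path-backed: the file can be reopened on demand.
            with open(self.source, "rb") as f:
                return f.read()
        # Handle-backed: only works while the handle is still open.
        self.source.seek(0)
        return self.source.read()


# Create a small throwaway data file (assumption: stands in for the .nc file).
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(b"forecast")
    data_path = tmp.name

# Pattern 1: pass an open handle, then leave the with block (handle closes).
with open(data_path, "rb") as f:
    from_handle = LazyReader(f)

# Pattern 2: pass a path, which remains valid after any with block exits.
from_path = LazyReader(data_path)

try:
    from_handle.values()
    handle_error = None
except ValueError as exc:
    # Reading through the closed handle fails.
    handle_error = str(exc)

path_values = from_path.values()
os.remove(data_path)
```

If this is indeed the cause, it would explain why the parquet files are unaffected (they are read fully inside the `with` block) and why passing a path rather than an open handle avoids the error.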

The contents of __init__.py ought to be investigated further.
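
One option worth investigating, offered as a sketch and not necessarily the change made in the fixing PR, is to keep the `importlib.resources.files(...)` API but obtain a real filesystem path with `importlib.resources.as_file` instead of an open handle. The package name `demo_pkg_with_data` and the file `example.bin` below are hypothetical and are created on the fly so the example is self-contained; in `__init__.py` the read step would be replaced by the actual loader call.

```python
import importlib
import importlib.resources
import pathlib
import sys
import tempfile

# Build a throwaway package on disk (hypothetical names, for illustration only).
root = pathlib.Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg_with_data"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "example.bin").write_bytes(b"netcdf-bytes")

# Make the throwaway package importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Locate the bundled resource without opening it.
resource = importlib.resources.files("demo_pkg_with_data").joinpath(
    "example.bin"
)

# as_file yields a filesystem path rather than an open file object.
with importlib.resources.as_file(resource) as data_path:
    path_type_ok = isinstance(data_path, pathlib.Path)
    # In __init__.py, a loader that accepts a path would be called here.
    loaded = pathlib.Path(data_path).read_bytes()
```

One caveat: when a package is not installed as plain files (for example, when imported from a zip archive), `as_file` extracts the resource to a temporary file that is removed when the `with` block exits, so a lazily loaded dataset could still fail later in that case. Loading the data eagerly inside the block would avoid depending on the path's lifetime.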

AFg6K7h4fhy2 added the bug label on Nov 19, 2024
AFg6K7h4fhy2 self-assigned this on Nov 19, 2024
AFg6K7h4fhy2 added the Low Priority label on Nov 25, 2024
@AFg6K7h4fhy2 (Collaborator, Author)

Note that the above error was first detected while developing the demonstration for NNH in PR #40.
