[BUG] Better logic for detection of particle handle for checkpoint files #5019

Merged: 4 commits, Oct 6, 2024

Conversation

@jzuhone (Contributor) commented Oct 4, 2024

PR Summary

We do not support particle data in FLASH data before version 3, but we do support reading in FLASH 2.x datasets.

The current logic for finding the FLASH particle file that corresponds to a FLASH plotfile replaces the string "plt_cnt" in the filename with "part" (so "hdf5_plt_cnt" becomes "hdf5_part") and then creates a file handler for the particle file:

if self.particle_filename is None:
    # try to guess the particle filename
    try:
        self._particle_handle = HDF5FileHandler(
            filename.replace("plt_cnt", "part")
        )
        self.particle_filename = filename.replace("plt_cnt", "part")
        mylog.info(
            "Particle file found: %s", self.particle_filename.split("/")[-1]
        )
    except OSError:
        self._particle_handle = self._handle
else:
    # particle_filename is specified by user
    self._particle_handle = HDF5FileHandler(self.particle_filename)

This works for plot and particle files that match, but it misses checkpoint files, whose names contain "hdf5_chk" instead. The miss is silent: the filename.replace("plt_cnt", "part") call above returns the filename unchanged, so a second file handler is opened for the same input file.
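The silent fall-through comes from Python's `str.replace`, which returns the input unchanged when the substring is absent. A minimal illustration, assuming a hypothetical plotfile name alongside the checkpoint name used in this description:

```python
# The plotfile name is hypothetical; the checkpoint name is the one
# loaded in this PR description.
plot_name = "co2djj_hdf5_plt_cnt_2396"
chk_name = "co2djj_hdf5_chk_2396"

guessed_from_plot = plot_name.replace("plt_cnt", "part")
guessed_from_chk = chk_name.replace("plt_cnt", "part")

print(guessed_from_plot)             # co2djj_hdf5_part_2396
print(guessed_from_chk == chk_name)  # True: nothing was replaced
```

Because the "guessed" name is the checkpoint file itself, opening it succeeds and no OSError is raised.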

Shortly afterwards the two file handles are compared. They compare as unequal, because the same file has been opened twice through two separate handlers, so the time check runs anyway:

# Check if the particle file has the same time
if self._particle_handle != self._handle:
    part_time = self._particle_handle.handle.get("real scalars")[0][1]
    plot_time = self._handle.handle.get("real scalars")[0][1]
    if not np.isclose(part_time, plot_time):
        self._particle_handle = self._handle
        mylog.warning(
            "%s and %s are not at the same time. "
            "This particle file will not be used.",
            self.particle_filename,
            filename,
        )
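To see why the `!=` comparison is true for two handlers on the same file, here is a small sketch. `FileHandlerSketch` is a hypothetical stand-in, and the sketch assumes the real handler class does not define a custom `__eq__`, in which case Python falls back to comparing object identity:

```python
class FileHandlerSketch:
    """Hypothetical stand-in for a file handler; defines no __eq__."""

    def __init__(self, filename):
        self.filename = filename


a = FileHandlerSketch("co2djj_hdf5_chk_2396")
b = FileHandlerSketch("co2djj_hdf5_chk_2396")

# Same filename, but two distinct objects: the default comparison is by identity.
handles_differ = a != b
print(handles_differ)  # True
```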

This does not cause a problem for FLASH 3 files (aside from the extra, unnecessary file handle), but FLASH 2.5 files fail at this check because they do not have the "real scalars" HDF5 dataset:

import yt
ds = yt.load("co2djj_hdf5_chk_2396")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[2], line 1
----> 1 ds = yt.load("co2djj_hdf5_chk_2396")

File ~/Source/yt/yt/_maintenance/deprecation.py:69, in future_positional_only.<locals>.outer.<locals>.inner(*args, **kwargs)
     60     value = kwargs[name]
     61     issue_deprecation_warning(
     62         f"Using the {name!r} argument as keyword (on position {no}) "
     63         "is deprecated. "
   (...)
     67         **depr_kwargs,
     68     )
---> 69 return func(*args, **kwargs)

File ~/Source/yt/yt/loaders.py:149, in load(fn, hint, *args, **kwargs)
    141     if missing := cls._missing_load_requirements():
    142         warnings.warn(
    143             f"This dataset appears to be of type {cls.__name__}, "
    144             "but the following requirements are currently missing: "
   (...)
    147             stacklevel=3,
    148         )
--> 149     return cls(fn, *args, **kwargs)
    151 if len(candidates) > 1:
    152     raise YTAmbiguousDataType(_input_fn, candidates)

File ~/Source/yt/yt/frontends/flash/data_structures.py:210, in FLASHDataset.__init__(self, filename, dataset_type, storage_filename, particle_filename, units_override, unit_system, default_species_fields)
    208 # Check if the particle file has the same time
    209 if self._particle_handle != self._handle:
--> 210     part_time = self._particle_handle.handle.get("real scalars")[0][1]
    211     plot_time = self._handle.handle.get("real scalars")[0][1]
    212     if not np.isclose(part_time, plot_time):

TypeError: 'NoneType' object is not subscriptable
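The TypeError arises because `.get()` on an h5py file or group, like `dict.get`, returns `None` by default for a missing name, and `None` cannot be indexed. The sketch below uses plain dicts as stand-ins for the opened files; the dataset contents and the second key are made up for illustration:

```python
# Plain dicts stand in for opened HDF5 files (assumption for illustration).
# The record layout and values are invented, not real FLASH data.
flash3_like = {"real scalars": [("scalar name", 0.25)]}
flash2_like = {"hypothetical other dataset": []}

print(flash3_like.get("real scalars")[0][1])  # 0.25

try:
    flash2_like.get("real scalars")[0][1]
except TypeError as err:
    error_message = str(err)
    print(error_message)
```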

This PR addresses the problem by 1) making sure that files with "hdf5_chk" in the filename are handled properly and 2) checking explicitly for files without "real scalars".
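As a rough sketch of those two ideas (not the code merged in this PR), the helpers below are hypothetical: `guess_particle_filename` and `read_time` do not exist in yt, and the choice to read checkpoint files through the main handle is an assumption made for illustration.

```python
def guess_particle_filename(filename):
    """Return a candidate particle filename, or None if no separate
    particle file should be tried. Hypothetical helper, not part of yt."""
    if "hdf5_chk" in filename:
        # Assumption: a checkpoint file is read through the main handle,
        # so no second handler is opened.
        return None
    if "plt_cnt" in filename:
        return filename.replace("plt_cnt", "part")
    return None


def read_time(datasets):
    """Return the time scalar, or None when "real scalars" is absent.

    `datasets` is any object with a dict-like .get() method.
    """
    scalars = datasets.get("real scalars")
    if scalars is None:
        return None
    return scalars[0][1]


print(guess_particle_filename("co2djj_hdf5_chk_2396"))  # None
print(read_time({}))                                     # None
```

A caller would skip the time comparison whenever either helper returns `None`, which avoids both the duplicate handle and the failed subscript.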

PR Checklist

  • New features are documented, with docstrings and narrative docs
  • Adds a test for any bugs fixed. Adds tests for new features.

@jzuhone added the bug and code frontends labels Oct 4, 2024
@neutrinoceros merged commit 2e65cb9 into yt-project:main Oct 6, 2024
13 checks passed
@neutrinoceros added this to the 4.4.0 milestone Oct 6, 2024
2 participants