Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve critical warnings revealed in tests #1439

Open
5 of 7 tasks
leewujung opened this issue Feb 2, 2025 · 11 comments
Open
5 of 7 tasks

Resolve critical warnings revealed in tests #1439

leewujung opened this issue Feb 2, 2025 · 11 comments
Assignees
Labels
dependencies Anything related to dependencies

Comments

@leewujung
Copy link
Member

leewujung commented Feb 2, 2025

@leewujung leewujung added the dependencies Anything related to dependencies label Feb 2, 2025
@github-project-automation github-project-automation bot moved this to Todo in Echopype Feb 2, 2025
@ctuguinay ctuguinay self-assigned this Feb 3, 2025
@ctuguinay
Copy link
Collaborator

Warning: Exception ignored in: <function EchoData.del at 0x7f32eba4d440>

Traceback (most recent call last):
File "/home/runner/work/echopype/echopype/echopype/echodata/echodata.py", line 105, in del
self.cleanup_swap_files()
File "/home/runner/work/echopype/echopype/echopype/echodata/echodata.py", line 94, in cleanup_swap_files
fs = zarr_stores[0].fs
~~~~~~~~~~~^^^
IndexError: list index out of range

The __del__ function in Echodata calls cleanup_swap_files which is only used to cleanup files created when use_swap=True in ep.open_raw. It does this by looking at all Dask arrays in Echodata and checks if original-from-zarr task name is in this graph and file associated with this to delete. However, when we chunk the echodata object calling ed.chunk or we explicitly set a data variable in the Echodata object to contain a Dask array (which was done in the test_check_echodata_backscatter_size) we still create a Dask array but none of these contain this task name original-from-zarr since the dask array is creating not from using a swap file.

@ctuguinay
Copy link
Collaborator

For this one:

Warning: The return type of Dataset.dims will be changed to return a set of dimension names in future, in order to be more consistent with DataArray.dims. To access a mapping from dimension names to lengths, please use Dataset.sizes.

I'm thinking of just replacing all .dims with .sizes instead since that's consistent between Datasets and DataArrays.

@leewujung
Copy link
Member Author

For this one:

Warning: The return type of Dataset.dims will be changed to return a set of dimension names in future, in order to be more consistent with DataArray.dims. To access a mapping from dimension names to lengths, please use Dataset.sizes.

I'm thinking of just replacing all .dims with .sizes instead since that's consistent between Datasets and DataArrays.

I couldn't find .dims usage that fits this description, so was a little confused, but maybe I missed something...

@leewujung
Copy link
Member Author

The __del__ function in Echodata calls cleanup_swap_files which is only used to cleanup files created when use_swap=True in ep.open_raw. It does this by looking at all Dask arrays in Echodata and checks if original-from-zarr task name is in this graph and file associated with this to delete. However, when we chunk the echodata object calling ed.chunk or we explicitly set a data variable in the Echodata object to contain a Dask array (which was done in the test_check_echodata_backscatter_size) we still create a Dask array but none of these contain this task name original-from-zarr since the dask array is creating not from using a swap file.

I see. But how come we don't usually error out? .cleanup_swap_files is only called if self.converted_raw_path is None:, but this will be None if the EchoData object is just converted in memory and not read from a file.

Also noticed that __read_converted is not used and seems to be superseded by from_file - do you see the same thing?

@ctuguinay
Copy link
Collaborator

I see. But how come we don't usually error out? .cleanup_swap_files is only called if self.converted_raw_path is None:, but this will be None if the EchoData object is just converted in memory and not read from a file.

Oh that's a good point, I think it's because the exception is wrapped by a warning. I wonder at what time the __del__ is called and why it's wrapped by a warning...I'll look into this.

Also noticed that __read_converted is not used and seems to be superseded by from_file - do you see the same thing?

Oh yeah I'm confused by that too...

Ok, I'll try to make sense of all of this today. I never really looked at the guts of the Echodata object and how it interacts with Python's garbage collection 😵

@ctuguinay
Copy link
Collaborator

ctuguinay commented Feb 4, 2025

I couldn't find .dims usage that fits this description, so was a little confused, but maybe I missed something...

Oh yeah here's two that I found in echopype/tests/convert/test_convert_ek80.py:

@pytest.mark.unit
def test_skip_ec150(ek80_path):
    """Make sure we skip EC150 datagrams correctly."""
    ek80_mru1_path = str(ek80_path.joinpath("RL2407_ADCP-D20240709-T150437.raw"))
    echodata = open_raw(raw_file=ek80_mru1_path, sonar_model='EK80')

    assert "EC150" not in echodata["Sonar/Beam_group1"]["channel"].values
    assert "backscatter_i" in echodata["Sonar/Beam_group1"].data_vars
    assert (
        echodata["Sonar/Beam_group1"].dims
        == {'channel_all': 1, 'beam_group': 1, 'channel': 1, 'ping_time': 2, 'range_sample': 115352, 'beam': 4}
    )


@pytest.mark.unit
def test_parse_mru0_mru1(ek80_path):
    """Make sure we parse the MRU0 and MRU1 datagrams correctly from the SWFSC RAW file."""
    ek80_mru1_path = str(ek80_path.joinpath("RL2407_ADCP-D20240709-T150437.raw"))
    echodata = open_raw(raw_file=ek80_mru1_path, sonar_model='EK80')

    # Check dimensions
    assert (
        echodata["Platform"].dims
        == {'channel': 1, 'time1': 1, 'time2': 43, 'time3': 43}
    )

I changed this to:

@pytest.mark.unit
def test_skip_ec150(ek80_path):
    """Make sure we skip EC150 datagrams correctly."""
    ek80_mru1_path = str(ek80_path.joinpath("RL2407_ADCP-D20240709-T150437.raw"))
    echodata = open_raw(raw_file=ek80_mru1_path, sonar_model='EK80')

    assert "EC150" not in echodata["Sonar/Beam_group1"]["channel"].values
    assert "backscatter_i" in echodata["Sonar/Beam_group1"].data_vars
    assert (
        echodata["Sonar/Beam_group1"].sizes
        == {'channel_all': 1, 'beam_group': 1, 'channel': 1, 'ping_time': 2, 'range_sample': 115352, 'beam': 4}
    )


@pytest.mark.unit
def test_parse_mru0_mru1(ek80_path):
    """Make sure we parse the MRU0 and MRU1 datagrams correctly from the SWFSC RAW file."""
    ek80_mru1_path = str(ek80_path.joinpath("RL2407_ADCP-D20240709-T150437.raw"))
    echodata = open_raw(raw_file=ek80_mru1_path, sonar_model='EK80')

    # Check dimensions
    assert (
        echodata["Platform"].sizes
        == {'channel': 1, 'time1': 1, 'time2': 43, 'time3': 43}
    )

@ctuguinay
Copy link
Collaborator

Ok after looking at this again, I'm thinking of just removing read_converted and and load_file since they are not called and they are superceded by both from_file and load_tree.

@ctuguinay
Copy link
Collaborator

A yup I found where it says exceptions are wrapped by warnings here https://docs.python.org/3/reference/datamodel.html#object.__del__:

Image

@ctuguinay
Copy link
Collaborator

ctuguinay commented Feb 5, 2025

Warning: Supplying chunks as dimension-order tuples is deprecated. It will raise an error in the future. Instead use a dict with dimension names as keys.

Look like one of the functions creating this problem is not currently being used (only called in a test): https://github.com/search?q=repo%3AOSOceanAcoustics%2Fechopype%20_get_auto_chunk&type=code

I'm thinking of just removing it.

Edit: Nevermind, it is still being used by set_zarr_encodings.

@ctuguinay
Copy link
Collaborator

I think I'll wait until #1447 is merged to deal with this one:

Warning: Failed to open Zarr store with consolidated metadata

since it's associated with open_converted

@ctuguinay
Copy link
Collaborator

ctuguinay commented Feb 16, 2025

There's also this additional warning:

https://github.com/OSOceanAcoustics/echopype/actions/runs/13098048680/job/36542811253?pr=1429#step:15:944

Warning: In a future version of xarray decode_timedelta will default to False rather than None. To silence this warning, set decode_timedelta to True, False, or a 'CFTimedeltaCoder' instance.

PR: #1462

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Anything related to dependencies
Projects
Status: Todo
Development

Successfully merging a pull request may close this issue.

2 participants