Update xcdat_open() #1212

Merged
merged 4 commits into from
Dec 23, 2024

Conversation

acordonez
Collaborator

@acordonez acordonez commented Dec 18, 2024

  1. Pass a "chunks" parameter to xcdat_open so that drivers can set the chunk size if needed. The default is {} (see the xarray.open_dataset() documentation for details).
  2. Catch cases where a dataset cannot be opened because of non-CF-compliant attributes. This currently handles only the case where the calendar name contains "-" instead of "_", but it can easily be expanded to handle more cases.

xcdat_open is currently used by the mean climate metrics, the modes of variability metrics, the MJO metrics, and the monsoon (Sperber) metrics.
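
As a rough illustration of item 2, the calendar repair could look something like the sketch below. The helper name `fix_calendar_name` and the `CF_CALENDARS` set are hypothetical and not taken from this PR; the set is only a subset of the calendar names defined by the CF conventions, and the actual change may detect and repair the attribute differently.

```python
# Hypothetical helper, not the code merged in this PR.
# A subset of calendar names defined by the CF conventions.
CF_CALENDARS = {
    "standard", "gregorian", "proleptic_gregorian", "julian",
    "noleap", "365_day", "all_leap", "366_day", "360_day", "none",
}


def fix_calendar_name(calendar: str) -> str:
    """Repair a calendar name that uses '-' where CF expects '_'."""
    candidate = calendar.replace("-", "_")
    if candidate not in CF_CALENDARS:
        # Other kinds of non-compliance would need their own handling.
        raise ValueError(f"Unrecognized calendar name: {calendar!r}")
    return candidate
```

A driver could apply such a helper to the time coordinate's calendar attribute after opening the file with time decoding disabled, and then decode the time axis again.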

@acordonez acordonez marked this pull request as ready for review December 19, 2024 00:04
@acordonez acordonez requested a review from lee1043 December 19, 2024 00:05
@acordonez
Collaborator Author

@lee1043 This is low priority to review. For testing I've run Demos 1b, 2b, 4, and 5 as it looks like those drivers use the xcdat_open() function.

@lee1043 lee1043 added this to the 3.8.1 milestone Dec 23, 2024
Contributor

@lee1043 lee1043 left a comment


Looks good to me. I also checked with the Demo 4 notebook, which ran without any issues. Thanks for the PR, @acordonez!


 def xcdat_open(
-    infile: Union[str, list], data_var: str = None, decode_times: bool = True
+    infile: Union[str, list], data_var: str = None, decode_times: bool = True, chunks={}
Contributor

@acordonez thank you for the PR! It looks good to me in general.

This may not be a big deal, but I wonder if this could be chunks=None, to be consistent with the default of the underlying function, xarray.open_mfdataset.

Collaborator Author

@lee1043 The extremes and drcdm metrics need to be able to specify the chunks to ensure that the time axis is continuous across a single chunk.
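
One way a driver might request this is sketched below, assuming the time dimension is named "time". The helper names are hypothetical and do not come from the extremes or drcdm code. In dask and xarray chunk specifications, a chunk size of -1 means the dimension is not split; dimensions left out of the mapping follow xarray's own defaults.

```python
from typing import Dict


def contiguous_time_chunks(time_dim: str = "time") -> Dict[str, int]:
    """Build a chunks mapping that keeps the whole time axis in one dask chunk."""
    # -1 tells dask/xarray not to split this dimension.
    return {time_dim: -1}


def open_with_contiguous_time(path: str, time_dim: str = "time"):
    """Open a dataset with dask, without splitting the time axis (hypothetical)."""
    # Imported lazily so the helper above has no hard dependency on xarray.
    import xarray as xr

    return xr.open_dataset(path, chunks=contiguous_time_chunks(time_dim))
```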

Collaborator Author

@lee1043 Would chunks=None mean that no chunking is used by default? That might be fine?

Contributor

> @lee1043 The extremes and drcdm metrics need to be able to specify the chunks to ensure that the time axis is continuous across a single chunk.

That sounds like a good reason to keep the PR as it is. Thanks for the comment!

Contributor

@lee1043 lee1043 Jan 6, 2025


@acordonez I just found that chunks={} as the default causes an error in the modes of variability code, because it opens the dataset with dask chunks. I can make modes of variability a special case by passing chunks=None to xcdat_open, but I haven't tested the other metrics. If dask chunks are needed for only a few metrics, how about setting the default to chunks=None and having those special cases call xcdat_open with chunks={}?

Collaborator Author

@lee1043 That suggestion sounds good to me.
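
For reference, the xarray.open_dataset() documentation describes chunks=None as opening the data without dask, and chunks={} as loading it with dask using the engine's preferred chunk sizes. A minimal sketch of the proposed default follows. It is not the actual xcdat_open implementation: `xcdat_open_sketch` and its `opener` argument are hypothetical, the latter added only so the example can run without data files, and it assumes that xcdat.open_dataset and xcdat.open_mfdataset forward extra keyword arguments such as chunks to xarray.

```python
from typing import Any, Callable, Optional, Union


def xcdat_open_sketch(
    infile: Union[str, list],
    data_var: Optional[str] = None,
    decode_times: bool = True,
    chunks: Any = None,  # None: no dask chunking unless the caller asks for it
    opener: Optional[Callable[..., Any]] = None,  # hypothetical injection point
):
    """Open one file or a list of files, passing `chunks` through unchanged."""
    if opener is None:
        # Imported lazily so the sketch can be exercised without xcdat installed.
        import xcdat

        opener = xcdat.open_mfdataset if isinstance(infile, list) else xcdat.open_dataset
    return opener(infile, data_var=data_var, decode_times=decode_times, chunks=chunks)
```

Under this arrangement, most drivers would call the function without a chunks argument, while the few metrics that need dask would pass chunks={} or an explicit mapping.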

@lee1043 lee1043 merged commit 9205c7f into main Dec 23, 2024
4 checks passed
@lee1043 lee1043 deleted the ao_xcdat_open branch December 23, 2024 22:04