Climatological op (#290)
### Pull Request Checklist:
- [ ] This PR addresses an already opened issue (for bug fixes /
features)
    - This PR fixes #xyz
- [ ] (If applicable) Documentation has been added / updated (for bug
fixes / features).
- [x] (If applicable) Tests have been added.
- [x] This PR does not seem to break the templates.
- [x] HISTORY.rst has been updated (with summary of main changes).
- [x] Link to issue (:issue:`number`) and pull request (:pull:`number`)
has been added.

### What kind of change does this PR introduce?
xscen.aggregate.climatological_mean is replaced by
xscen.aggregate.climatological_op.

climatological_op allows operations ('op') other than 'mean' to be applied
to the input dataset.
Operations implemented: ['max', 'mean', 'median', 'min', 'std', 'sum', 'var', 'linregress']

other additions:
- argument 'min_periods' can be passed as a real value with 0 < value <= 1 to
set the minimum fraction of a period's values that must be available for the calculation.
- argument 'interval' has been renamed to 'stride'
- flag 'rename_variables' == True will rename variables in output to
{input_var_name}\_clim\_{op} to facilitate combining output from
multiple operations in one ds.
- flag 'horizons_as_dim' == True will restructure the output with 'horizon'
and, when {freq} != 'year', {freq} as new coordinates and dimensions.
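
To make the arguments above concrete, here is a minimal pure-Python sketch of the bookkeeping they imply. It is not the xscen implementation: the helper names (`resolve_min_periods`, `horizon_labels`, `clim_name`) are hypothetical, rounding the fraction up and keeping only complete windows are assumptions, and only the `{input_var_name}_clim_{op}` naming pattern is taken from the description.

```python
import math


def resolve_min_periods(min_periods, window):
    """Hypothetical helper: turn a fraction in (0, 1] into a number of years.

    Assumption: a fractional value is multiplied by the window and rounded up.
    """
    if isinstance(min_periods, float):
        if not 0 < min_periods <= 1:
            raise ValueError("A fractional min_periods must satisfy 0 < value <= 1.")
        return math.ceil(min_periods * window)
    return int(min_periods)


def horizon_labels(first_year, last_year, window, stride):
    """Hypothetical helper: 'YYYY-YYYY' labels, one complete window every `stride` years."""
    labels = []
    start = first_year
    while start + window - 1 <= last_year:
        labels.append(f"{start}-{start + window - 1}")
        start += stride
    return labels


def clim_name(var_name, op):
    """Output name used when rename_variables is True (pattern from the description)."""
    return f"{var_name}_clim_{op}"


# Values mirroring the template configuration: 30-year windows every 10 years.
labels = horizon_labels(1951, 2100, window=30, stride=10)
required_years = resolve_min_periods(0.5, window=30)
new_name = clim_name("tg_mean", "std")
```

With these inputs the first window is 1951-1980, the last complete one is 2071-2100, and half of a 30-year window corresponds to 15 required years.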

Modifies the "2 Getting Started" and "6 Config" notebooks to use climatological_op with
option 'mean'.

### Does this PR introduce a breaking change?
No, climatological_mean is retained and calls climatological_op.
Tests for climatological_mean were replaced by a single test for the future warning.
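
The retained-wrapper pattern described here can be sketched as follows. This is an illustration only: the `_sketch` functions are hypothetical stand-ins that operate on plain lists, not the xscen functions, and the warning message is invented for the example.

```python
import warnings


def climatological_op_sketch(values, op="mean"):
    """Stand-in for the new function; only three operations are sketched."""
    ops = {"mean": lambda v: sum(v) / len(v), "min": min, "max": max}
    return ops[op](values)


def climatological_mean_sketch(values):
    """Deprecated entry point: warn, then forward to the new function."""
    warnings.warn(
        "climatological_mean_sketch is deprecated; "
        "use climatological_op_sketch with op='mean' instead.",
        FutureWarning,
        stacklevel=2,
    )
    return climatological_op_sketch(values, op="mean")


# One check covers both the warning and the forwarded result.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = climatological_mean_sketch([1.0, 2.0, 3.0])
```

Because the old name only forwards the call, existing code keeps working while the warning points users to the replacement.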

### Other information:
Uses a wrapper for scipy.stats.linregress for trend calculation with xarray.
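
One way such a wrapper can be written is with `xarray.apply_ufunc`, as in the sketch below. It is not the code from this PR: the helper names, the `linreg_param` dimension name, the choice of five statistics, and the use of index positions as the regressor are all assumptions made for illustration.

```python
import numpy as np
import xarray as xr
from scipy import stats

# Statistics kept from the scipy.stats.linregress result (assumed selection).
LINREG_PARAMS = ["slope", "intercept", "rvalue", "pvalue", "stderr"]


def _linregress_1d(y):
    """Regress a 1-D series against its index positions."""
    x = np.arange(y.size)
    res = stats.linregress(x, y)
    return np.array([res.slope, res.intercept, res.rvalue, res.pvalue, res.stderr])


def linregress_over_dim(da, dim="time"):
    """Hypothetical helper: apply linregress along `dim` at every other point."""
    out = xr.apply_ufunc(
        _linregress_1d,
        da,
        input_core_dims=[[dim]],
        output_core_dims=[["linreg_param"]],
        vectorize=True,  # loop the 1-D function over the remaining dimensions
    )
    return out.assign_coords(linreg_param=LINREG_PARAMS)


# Two sites with known trends of 2.0 and -0.5 per time step.
steps = np.arange(30)
da = xr.DataArray(
    np.stack([2.0 * steps + 1.0, -0.5 * steps + 4.0]),
    dims=("site", "time"),
)
trend = linregress_over_dim(da)
```

The regression statistics end up along a new dimension, so the slope can be selected with `trend.sel(linreg_param="slope")`.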
vindelico authored Dec 19, 2023
2 parents acb0f73 + 199a796 commit 9bb7d56
Showing 9 changed files with 831 additions and 239 deletions.
5 changes: 4 additions & 1 deletion CHANGES.rst
@@ -4,14 +4,16 @@ Changelog

v0.8.0 (unreleased)
-------------------
Contributors to this version: Gabriel Rondeau-Genesse (:user:`RondeauG`), Pascal Bourgault (:user:`aulemahal`), Juliette Lavoie (:user:`juliettelavoie`), Sarah-Claude Bourdeau-Goulet (:user:`sarahclaude`), Trevor James Smith (:user:`Zeitsperre`).
Contributors to this version: Gabriel Rondeau-Genesse (:user:`RondeauG`), Pascal Bourgault (:user:`aulemahal`), Juliette Lavoie (:user:`juliettelavoie`), Sarah-Claude Bourdeau-Goulet (:user:`sarahclaude`), Trevor James Smith (:user:`Zeitsperre`), Marco Braun (:user:`vindelico`).

Announcements
^^^^^^^^^^^^^
* `xscen` now adheres to PEPs 517/518/621 using the `setuptools` and `setuptools-scm` backend for building and packaging. (:pull:`292`).

New features and enhancements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* New function ``xscen.indicators.select_inds_for_avail_vars`` to filter the indicators that can be calculated with the variables available in a ``xarray.Dataset``. (:pull:`291`).
* Replaced aggregation function ``climatological_mean()`` with ``climatological_op()`` offering more types of operations to aggregate over climatological periods. (:pull:`290`)
* Added the ability to search for simulations that reach a given warming level. (:pull:`251`).
* ``xs.spatial_mean`` now accepts the ``region="global"`` keyword to perform a global average (:issue:`94`, :pull:`260`).
* ``xs.spatial_mean`` with ``method='xESMF'`` will also automatically segmentize polygons (down to a 1° resolution) to ensure a correct average (:pull:`260`).
@@ -25,6 +27,7 @@ New features and enhancements

Breaking changes
^^^^^^^^^^^^^^^^
* ``climatological_mean()`` has been replaced with ``climatological_op()`` and will be abandoned in a future version. (:pull:`290`)
* ``experiment_weights`` argument in ``generate_weights`` was renamed to ``balance_experiments``. (:pull:`252`).
* New argument ``attribute_weights`` to ``generate_weights`` to allow for custom weights. (:pull:`252`).
* For a sequence of models, the output of ``xs.get_warming_level`` is now a list. Revert to a dictionary with ``output='selected'`` (:pull:`270`).
38 changes: 27 additions & 11 deletions docs/notebooks/2_getting_started.ipynb
@@ -14,7 +14,7 @@
"- `regrid_dataset` to regrid all data to a common grid.\n",
"- `train` and `adjust` to bias correct the raw simulations.\n",
"- `compute_indicators` to compute a list of indicators.\n",
"- `climatological_mean` and `spatial_mean` for spatio-temporal aggregation.\n",
"- `climatological_op` and `spatial_mean` for spatio-temporal aggregation.\n",
"- `compute_deltas` to compute deltas.\n",
"- `ensemble_stats` for ensemble statistics.\n",
"- `clean_up` for minor adjustments that have to be made in preparation for the final product.\n",
@@ -1014,18 +1014,25 @@
"source": [
"## Spatio-temporal aggregation\n",
"\n",
"### Climatological mean\n",
"### Climatological operations\n",
"\n",
"`xs.climatological_mean` is used to compute a *n*-year average over `ds.time.dt.year`.\n",
"`xs.climatological_op` is used to perform *n*-year operations over `ds.time.dt.year`. \n",
"\n",
"**NOTE:** The aggregation is over *year*, __*not* over *time*__. For example, if given monthly data, the climatological average will be computed separately for January, February, etc. This means that the data should already be aggregated at the required frequency, for example using `xs.compute_indicators` to compute yearly, seasonal, or monthly indicators.\n",
"**NOTE:** The aggregation is over *year*, __*not* over *time*__. For example, if given monthly data, the climatological operation will be computed separately for January, February, etc. This means that the data should already be aggregated at the required frequency, for example using `xs.compute_indicators` to compute yearly, seasonal, or monthly indicators.\n",
"\n",
"The optional arguments are as follow:\n",
"The function call requires a `xr.Dataset` and argument `op` specifies the operation to perform. It can be any of the following:\n",
"`['max', 'mean', 'median', 'min', 'std', 'sum', 'var', 'linregress']`.\n",
"\n",
"The optional arguments are as follows:\n",
"\n",
"- `window` indicates how many year to use for the average. Uses all available years by default.\n",
"- `min_period` minimum number of years required for a value to be computed durring the `rolling` operation.\n",
"- `interval` indicates the interval (in years) at which to provide an output.\n",
"- `periods` is a list of [start, end] of continuous periods to be considered."
"- `stride` indicates the stride (in years) at which to provide an output.\n",
"- `periods` is a list of [start, end] of continuous periods to be considered.\n",
"\n",
"Additional arguments allow to control the output of the function by automatically renaming variables to reflect the operation performed, restructuring the output dataset and setting the `to_level` attribute.\n",
"\n",
"In the following example, we will use `op='mean'`."
]
},
{
@@ -1040,8 +1047,14 @@
"ds_dict = pcat.search(processing_level=\"indicators\").to_dataset_dict()\n",
"\n",
"for key, ds in ds_dict.items():\n",
" ds_mean = xs.climatological_mean(\n",
" ds=ds, window=30, interval=10, to_level=\"30yr-climatology\"\n",
" ds_mean = xs.climatological_op(\n",
" ds=ds,\n",
" op=\"mean\",\n",
" window=30,\n",
" stride=10,\n",
" rename_variables=False,\n",
" to_level=\"30yr-climatology\",\n",
" horizons_as_dim=False,\n",
" )\n",
"\n",
" # Save to zarr\n",
@@ -1072,11 +1085,14 @@
"source": [
"#### Horizon coordinate and time dimension\n",
"\n",
"Even if no `interval` is called, `xs.climatological_mean` will substantially change the nature of the `time` dimension, because it now represents an aggregation over time. While no standards exist on how to reflect that in a dataset, the following was chosen for `xscen`:\n",
"Even if no `stride` is called, `xs.climatological_op` will substantially change the nature of the `time` dimension, because it now represents an aggregation over time. While no standards exist on how to reflect that in a dataset, the following was chosen for `xscen`:\n",
"\n",
"- `time` corresponds to the first timestep of each temporal average.\n",
"- `horizon` is a new coordinate that either follows the format YYYY-YYYY or a warming-level specific nomenclature.\n",
"- The `cat:frequency` and `cat:xrfreq` attributes remain unchanged."
"- The `cat:frequency` and `cat:xrfreq` attributes remain unchanged.\n",
"\n",
"\n",
"Alternatively, setting the `horizons_as_dim` argument to *True* will rearrange the dataset with a new dimension `horizon` and a dimension named according to the temporal aggregation when it is `month` or `season`, but omitting the singleton dimension `year`. The time stamps are conserved in the `time` coordinate as an array with those new dimensions."
]
},
{
8 changes: 4 additions & 4 deletions docs/notebooks/6_config.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -248,7 +248,7 @@
"id": "85b36803-29b8-4993-add3-3c69e4e4a750",
"metadata": {},
"source": [
"Let's test that it is working, using `climatological_mean`:"
"Let's test that it is working, using `climatological_op`:"
]
},
{
@@ -261,7 +261,7 @@
"outputs": [],
"source": [
"# We should obtain 30-year means separated in 10-year intervals.\n",
"CONFIG[\"aggregate\"][\"climatological_mean\"]"
"CONFIG[\"aggregate\"][\"climatological_op\"]"
]
},
{
@@ -282,8 +282,8 @@
"da.name = \"test\"\n",
"ds = da.to_dataset()\n",
"\n",
"# Call climatological_mean using no argument other than what's in CONFIG\n",
"print(xs.climatological_mean(ds))"
"# Call climatological_op using no argument other than what's in CONFIG\n",
"print(xs.climatological_op(ds))"
]
},
{
6 changes: 4 additions & 2 deletions templates/1-basic_workflow_with_config/config1.yml
Original file line number Diff line number Diff line change
Expand Up @@ -473,11 +473,13 @@ aggregate:
processing_level: indicators
delta:
processing_level: climatology
climatological_mean: # automatically passed to the function
climatological_op: # automatically passed to the function
op: mean
window: 30
interval: 10
stride: 10
periods: [['1951', '2100']]
to_level: climatology
#periods_as_dim: True
#min_periods:
compute_deltas: # automatically passed to the function
kind: "+"