Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intake conversion TemperatureSalinityDiagrams #371

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

rbeucher
Copy link
Contributor

@rbeucher rbeucher commented Jun 6, 2024

Following the discussion in issue #313, we propose converting the recipes to use Intake, given that the Cookbook is no longer supported and the ACCESS-NRI Intake catalog is now available.

A few months ago, @max-anu began working on this transition. This pull request contains the changes @max-anu made to the notebook specified in the title.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link
Contributor

@marc-white marc-white left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just discovered GitHub doesn't let me comment directly on Jupyter notebooks :( But without having run this through to check it, I can see a couple of issues:

  • The Client() call has picked up a hardcoded TCP address - I don't think that should be there;
  • All of the load_* functions set up an array called darray, rename this array by reference to something else halfway through (e.g., salt in load_salinity), do some more operations on the renamed object, and then return it. This is probably an attempt to make it clear that we're making an array that represents some value, but should this be changed so we just use one variable name throughout to avoid confusion? (We can always comment that we're making an array inside those functions.)
  • There's a stray %tb (traceback) call right before we go to calculate and plot the bins - is this showing an actual error traceback, or is it a development remnant?

@marc-white
Copy link
Contributor

Further issues:

  • The start of the MOM5 block in the notebook does some things to define a variable salt, but this then gets overridden by calling the load_salinity function. I'm assuming that the first iteration of salt is setting up the array slicing - if so, we should give it a different name to avoid confusion (but I'll check that first).
  • The MOM6 catalog being looked for is not available (panant-01-zstar-v13). Will need to track down what the catalog is meant to be.

@anton-seaice
Copy link
Collaborator

Thanks Marc!

  1. If you click the ReviewNB button - can you make comments in the ReviewNB app ?
  2. Feel free to just push the changes to this branch.
  3. There is an open issue to add some MOM6 standalone runs to the catalog. It doesn't include panant-01-zstar-v13 at this point ... For this notebook one of the other runs listed there might be equivalent.

@marc-white
Copy link
Contributor

@anton-seaice I'll have a go at ReviewNB. I hit the button for it, then had my usual reaction of disgust when I found I'd need to create an account and bailed 😝

Roger, will make changes to the branch once I've worked out what's needed.

@marc-white
Copy link
Contributor

Do we normally leave outputs in the committed version of the notebook, or do we try to make it 'clean'?

@edoddridge
Copy link
Collaborator

We leave the outputs so that they render nicely in the read the docs website, e.g. https://cosima-recipes.readthedocs.io/en/latest/Recipes/Along-slope-velocities.html

@navidcy
Copy link
Collaborator

navidcy commented Aug 19, 2024

Also ensure all the cells run sequentially before merging or when reviewing

@@ -30,9 +30,11 @@
"outputs": [],
Copy link
Contributor

@marc-white marc-white Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #8.        salt = darray

Is there a point to this renaming beyond trying to make it clear we get an array out of the catalog query, and that array represents the data points we want to look at?

It might be clearer to keep the name as darray until the call to gsw_SA_from_SP, which I think is the more natural place to make the change.


Reply via ReviewNB

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

happy either way!

@@ -30,9 +30,11 @@
"outputs": [],
Copy link
Contributor

@marc-white marc-white Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #8.        temp = darray

See comment above re: salt


Reply via ReviewNB

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly happy!

@@ -30,9 +30,11 @@
"outputs": [],
Copy link
Contributor

@marc-white marc-white Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #9.        area = darray

See comment above re: salt


Reply via ReviewNB

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do whatever feels natural

@@ -30,9 +30,11 @@
"outputs": [],
Copy link
Contributor

@marc-white marc-white Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #1.    # todo delete:

Exactly what's todo here?


Reply via ReviewNB

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure, if nothing is clear then delete the comment!

@@ -30,9 +30,11 @@
"outputs": [],
Copy link
Contributor

@marc-white marc-white Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a stray debugging thing that got committed - should be removed.


Reply via ReviewNB

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove it!

@@ -30,9 +30,11 @@
"outputs": [],
Copy link
Contributor

@marc-white marc-white Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block doesn't work - the key it looks for (panant-01-zstar-v13) doesn't exist.


Reply via ReviewNB

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marc-white
Copy link
Contributor

@dougiesquire I've been playing with the Intake catalog to try and find some replacement MOM6 data, and I've hit an interesting snag.

The only way I could think of to locate MOM6 data was to search for catalogs that contain the MOM6 keywords I know from the notebook:

In: catalog.search(variable=['thetao', 'so', 'areacello'], frequency="1mon").keys()
Out: ['cmip5_al33', 'cmip6_oi10', 'cmip6_fs38', 'cmip5_rr3']

OK, so I'll use one of those... except that it crashes out the functions in the notebook that load up the salinity, temperature, etc. data. The reason is because the MOM5 catalog used can be searched for variable; however, the MOM6 catalogs seem to be searchable by variable_id instead:

In: catalog["01deg_jra55v13_ryf9091"].search(variable="salt")
Out: 01deg_jra55v13_ryf9091 catalog with 5 dataset(s) from 1593 asset(s): ...

In: catalog["cmip6_fs38"].search(variable_id="thetao")
Out: cmip6-fs38 catalog with 1210 dataset(s) from 14184 asset(s): ...

Is this expected/desired behaviour? I'm slightly worried that different catalogs seem to have a different kwarg to search for variable names... although the top-level catalog search managed to find the right sub-catalogs using the kwarg variable, so I'm guessing there must be some layer that interprets this difference?

@dougiesquire
Copy link
Collaborator

Hi @marc-white. Yup, that's expected. The ACCESS-NRI Intake catalog has the ability to include Intake-ESM datastores created by others (for example the NCI-managed CMIP6 datastores that you've ended up at). The columns in these datastores are not necessarily (in fact are usually not) the same as those in the catalog or in the datastores generated by access_nri_intake. To include these in the catalog a translation is done from the datastore columns to the catalog columns, so that datastores can be searched for in the same way.

There're some details about this here and I'm happy to also discuss in person.

Regarding this PR, the data this notebook is after is not (yet) in the catalog. There's some related discussion here about adding COSIMA MOM6 experiments, but no one has done it yet. Is this something you could take on?

@marc-white
Copy link
Contributor

Regarding this PR, the data this notebook is after is not (yet) in the catalog. There's some related discussion here about adding COSIMA MOM6 experiments, but no one has done it yet. Is this something you could take on?

Yes, I can take a look at that.

@navidcy navidcy changed the title Intake conversion TemperatureSalinityDiagrams_mom5_mom6 Intake conversion TemperatureSalinityDiagrams Aug 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

7 participants