Change the cupid-run API? #63

Open
mnlevy1981 opened this issue Feb 14, 2024 · 4 comments
@mnlevy1981
Collaborator

It would be nice to break up examples/coupled_model/config.yml into many smaller files, and then have a mechanism to pass multiple files to cupid-run. I see two big advantages to this:

  1. It makes it much easier to run a single notebook instead of every notebook in nblibrary/
  2. We could drop the coupled_model/ directory altogether, and just have examples/ contain a bunch of YAML files

I'm not sure what we would want to do about keys in config.yaml that apply to all notebooks (data_sources, computation_config, global_params) or the Jupyter Book generation. Maybe we could keep config.yaml for global settings, and then come up with a naming convention for the per-component files. If they are all cupid-[component].yaml, then something like

$ cupid-run --global-config config.yaml cupid-*.yaml

would be equivalent to the current cupid-run config.yaml, while $ cupid-run --global-config config.yaml cupid-atm.yaml would run only the atmosphere notebook. (Maybe config.yaml could be the default value for --global-config, so it can be omitted most of the time?)

I like the general layout of this new API, though every option / file name mentioned was "first thing that came to mind" and could likely be improved upon with a little thought.
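To make the proposal concrete, here is a minimal sketch of how the command line could be parsed and how the files could be combined. Everything in it is an assumption for illustration: it uses argparse rather than whatever cupid-run uses today, the helper names parse_args and merge_configs are hypothetical, and merge_configs works on dictionaries that have already been loaded from YAML.

```python
import argparse


def parse_args(argv=None):
    """Parse a hypothetical cupid-run command line (option names are assumptions)."""
    parser = argparse.ArgumentParser(prog="cupid-run")
    parser.add_argument(
        "--global-config",
        default="config.yaml",  # assumed default so the flag can usually be omitted
        help="file holding settings shared by all notebooks",
    )
    parser.add_argument(
        "component_configs",
        nargs="+",
        help="one or more per-component files, e.g. cupid-atm.yaml",
    )
    return parser.parse_args(argv)


def merge_configs(global_cfg, component_cfgs):
    """Combine already-loaded dicts into a single config dictionary.

    Global keys (data_sources, computation_config, global_params, ...) come
    from global_cfg; the compute_notebooks entries of every component file
    are gathered under a single compute_notebooks key.
    """
    merged = dict(global_cfg)
    notebooks = dict(global_cfg.get("compute_notebooks", {}))
    for cfg in component_cfgs:
        notebooks.update(cfg.get("compute_notebooks", {}))
    merged["compute_notebooks"] = notebooks
    return merged
```

With this layout the shell expands cupid-*.yaml before cupid-run sees it, so running everything and running a single component go through the same code path.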

@mnlevy1981
Collaborator Author

Another suggestion was to add another layer to the compute_notebooks dictionary where we break the notebooks into groups by components:

compute_notebooks:

  # This is where all the notebooks you want to run and their
  # parameters are specified. Several examples of different
  # types of notebooks are provided.

  # The first key (here index) is the name of the
  # notebook from nb_path_root, minus the .ipynb
  index:
    parameter_groups:
      none: {}

  atmosphere:
    adf_quick_run:
      parameter_groups:
        none:
          adf_path: ../../externals/ADF
          config_path: .
          config_fil_str: "config_f.cam6_3_119.FLTHIST_ne30.r328_gamma0.33_soae.001.yaml"

  ocean:
    ocean_surface:
      parameter_groups:
        none:
          Case: b.e23_alpha16b.BLT1850.ne30_t232.054
          savefigs: False
          mom6_tools_config:
            start_date: '0091-01-01'
            end_date: '0101-01-01'
            Fnames:
              native: 'mom6.h.native.????-??.nc'
              static: 'mom6.h.static.nc'
            oce_cat: /glade/u/home/gmarques/libs/oce-catalogs/reference-datasets.yml

  land:
    land_comparison:
      parameter_groups:
        none:
          cases:
            - ctsm51d159_f45_GSWP3_bgccrop_1850pAD
            - ctsm51d159_f45_GSWP3_bgccrop_1850pSASU
          type:
            - 1850pAD
            - 1850pSASU

and then have a way to specify which component or components to run (the default being "all"). If we adopt this, we may want to use the CESM convention of atm, ocn, lnd, etc.
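A rough sketch of how the component selection could work on the nested dictionary above. The function name select_notebooks is hypothetical, and it assumes that an entry containing parameter_groups directly (like index) is an ungrouped notebook that always runs; that behavior is a guess, not something decided in this issue.

```python
def select_notebooks(compute_notebooks, components="all"):
    """Return {notebook_name: settings} for the requested component groups.

    components is either the string "all" or an iterable of group names
    such as ["ocean", "land"].
    """
    wanted = None if components == "all" else set(components)
    selected = {}
    for key, value in compute_notebooks.items():
        if "parameter_groups" in value:
            # Ungrouped notebook such as `index` (assumed to always run).
            selected[key] = value
        elif wanted is None or key in wanted:
            # `key` is a component group; include every notebook under it.
            selected.update(value)
    return selected
```

For example, select_notebooks(cfg["compute_notebooks"], ["ocean"]) would return index and ocean_surface for the YAML above, while the default returns every notebook.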

@mnlevy1981
Collaborator Author

@rmshkv -- for now let's just add the additional layer of keys; at some point we want to add a mechanism to make it easy to run a subset of notebooks, but we don't need to figure that out right now

@mnlevy1981
Collaborator Author

Oh, something else I was thinking about... should we mirror the keys under compute_notebooks as subdirectories in nblibrary/ (e.g. put all the ocean notebooks in examples/nblibrary/ocean/), and then default to running all the notebooks in a given directory rather than requiring them to be listed individually in the YAML file? I think we want the mom6_tools_config under ocean rather than repeated for each notebook, and global_params and component_params might be sufficient to pass the proper parameters to every ocean (or land or atmosphere or sea ice) notebook. This would make it easier to add new notebooks to CUPiD: just drop the notebook in the right subdirectory, with no need to modify config.yaml at all.
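As a sketch of what directory-based discovery might look like: the function below walks the component subdirectories of nblibrary/, picks up every .ipynb file, and gives each notebook the global parameters overlaid with its component's parameters. The name discover_notebooks, the shape of the returned dictionary, and the rule that component_params override global_params are all assumptions for illustration.

```python
from pathlib import Path


def discover_notebooks(nblibrary, global_params=None, component_params=None):
    """Map each notebook found under nblibrary/<component>/ to its parameters.

    component_params is assumed to look like {"ocean": {...}, "land": {...}}.
    """
    global_params = global_params or {}
    component_params = component_params or {}
    found = {}
    for comp_dir in sorted(Path(nblibrary).iterdir()):
        # Skip plain files and hidden directories such as .ipynb_checkpoints.
        if not comp_dir.is_dir() or comp_dir.name.startswith("."):
            continue
        params = {**global_params, **component_params.get(comp_dir.name, {})}
        for nb in sorted(comp_dir.glob("*.ipynb")):
            found[nb.stem] = {
                "component": comp_dir.name,
                "parameters": dict(params),  # separate copy per notebook
            }
    return found
```

Under this scheme, adding a notebook only means dropping it into nblibrary/ocean/ (or the relevant component directory); it would inherit mom6_tools_config from the ocean entry of component_params.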

@rmshkv
Contributor

rmshkv commented Feb 27, 2024

That's a good idea! I'm not sure whether any component diagnostics have enough notebook-specific parameters (like different regions to calculate things over, etc.) to make putting everything in component_params inconvenient. But we could find a way to address that if it arises. I'll play around with setting it up that way!
