Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding CDEPS support #522

Merged
merged 73 commits into from
Jul 23, 2024
Merged

Conversation

uturuncoglu
Copy link
Contributor

@uturuncoglu uturuncoglu commented Jul 2, 2024

Synopsis

This PR aims to bring support for CDEPS data model driver. The new driver has CLI support at this point and can be tested with following dummy config file (partly represent realistic case and also fake parameters for data ocean) and ninja templates that can be found in the following Gist repository.

https://gist.github.com/uturuncoglu/b47e084f52698c75e6223b3f9608e526

Once those files are placed, please follow the following steps to generate namelist and stream input files (the path could be different in your end),

module load miniconda3/24.3.0
conda init
bash - need to open new shell to have changes - the base env will be activated
conda create --prefix myenv
conda activate /work2/noaa/nems/tufuk/COASTAL/ufs-coastal-app/myenv
conda install maddenp::condev
cd sorc/uwtools
export PATH=/work2/noaa/nems/tufuk/COASTAL/ufs-coastal-app/myenv/bin:$PATH
make devshell
rm datm_in datm.streams; uw cdeps atm --cycle 20240201T18:00:00 --config-file config.yaml             
rm docn_in docn.streams; uw cdeps ocn --cycle 20240201T18:00:00 --config-file config.yaml 

Type

  • Bug fix (corrects a known issue)
  • Code maintenance (refactoring, etc. without behavior change)
  • Documentation
  • Enhancement (adds new functionality)
  • Tooling (CI, code-quality, packaging, revision-control, etc.)

Impact

  • This is a breaking change (changes existing functionality)
  • This is a non-breaking change (existing functionality continues to work as expected)

Checklist

  • I have added myself and any co-authors to the PR's Assignees list.
  • I have reviewed the documentation and have made any updates necessitated by this change.

@uturuncoglu uturuncoglu changed the title Feature/cdeps Adding CDEPS support Jul 2, 2024
@uturuncoglu
Copy link
Contributor Author

I might also need to add something to the documentation but I'll check it later.

@maddenp-noaa
Copy link
Contributor

Hi @uturuncoglu -- Thank you for this excellent contribution. I know you talked with @christinaholtNOAA about this: She and I also just spoke, and think a good way forward would be for me to add some updates to this PR adding required tests and docs, and to bring things up-to-date with some recent changes on our main branch. If you're ok with that, I'll do that work and then target this PR's branch with a PR of my own. If those changes look good to you and we merge my PR onto yours, yours should then be ready for final review and a merge onto uwtools main. Please let me know if you're ok with that plan.

@uturuncoglu
Copy link
Contributor Author

@maddenp-noaa It was impossible without your help. Thanks again. Yes. Your plan looks good to me. Let me know if you need help from my side. I think you could reach to junja2 template etc. from the Gist link that you could find in the description but let me know if you need any help or information. BTW, I saw that you already implemented SCHSIM and also WW3 drivers but those are just accessible via API, is there any plan to make them accessible via CLI too. I only test the CDEPS with CLI but maybe something needs to be added to make it available also through the API. Anyway, let me know what you think?

@maddenp-noaa
Copy link
Contributor

@uturuncoglu Could you please add a file docs/shares/cdeps.yaml containing a sample UW YAML configuration file for the new CDEPS driver? You can make it somewhat generic if you like (e.g. change real filesystem paths to generic names like /path/to/some.nml), but something as complete as possible will be helpful. We'll include it in the documentation, similar to what you see here where it says

Here is a prototype UW YAML fv3: block

and we can use it for manually testing against the JSON Schema definition. If you could add it to your branch and push it, I can pick it up and work with it. Thanks in advance.

@uturuncoglu
Copy link
Contributor Author

@maddenp-noaa I included the simple file for CDEPS and also add missing namelist option for CDEPS data ocean. I might have some syntax issue in src/uwtools/resources/jsonschema/cdeps.jsonschema. It is working in my end but I think UWtools CI test is complaining about it. So, if you let me know about the solution, I could do it. BTW, the cdeps yaml files points a Jinja template. Do you also want to add it to PR? Anyway, thanks for your help.

@uturuncoglu
Copy link
Contributor Author

cdeps:
  atm_in:
    update_values:
      datm_nml:
        datamode: ATMMESH
        factorfn_data: "null"
        factorfn_mesh: "null"
        flds_co2: false
        flds_presaero: false
        flds_wiso: false
        iradsw: 1
        model_maskfile: "INPUT/irene_roms_grid_rho_ESMFmesh.nc"
        model_meshfile: "INPUT/irene_roms_grid_rho_ESMFmesh.nc"
        nx_global: 242
        ny_global: 106
        restfilm: "null"
        export_all: true
  ocn_in:
    update_values:
      docn_nml:
        datamode: some_supported_mode2
        model_maskfile: 'path/to/mesh2.nc'
        model_meshfile: "path/to/mesh2.nc"
        nx_global: 150
        ny_global: 150
        restfilm: "null"
        sst_constant_value: -1.0
        skip_restart_read: true
        import_data_fields: "none"
  atm_streams:
    streams:
      stream01:
        taxmode: limit 
        mapalgo: redist
        tinterpalgo: linear
        readmode: single
        dtlimit: 1.5
        stream_offset: 0
        yearFirst: 2008
        yearLast: 2008
        yearAlign: 2008
        stream_vectors: ['u', 'v']
        stream_mesh_file: mesh.nc
        stream_lev_dimname: "null"
        stream_data_files: [ 'a.nc' ]
        stream_data_variables: ['temperature Sa_tbot','specific_humidity Sa_shum' ]
    template_file: stream.jinja2
  ocn_streams:
    streams:
      stream01:
        taxmode: limit
        mapalgo: redist
        tinterpalgo: linear
        readmode: single
        dtlimit: 1.5
        stream_offset: 0
        yearFirst: 2008
        yearLast: 2008
        yearAlign: 2008
        stream_vectors: [ 'null' ] 
        stream_mesh_file: mesh.nc
        stream_lev_dimname: "null"
        stream_data_files: ['b.nc', 'c.nc']
        stream_data_variables: ['sst So_t', 'mask So_omask']
    template_file: stream.jinja2
  rundir: /work2/noaa/nems/tufuk/COASTAL/ufs-coastal-app/sorc/uwtools
platform:
  account: me
  scheduler: slurm

@uturuncoglu
Copy link
Contributor Author

@maddenp-noaa BTW, I need to update stream_data_variables part of the docs/shared/cdeps.yaml since that maps variables from file to CDEPS internal variables. That is just a minor issue but my config.yaml has that definition for the variables. I don't think it is cause of the issue since they are string in any way.

@maddenp-noaa
Copy link
Contributor

I think I see the issue already, though: The template contains expressions like streams.items() which expect a variable called streams to be exposed, but the code is currently passing in the value under the streams key. I could reproduce your error locally, and fixed it with this change:

(DEV-uwtools) ~/git/uwtools $ git diff -U0
diff --git a/src/uwtools/drivers/cdeps.py b/src/uwtools/drivers/cdeps.py
index 9dc7901..c833db6 100644
--- a/src/uwtools/drivers/cdeps.py
+++ b/src/uwtools/drivers/cdeps.py
@@ -139 +139 @@ class CDEPS(AssetsCycleBased):
-            values_src=self._driver_config[group]["streams"],
+            values_src=self._driver_config[group],

i.e. on line 139 of cdeps.py, remove ["streams"], which will result a bigger block of data being passed to Jinja2.

I don't think there's a problem with that, but it does mean that information that Jinja2 doesn't need will be exposed to it (the template_file key and value). I don't see a problem with that, so if you're happy with it, this could be the fix. Alternatively, you could maybe change the Jinja2 code so that it expected something like {"stream01": ..., "stream02": ..., etc.} instead of {"streams": {"stream01": ..., "stream02": ..., etc.}}. You can decide what's best.

@maddenp-noaa
Copy link
Contributor

And yes, please, make any appropriate changes to docs/shared/cdeps.yaml! It should continue to pass using uw cdeps validate command I gave earlier; if there are issues where the schema needs to be updated to validate correct YAML and you need a hand, please let me know.

@uturuncoglu
Copy link
Contributor Author

@maddenp-noaa Thanks. It fixed in cdeps.py solved the issue. I'll push the fix along with the change in the template soon.

@maddenp-noaa
Copy link
Contributor

If we remove that ["streams"] selector from cdeps.py, I think we'll also need to update some unit tests that will start failing. I can help with that if you get to a point where the driver is working as you like.

@maddenp-noaa
Copy link
Contributor

Here's the fix patch for the unit tests:

(DEV-uwtools) ~/git/uwtools $ git diff -U0
diff --git a/src/uwtools/drivers/cdeps.py b/src/uwtools/drivers/cdeps.py
index 9dc7901..c833db6 100644
--- a/src/uwtools/drivers/cdeps.py
+++ b/src/uwtools/drivers/cdeps.py
@@ -139 +139 @@ class CDEPS(AssetsCycleBased):
-            values_src=self._driver_config[group]["streams"],
+            values_src=self._driver_config[group],
diff --git a/src/uwtools/tests/drivers/test_cdeps.py b/src/uwtools/tests/drivers/test_cdeps.py
index 7db9154..66e5b9e 100644
--- a/src/uwtools/tests/drivers/test_cdeps.py
+++ b/src/uwtools/tests/drivers/test_cdeps.py
@@ -80,14 +80,14 @@ def test_CDEP_streams(driverobj, group):
-    {{ stream01.dtlimit }}
-    {{ stream01.mapalgo }}
-    {{ stream01.readmode }}
-    {{ " ".join(stream01.stream_data_files) }}
-    {{ " ".join(stream01.stream_data_variables) }}
-    {{ stream01.stream_lev_dimname }}
-    {{ stream01.stream_mesh_file }}
-    {{ stream01.stream_offset }}
-    {{ " ".join(stream01.stream_vectors) }}
-    {{ stream01.taxmode }}
-    {{ stream01.tinterpalgo }}
-    {{ stream01.yearAlign }}
-    {{ stream01.yearFirst }}
-    {{ stream01.yearLast }}
+    {{ streams.stream01.dtlimit }}
+    {{ streams.stream01.mapalgo }}
+    {{ streams.stream01.readmode }}
+    {{ " ".join(streams.stream01.stream_data_files) }}
+    {{ " ".join(streams.stream01.stream_data_variables) }}
+    {{ streams.stream01.stream_lev_dimname }}
+    {{ streams.stream01.stream_mesh_file }}
+    {{ streams.stream01.stream_offset }}
+    {{ " ".join(streams.stream01.stream_vectors) }}
+    {{ streams.stream01.taxmode }}
+    {{ streams.stream01.tinterpalgo }}
+    {{ streams.stream01.yearAlign }}
+    {{ streams.stream01.yearFirst }}
+    {{ streams.stream01.yearLast }}
@@ -145 +145 @@ def test_CDEPS__model_stream_file(driverobj):
-            values_src=driverobj._driver_config[group]["streams"],
+            values_src=driverobj._driver_config[group],

@uturuncoglu
Copy link
Contributor Author

@maddenp-noaa Okay. Working on it. I think I also found a bug in the ninja template. Let me fix that one too. Thanks again

@uturuncoglu
Copy link
Contributor Author

Okay. I pushed the changes. The validate command is fine and no issue. But I am not sure about the cdeps test ninja template. In my case, I change the following part of the ninja template from

{% if val['stream_data_files'] is string -%}
stream_data_files{{'%02d' % loop.index}}: "{{ val['stream_data_files'] }}"
{% else -%}
stream_data_files{{'%02d' % loop.index}}: ""{{ val['stream_data_files'] | join('" "') }}""
{% endif -%}

to

{% if val['stream_data_files'] | length > 1 -%}
stream_data_files{{'%02d' % loop.index}}: ""{{ val['stream_data_files'] | join('" "') }}""
{% else -%}
stream_data_files{{'%02d' % loop.index}}: "{{ val['stream_data_files'] | first }}"
{% endif -%}

to produce the namelist stream_data_files correctly. I am not sure how this needs to be reflected in the testing since it is creating template on the fly.

@maddenp-noaa
Copy link
Contributor

The unit test for the streams file just ensures that the expected data items are passed to Jinja2, and that a template file that contains references to those data items is rendered correctly. I think that's as far as we can go with the unit tests: It's up to the application to provide a Jinja2 template file that accepts those data items and produces the kind of streams file that the application requires. Our assumption, as expressed by the unit test, is that if the YAML config specifies a Jinja2 template_file that uses the keys defined under a stream?? block, then the cdeps driver will successfully render that template using those values. This leaves the application freedom to define the Jinja2 template the way it needs. Does that approach work as a solution for the Costal App? If not, can your share your concerns?

@uturuncoglu
Copy link
Contributor Author

@maddenp-noaa I think it is fine to UFS Coastal since the config file that I am using is fine. I just want to be sure that the exiting unit test is representative of the realistic case. Is there any way for me to run the unit test for cdeps implementation and check the generated namelist files.

@maddenp-noaa
Copy link
Contributor

I'm sure the unit test is not representative of the realistic use case, at least in terms of values. Here's is a namelist file created by the unit test:

&datm_nml
    anomaly_forcing = 'string'
    bias_correct = 'string'
    datamode = 'GFS'
    export_all = .true.
    factorfn_data = 'string'
    factorfn_mesh = 'string'
    flds_co2 = .true.
    flds_presaero = .true.
    flds_presndep = .true.
    flds_wiso = .true.
    flds_preso3 = .true.
    iradsw = 1
    model_maskfile = 'string'
    model_meshfile = 'string'
    nx_global = 1
    ny_global = 1
    restfilm = 'string'
    skip_restart_read = .true.
/

The namelist name, and its variable names, and the types of the values should be correct, though.

You can temporarily make this change to the test code to observe this:

$ git diff
diff --git a/src/uwtools/tests/drivers/test_cdeps.py b/src/uwtools/tests/drivers/test_cdeps.py
index 66e5b9e..5196ac9 100644
--- a/src/uwtools/tests/drivers/test_cdeps.py
+++ b/src/uwtools/tests/drivers/test_cdeps.py
@@ -53,6 +53,7 @@ def test_CDEPS_nml(caplog, driverobj, group):
     path = Path(refs(task()))
     assert dst.is_file()
     assert logged(caplog, f"Wrote config to {path}")
+    print("@@@", dst) ; 1/0
     assert isinstance(f90nml.read(dst), f90nml.Namelist)

Then run make unittest 2>&1 | grep @@@ and you'll see the paths to the namelist files created by the tests, which you can then look at. (They'll be removed automatically by pytest during some future run.)

@uturuncoglu
Copy link
Contributor Author

@maddenp-noaa Okay. Do you have also stream namelist output? BTW, when I try to run unites I am getting following error,

ERROR: usage: pytest [options] [file_or_dir] [file_or_dir] [...]
pytest: error: unrecognized arguments: --cov=uwtools -n
  inifile: /work2/noaa/nems/tufuk/COASTAL/ufs-coastal-app/sorc/uwtools/src/pyproject.toml
  rootdir: /work2/noaa/nems/tufuk/COASTAL/ufs-coastal-app/sorc/uwtools/src

I am not sure it is a bug or problem in my side.

@maddenp-noaa
Copy link
Contributor

maddenp-noaa commented Jul 23, 2024

I'm not sure where that pytest error is coming from. I don't get that. You could try re-creating your uwtools development shell. Here is a demo of me doing that:

(DEV-uwtools) ~/git/uwtools $ exit
exit
(base) ~/git/uwtools $ make clean-devenv
 Remove all packages in environment /home/maddenp/sw/conda/envs/DEV-uwtools:
make: Nothing to be done for 'clean-devenv'.
(base) ~/git/uwtools $ make devshell
< lots of output >
(DEV-uwtools) ~/git/uwtools $

A streams config file created by the unit tests looks like this:

1.5
string
single
string string
string string
string
string
1
u v
string
string
1
1
1

But this isn't really relevant for you, I think, because the test uses a simple, unrealistic Jinja2 template (click this link to view) just to show that YAML config values correctly map to Jinja2 expression variables. You should use your own Jinja2 template file in a realistic uw cdeps run and view the streams file that it creates.

@uturuncoglu
Copy link
Contributor Author

@maddenp-noaa Okay. That makes sense. Thanks.

@maddenp-noaa
Copy link
Contributor

@uturuncoglu When you're comfortable with your testing, there are a few comments under the Files changed tab -- some offer suggestions that you can accept and apply right in the GitHub interface (click Commit suggestion). And I think there's one open-ended question. After suggestions are applied (if you agree with them), and when any conversations are marked resolved, we'll be able to approve and merge this. (You'll likely also need to do a final sync with the main repo, as it has been updated yet again -- we never stop!)

@uturuncoglu
Copy link
Contributor Author

@maddenp-noaa I sync again and merge the suggestions. Let me know if you need anything from my side.

Copy link
Contributor

@christinaholtNOAA christinaholtNOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @uturuncoglu and @maddenp-noaa. This is a really nice contribution.

@maddenp-noaa
Copy link
Contributor

I'll go ahead an merge this. The CI tests failed due to some unformatted code, but I will fix that in my next PR, no worries.

@maddenp-noaa maddenp-noaa merged commit 0ec7632 into ufs-community:main Jul 23, 2024
1 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants