Include ILAMB output in notebook and provide link to full diagnostic package #165

Open
wants to merge 30 commits into
base: main

Conversation

TeaganKing
Collaborator

@TeaganKing TeaganKing commented Jan 16, 2025

Description of changes:

  • Wrote script to generate ILAMB model_setup.txt file based on config.yml
  • Included link to ILAMB notebook which includes a link to ILAMB full output as well as a few 'key metric' plots from ILAMB directly in the jupyterbook
  • Included config files for BGC & SP cases (without hardcoded glade paths) in a parallel location to ADF config file templates.

@TeaganKing TeaganKing self-assigned this Jan 16, 2025
@TeaganKing TeaganKing requested a review from mnlevy1981 January 17, 2025 19:15
@TeaganKing TeaganKing added the lnd label Jan 17, 2025
@TeaganKing TeaganKing linked an issue Jan 17, 2025 that may be closed by this pull request
@TeaganKing
Collaborator Author

Note the linked issue also requires workflow updates to include ILAMB and choose BGC or SP.

@TeaganKing
Collaborator Author

TeaganKing commented Jan 21, 2025

  • Note to self: need to remove some printed statements in link_to_ILAMB.ipynb that are currently there for development purposes once we get ILAMB working with CLM6

@TeaganKing
Collaborator Author

TeaganKing commented Jan 22, 2025

Made some changes per discussion in CUPiD meeting today.

  • Added active layer thickness plot
  • A few other minor changes (e.g., removed the word 'link')

Still need to do the following:

  • Add overarching "score chart"
  • Remove glade filepaths, specify them in BGC cfg file, and add to helper script to generate these
  • Add into key metrics?

@TeaganKing
Collaborator Author

The overarching score chart is an HTML page rather than a static image, so I think it'd be difficult to reproduce as a static image within the CUPiD jupyterbook output. It is also the first thing that appears when you click on the link to the full ILAMB output, so I'm going to suggest we don't implement this as one of the plots that shows up automatically within the jupyterbook after all.

Also, for single runs, this is not the most interesting score chart...
[Screenshot, 2025-01-24 9:28:19 AM: ILAMB score chart for a single run]

It's really only useful for comparisons against other runs:
[Screenshot, 2025-01-24 9:30:44 AM: ILAMB score chart comparing multiple runs]

@mnlevy1981
Collaborator

mnlevy1981 commented Jan 24, 2025

For key_metrics, can we run ILAMB on both the new case and the baseline? And is that table useful if there are two columns? If so, I think you can embed one HTML page in another with something like this:

from IPython.display import HTML, display

display(
    HTML(
        '<iframe src="/path/to/table.html"></iframe>'
    )
)

But I agree that it's not very useful for the single run case. If the above code works, maybe we could include it in an if print_table: block where print_table defaults to False, just so it is available in the future. If the above code doesn't work, I don't think it's worth the time and agree with

I'm going to suggest we don't implement this as one of the plots that shows up automatically within the jupyterbook after all
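
A minimal sketch of the guarded `<iframe>` idea, assuming the ILAMB score table lives at a path the rendered notebook can load. The helper name `score_table_iframe`, the `width`/`height` defaults, and the placeholder path are illustrative and not part of CUPiD; only `print_table` comes from the suggestion above.

```python
from html import escape


def score_table_iframe(table_path, width="100%", height=600):
    """Return an <iframe> tag that embeds the HTML page at table_path."""
    src = escape(table_path, quote=True)  # guard against quotes in the path
    return f'<iframe src="{src}" width="{width}" height="{height}"></iframe>'


# Off by default, since the table is only informative when comparing runs.
print_table = False

if print_table:
    # Imported lazily so the helper above also works outside a notebook.
    from IPython.display import HTML, display

    display(HTML(score_table_iframe("/path/to/table.html")))
```

Keeping the string construction separate from the `display` call means the tag can be checked without a running notebook kernel.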

@mnlevy1981
Collaborator

mnlevy1981 commented Jan 24, 2025

(my first version of the above comment relied on javascript, but I think the HTML <iframe> option would work better)

@TeaganKing
Collaborator Author

Thanks Mike. Good point. FYI I'm still doing some final testing on making sure this runs smoothly with regridded CLM6 data....

@TeaganKing
Collaborator Author

Removing mpiexec from the command works! I'll update the helper script shortly to implement the new script suggestion. For now, we are planning to bring this into the external diagnostic packages example only. It will not go into key metrics until both the mpiexec issue and the issue with copying the ADF output dir are fixed, since the latter will also apply to ILAMB.

@TeaganKing
Collaborator Author

I'm still getting a VarNotInModel error from ILAMB, including with the old configurations from the previous ILAMB example. Otherwise the full workflow is working.

@TeaganKing
Collaborator Author

TeaganKing commented Jan 30, 2025

This worked with a different dataset: ilamb-run --config ../ilamb_aux/ilamb_nohoff_final_CLM.cfg --build_dir ILAMB_output/ --df_errs ../ilamb_aux/quantiles_Whittaker_cmip5v6.parquet --define_regions ../ilamb_aux/DATA/regions/LandRegions.nc ../ilamb_aux/DATA/regions/Whittaker.nc --regions global --model_setup ../examples/external_diag_packages/model_setup.txt --filter .clm2.h0.

When model_setup.txt used this filepath: /glade/campaign/cgd/tss/common/Land_Only_Simulations/CTSM52_DEV/ctsm51_ctsm51d166deadveg_1deg_CRUJRA_FLDS_ABsnoCDE_blk_A5BCD_hist/lnd/hist/
and ilamb_nohoff_final_CLM.cfg has filepaths such as /glade/campaign/cesm/community/lmwg/diag/ILAMB/DATA/biomass/Tropical/biomass_0.5x0.5.nc

The exact same setup did not work with clean directories and /glade/campaign/cesm/development/cross-wg/diagnostic_framework/CESM_output_for_testing/b.e30_beta02.BLT1850.ne30_t232.104/lnd/hist/regrid/ (although it did appear to run when the directories were not cleaned). In this case, I get ValueError: Unable to convert from 'Unit('steradian')' to 'Unit('m2')'. and NameError: name 'logger' is not defined.

"# Model Name , Location of Files , Shift From, Shift To\n", # noqa: E501
)
ms.write(
f"CTSM51 , {base_case_output_dir}/lnd/hist/regrid/",
Collaborator Author

@TeaganKing TeaganKing Jan 31, 2025

I should add some sort of "if regrid" check that appends the regrid directory here. In the long term, this could create the regridded files if they don't yet exist; in the short term, it could print a "warning: please run regridding with command below" message.

Collaborator Author

I added a note that regridded files are required
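
One way the short-term "if regrid" check could look, as a sketch only. The function name `land_hist_dir` and the `regrid` flag are hypothetical; the `lnd/hist/regrid` layout is taken from the paths used elsewhere in this PR.

```python
import os
import warnings


def land_hist_dir(case_output_dir, regrid=True):
    """Return the land history directory, or its regrid/ subdirectory."""
    hist_dir = os.path.join(case_output_dir, "lnd", "hist")
    if not regrid:
        return hist_dir
    regrid_dir = os.path.join(hist_dir, "regrid")
    if not os.path.isdir(regrid_dir):
        # Warn rather than fail: the files may be created before ILAMB runs.
        warnings.warn(
            f"{regrid_dir} does not exist; please regrid the history files "
            "before running ILAMB",
            stacklevel=2,
        )
    return regrid_dir
```

The path is returned even when the directory is missing, so the helper script can still write model_setup.txt and leave the regridding step to the user.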

@TeaganKing
Collaborator Author

TeaganKing commented Feb 3, 2025

We actually want to remap with this command to avoid steradian units and instead get km^2 for area: ncremap -m /glade/campaign/cesm/cesmdata/inputdata/cpl/gridmaps/ne30pg3/map_ne30pg3_TO_fv0.9x1.25_blin.240826.nc -P clm "$file" regrid/"$file". Thanks @slevis-lmwg for the suggestion!
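
If the helper script were to print or run this command for each history file, the argument list could be assembled roughly as follows. This is a sketch: `ncremap_command` is a hypothetical helper, the file names are placeholders, and only the `-m` and `-P clm` options shown in the command above are used.

```python
import os
import shlex


def ncremap_command(map_file, in_file, out_dir="regrid"):
    """Build the ncremap call from the comment above for one history file."""
    out_file = os.path.join(out_dir, os.path.basename(in_file))
    return ["ncremap", "-m", map_file, "-P", "clm", in_file, out_file]


# Placeholder names; substitute the real map file and history files.
cmd = ncremap_command("map.nc", "case.clm2.h0.0001-01.nc")
print(shlex.join(cmd))
```

Returning a list rather than a string keeps the command safe to pass to `subprocess.run` without shell quoting.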

Collaborator

@mnlevy1981 mnlevy1981 left a comment

I didn't try to run this, but read through the diffs and have a few comments. Overall, this looks really good and I'm excited about adding ILAMB to external_diag_packages!

Should we address #160 in this PR, since we'll have the same problem with ILAMB and users only running CUPiD on non-land components? Seeing your change to cupid/cupid_webpage.py brought that issue to mind, but I'd be happy to wait for a later PR to try to fix it.

Comment on lines 49 to 50
if cupid_config_loc is None:
    cupid_config_loc = os.path.join(cupid_root, "examples", "key_metrics")
Collaborator

Should this default be "external_diag_packages" instead of key_metrics since we won't be able to add link_to_ILAMB.ipynb in this PR?

"# Model Name , Location of Files , Shift From, Shift To\n", # noqa: E501
)
ms.write(
f"CTSM51 , {base_case_output_dir}/lnd/hist/regrid/\n",
Collaborator

Instead of CTSM51, do we want this to be c_dict["global_params"]["base_case_name"]? Also, I'm surprised we are adding the baseline case but not the experimental case. Should we also have a line with c_dict["global_params"]["case_name"]?
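
A sketch of what writing both cases by name might look like. The `c_dict` values are made-up placeholders, `model_setup_lines` is a hypothetical helper, and the exact column spacing ILAMB expects in model_setup.txt is assumed to match the lines in the diff.

```python
def model_setup_lines(cases):
    """Format (case_name, hist_dir) pairs as ILAMB model_setup.txt lines."""
    header = "# Model Name , Location of Files , Shift From, Shift To\n"
    rows = [f"{name} , {hist_dir}\n" for name, hist_dir in cases]
    return [header] + rows


# Hypothetical values standing in for entries read from config.yml.
c_dict = {
    "global_params": {
        "case_name": "new_case",
        "base_case_name": "baseline_case",
        "CESM_output_dir": "/path/to/output",
    }
}
gp = c_dict["global_params"]

# One row per case: the experimental case first, then the baseline.
cases = [
    (gp[key], f"{gp['CESM_output_dir']}/{gp[key]}/lnd/hist/regrid/")
    for key in ("case_name", "base_case_name")
]
lines = model_setup_lines(cases)
```

With two rows in model_setup.txt, the ILAMB score table would have one column per case, which is the comparison discussed earlier in this thread.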

Comment on lines 92 to 95
base_case_output_dir = os.path.join(
    c_dict["global_params"]["CESM_output_dir"],
    c_dict["global_params"]["base_case_name"],
)
Collaborator

According to the spec, users can specify c_dict["global_params"]["base_case_output_dir"] if the baseline is not in the same directory as the experimental case... so I think you want something like:

if "base_case_output_dir" in c_dict["global_params"]:
    base_case_output_dir = os.path.join(
        c_dict["global_params"]["base_case_output_dir"],
        c_dict["global_params"]["base_case_name"],
    )
else:
    base_case_output_dir = os.path.join(
        c_dict["global_params"]["CESM_output_dir"],
        c_dict["global_params"]["base_case_name"],
    )
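
The same fallback can be written more compactly. This is a sketch with a hypothetical function name; note one behavioral difference from the if/else above: an empty or `None` value for `base_case_output_dir` also falls back to `CESM_output_dir`.

```python
import os


def resolve_base_case_output_dir(global_params):
    """Prefer base_case_output_dir when set, else fall back to CESM_output_dir."""
    root = global_params.get("base_case_output_dir") or global_params["CESM_output_dir"]
    return os.path.join(root, global_params["base_case_name"])


# Placeholder config values for illustration only.
same_dir = {"CESM_output_dir": "/runs", "base_case_name": "baseline"}
other_dir = dict(same_dir, base_case_output_dir="/archive")

print(resolve_base_case_output_dir(same_dir))
print(resolve_base_case_output_dir(other_dir))
```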

c_dict["global_params"]["CESM_output_dir"],
c_dict["global_params"]["base_case_name"],
)
with open(cupid_config_loc + "model_setup.txt", "w") as ms:
Collaborator

This should be os.path.join(cupid_config_loc, "model_setup.txt") instead of cupid_config_loc + "model_setup.txt", since cupid_config_loc isn't guaranteed to end with a trailing /
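
A short illustration of the difference, assuming POSIX path separators. `model_setup_path` is a hypothetical wrapper and the directory names are placeholders.

```python
import os


def model_setup_path(cupid_config_loc):
    """Join the config directory and file name, with or without a trailing /."""
    return os.path.join(cupid_config_loc, "model_setup.txt")


# Plain concatenation silently fuses the names when the slash is missing;
# os.path.join gives the same file for both spellings of the directory.
for loc in ("examples/key_metrics", "examples/key_metrics/"):
    print(loc + "model_setup.txt", "->", model_setup_path(loc))
```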

ms.write(
f"CTSM51 , {base_case_output_dir}/lnd/hist/regrid/\n",
)
print(f"wrote {cupid_config_loc}model_setup.txt")
Collaborator

Same comment about using os.path.join because cupid_config_loc might not have the trailing /

Comment on lines 107 to 109
print("You can now run ILAMB with the following commands:")
print("---")
print("qinteractive -l select=1:ncpus=16:mpiprocs=16:mem=100G -l walltime=06:00:00")
Collaborator

The qinteractive is only necessary on the NCAR computers. Maybe we could do something like

print("You can now run ILAMB with the following commands:")
print("(Users on a supercomputer should make sure they are on a compute node rather than a login node)")
print("---")

I'm not sure what to do about the fact that we're having trouble with MPI on derecho and casper; in general, users will need the mpiexec but for now it's probably okay to stick with the serial recommendation
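
One option for keeping the serial recommendation now while leaving room for MPI later is to make `mpiexec` opt-in when the helper script builds the command. This is a sketch only: `ilamb_run_command` and `use_mpi` are hypothetical names, and only a subset of the `ilamb-run` options quoted earlier in this PR is included.

```python
import shlex


def ilamb_run_command(config, build_dir, model_setup, use_mpi=False):
    """Assemble an ilamb-run call; mpiexec is only prepended on request."""
    cmd = [
        "ilamb-run",
        "--config", config,
        "--build_dir", build_dir,
        "--regions", "global",
        "--model_setup", model_setup,
        "--filter", ".clm2.h0.",
    ]
    if use_mpi:
        cmd = ["mpiexec"] + cmd
    return shlex.join(cmd)


print("You can now run ILAMB with the following commands:")
print("(Users on a supercomputer should make sure they are on a compute node rather than a login node)")
print("---")
print(ilamb_run_command("ilamb_nohoff_final_CLM.cfg", "bld/", "model_setup.txt"))
```

Once MPI works on derecho and casper, flipping the default of `use_mpi` would restore the parallel recommendation in one place.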

# - caption: Ocean
# chapters:
# - file: ocn/ocean_surface
- file: atm/link_to_ADF
Collaborator

It looks like the spacing changed here? Comparing old line 195 to new line 161, I see

-         - file: atm/link_to_ADF
+        - file: atm/link_to_ADF

@@ -131,6 +131,7 @@ compute_notebooks:
parameter_groups:
none:
adf_root: ../../examples/key_metrics/ADF_output/
key_plots: ["Surface_Wind_Stress_ANN_LatLon_Vector_Mean.png", "PRECT_ANN_LatLon_Mean.png", "PS_DJF_SHPolar_Mean.png", "TaylorDiag_ANN_Special_Mean.png"]
Collaborator

I'm surprised these weren't already included in the config.yml, thanks for getting rid of the hard-coded list and replacing it with something configurable!

" * `cd CUPiD/helper_scripts`\n",
" * `./generate_ilamb_config_file.py --cupid_file ../examples/external_diag_packages/config.yml --out_dir ../../`\n",
"4) Run ILAMB with the newly created configuration file.\n",
" * `qinteractive -l select=1:ncpus=16:mpiprocs=16:mem=100G -l walltime=06:00:00`\n",
Collaborator

I had recommended removing the qinteractive line from the helper script, but for the notebook maybe we can just make a note that this line is specific for NCAR users (and other supercomputers might have different ways to request compute resources)? Also, we can drop to ncpus=1:mpiprocs=1 while we work out issues in the environment

" * `qinteractive -l select=1:ncpus=16:mpiprocs=16:mem=100G -l walltime=06:00:00`\n",
" * `conda activate cupid-analysis`\n",
" * `export ILAMB_ROOT=../../ilamb_aux`\n",
" * `mpiexec ilamb-run --config ilamb_nohoff_final_CLM.cfg --build_dir bld/ --df_errs ../../ilamb_aux/quantiles_Whittaker_cmip5v6.parquet --define_regions ../../ilamb_aux/DATA/regions/LandRegions.nc ../../ilamb_aux/DATA/regions/Whittaker.nc --regions global --model_setup model_setup.txt --filter .clm2.h0.`"
Collaborator

We should drop the mpiexec here (and add it back in once things work)

@TeaganKing TeaganKing requested a review from mnlevy1981 February 7, 2025 19:54
2 participants