Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GEFS -CHEM PLOTS MPMD BUGFIX - Feature/gef splotmpmd #664

Merged
merged 25 commits into from
Feb 12, 2025

Conversation

Ho-ChunHuang-NOAA
Copy link
Contributor

@Ho-ChunHuang-NOAA Ho-ChunHuang-NOAA commented Feb 10, 2025

Note to developers: You must use this PR template!

Description of Changes

Code revision to have mpmd processes running in its own child working directory

Please include a summary of the changes and the related GitHub issue(s). Please also include relevant motivation and context.
EVSv2.0 GESF-CHEM MPMD Bugfix

Developer Questions and Checklist

  • Is this a high priority PR? If so, why and is there a date it needs to be merged by?
    NO

  • Do you have any planned upcoming annual leave/PTO?
    YES, APRIL 2025

  • Are there any changes needed in the times when the jobs are supposed to run/kick-off?
    NO

  • The code changes follow NCO's EE2 Standards.
  • Developer's name is removed throughout the code and have used ${USER} where necessary throughout the code.
  • References the feature branch for HOMEevs are removed from the code.
  • J-Job environment variables, COMIN and COMOUT directories, and output follow what has been defined for EVS.
  • Jobs over 15 minutes in runtime have restart capability.
  • [] If applicable, changes in the dev/drivers/scripts or dev/modulefiles have been made in the corresponding ecf/scripts and ecf/defs/evs-nco.def?
  • Jobs contain the appropriate file checking and don't run METplus for any missing data.
  • Code is using METplus wrappers structure and not calling MET executables directly.
  • Log is free of any ERRORs or WARNINGs.

Testing Instructions

Please include testing instructions for the PR assignee. Include all relevant input datasets needed to run the tests.
(1) Set HOMEevs to pr test directory
(2)
(a) export SENDMAIL=YES in PR driving script
(b) if VDATE for stats steps is the default of PDYm3, go to GEFS-ENS prep directory,
> mv chem.${PDYm4} chem.${PDYm4}_save
(c) Run following scripts in /dev/drivers/scripts/stats/global_ens
jevs_global_ens_gefs_chem_grid2obs_aeronet_stats.sh
jevs_global_ens_gefs_chem_grid2obs_airnow_stats.sh
and check for missing file email
(d) in GEFS-ENS prep directory,
> mv chem.${PDYm4}_save chem.${PDYm4}
repeat (c) step above
check for any error/warning
(3) run following scripts in ~/dev/drivers/scripts/plots/global_ens
jevs_global_ens_chem_gefs_grid2obs_aeronet_plots_last31days.sh
jevs_global_ens_chem_gefs_grid2obs_aeronet_plots_last90days.sh
jevs_global_ens_chem_gefs_grid2obs_airnow_plots_last31days.sh
jevs_global_ens_chem_gefs_grid2obs_airnow_plots_last90days.sh

@malloryprow
Copy link
Contributor

Initial look over:

  1. ush/aqm/settings.py has changes but this is a global_ens PR. Can you remove these changes? We haven't been instructed to make these changes relating to new names related to Executive Orders yet.
  2. stats/global_ens/exevs_global_ens_chem_grid2obs_stats.sh has been changed but there are no PR instructions on testing the stats step. It seems stats should be tested as well. Could the PR instructions be updated to test these changes?

@Ho-ChunHuang-NOAA
Copy link
Contributor Author

test instructions revised to include stats email message

@malloryprow
Copy link
Contributor

malloryprow commented Feb 10, 2025

Doing the missing data tests, I built a prep directory at /lfs/h2/emc/vpppg/noscrub/mallory.row/verification/EVS_PRs/pr664/evs/v2.0/prep/global_ens and linked to all chem directories except for chem.20250206. COMIN is pointing to that directory for testing.

COMOUT is /lfs/h2/emc/vpppg/noscrub/mallory.row/verification/EVS_PRs/pr664/evs/v2.0.

1. jevs_global_ens_gefs_chem_grid2obs_aeronet_stats

Log File: /lfs/h2/emc/vpppg/noscrub/mallory.row/verification/EVS_PRs/pr664/EVS/dev/drivers/scripts/stats/global_ens/jevs_global_ens_gefs_chem_grid2obs_aeronet_stats.o6914362
DATA: /lfs/h2/emc/stmp/mallory.row/evs_test/prod/tmp/jevs_global_ens_gefs_chem_grid2obs_aeronet_stats.6914362.dbqs01

2. jevs_global_ens_gefs_chem_grid2obs_airnow_stats

Log File: /lfs/h2/emc/vpppg/noscrub/mallory.row/verification/EVS_PRs/pr664/EVS/dev/drivers/scripts/stats/global_ens/jevs_global_ens_gefs_chem_grid2obs_airnow_stats.o6914417
DATA: /lfs/h2/emc/stmp/mallory.row/evs_test/prod/tmp/jevs_global_ens_gefs_chem_grid2obs_airnow_stats.6914417.dbqs01

@Ho-ChunHuang-NOAA
Copy link
Contributor Author

The mail message generated as expected but I did not receive the email. I try the cmd line emailing submission, and I received the cmd line submitted email. Do you have any suggestion on the one sent in job queue?

cat /lfs/h2/emc/stmp/mallory.row/evs_test/prod/tmp/jevs_global_ens_gefs_chem_grid2obs_aeronet_stats.6914362.dbqs01/mailmsg

cat /lfs/h2/emc/stmp/mallory.row/evs_test/prod/tmp/jevs_global_ens_gefs_chem_grid2obs_airnow_stats.6914417.dbqs01/mailmsg

@malloryprow
Copy link
Contributor

I did not receive an email either, though I put myself on MAILTO. I don't know why we didn't get it. Is the purpose of these changes to put all missing data files in 1?

@Ho-ChunHuang-NOAA
Copy link
Contributor Author

Yes, I recall NCO said there are too many missing file email and they want it to be reduced. Later NCO cancel the MAIL option during operational. This is for EMC EVS parallel only for missing data.

@Ho-ChunHuang-NOAA
Copy link
Contributor Author

Ho-ChunHuang-NOAA commented Feb 10, 2025

Yes, I too finding one email per file for stats step in one hours is too much. This changes is to put all missing file of an hour in one email.

@malloryprow
Copy link
Contributor

You are correct; however, we are going about resolving that in a different way. Here are the notes from the Verification Team Meeting on 1/7/2025 under the Missing Data Email section.

If the changes are for thise, I would back out these changes and have them in another PR. There are other changes needed to meet what is needed for how we plan on delivering missing data emails to NCO.

@Ho-ChunHuang-NOAA
Copy link
Contributor Author

Ho-ChunHuang-NOAA commented Feb 10, 2025 via email

@Ho-ChunHuang-NOAA
Copy link
Contributor Author

Backout the changes in exevs_global_ens_chem_grid2obs_stats.sh and remove the instruction for testing the changes.

@Ho-ChunHuang-NOAA
Copy link
Contributor Author

Ho-ChunHuang-NOAA commented Feb 10, 2025 via email

sync local repository 02/12/2025
@malloryprow
Copy link
Contributor

COMOUT is /lfs/h2/emc/vpppg/noscrub/mallory.row/verification/EVS_PRs/pr664/evs/v2.0/plots/global_ens/chem.20250208.

1. jevs_global_ens_chem_gefs_grid2obs_aeronet_plots_last31days

Log File: /lfs/h2/emc/vpppg/noscrub/mallory.row/verification/EVS_PRs/pr664/EVS/dev/drivers/scripts/plots/global_ens/jevs_global_ens_chem_gefs_grid2obs_aeronet_plots_last31days.o180371595
DATA: /lfs/h2/emc/stmp/mallory.row/evs_test/prod/tmp/jevs_global_ens_chem_gefs_grid2obs_aeronet_plots_last31days.180371595.cbqs01

2. jevs_global_ens_chem_gefs_grid2obs_aeronet_plots_last90days

Log File: /lfs/h2/emc/vpppg/noscrub/mallory.row/verification/EVS_PRs/pr664/EVS/dev/drivers/scripts/plots/global_ens/jevs_global_ens_chem_gefs_grid2obs_aeronet_plots_last90days.o180371597
DATA: /lfs/h2/emc/stmp/mallory.row/evs_test/prod/tmp/jevs_global_ens_chem_gefs_grid2obs_aeronet_plots_last90days.180371597.cbqs01

3. jevs_global_ens_chem_gefs_grid2obs_airnow_plots_last31days

Log File: /lfs/h2/emc/vpppg/noscrub/mallory.row/verification/EVS_PRs/pr664/EVS/dev/drivers/scripts/plots/global_ens/jevs_global_ens_chem_gefs_grid2obs_airnow_plots_last31days.o180371598
DATA: /lfs/h2/emc/stmp/mallory.row/evs_test/prod/tmp/jevs_global_ens_chem_gefs_grid2obs_airnow_plots_last31days.180371598.cbqs01

4. jevs_global_ens_chem_gefs_grid2obs_airnow_plots_last90days

Log File: /lfs/h2/emc/vpppg/noscrub/mallory.row/verification/EVS_PRs/pr664/EVS/dev/drivers/scripts/plots/global_ens/jevs_global_ens_chem_gefs_grid2obs_airnow_plots_last90days.o180371605
DATA: /lfs/h2/emc/stmp/mallory.row/evs_test/prod/tmp/jevs_global_ens_chem_gefs_grid2obs_airnow_plots_last90days.180371605.cbqs01

@Ho-ChunHuang-NOAA
Copy link
Contributor Author

No error or warning message were found in the logs file (except for IP address error). The figures look okay to me

https://www.emc.ncep.noaa.gov/users/verification/global/gefs/pr/chem/grid2obs/airnow_pm25/

@malloryprow
Copy link
Contributor

Everything looks good to me too! One time though, I see the walltime for all the scripts are 1 hour, but they all ran in less than 15 minutes. Following the guidelines from NCO, could we get the walltimes reduced to 25 minutes (both dev drivers and ecf scripts)?

@Ho-ChunHuang-NOAA
Copy link
Contributor Author

Mallory, how do you check the cpu time in the GEFS-chem plots step? "cput=" ? since resources_used.walltime = 00:00:00.

Because cput for nid* is over 12 min, will 15 min be cutting too close? Since the plots code has restart function, can we set it at higher walltime?

@Ho-ChunHuang-NOAA
Copy link
Contributor Author

Please ignore my previous comments, you did say 25 mins. I need to check my eysight.

@malloryprow
Copy link
Contributor

All good! Future reference, I usually grep for "resources_used.walltime". I don't think it works in some cases, particularly on Cactus, so if that has all zeros I grep for "stime =" and "mtime =" and do a rough estimate.

Copy link
Contributor

@malloryprow malloryprow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes are good and testing successful.

Thanks for this work @Ho-ChunHuang-NOAA!

Copy link
Contributor

@AliciaBentley-NOAA AliciaBentley-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed the changes made in this PR and found them to be consistent with the MPMD Bugzilla fix for GEFS-chem. I approve this PR to be merged. Thanks!

CC @malloryprow @Ho-ChunHuang-NOAA

@malloryprow malloryprow merged commit f31754b into NOAA-EMC:develop Feb 12, 2025
@Ho-ChunHuang-NOAA
Copy link
Contributor Author

Thank you all for the help to get this PR done.

@Ho-ChunHuang-NOAA Ho-ChunHuang-NOAA deleted the feature/GEFSplotmpmd branch February 14, 2025 16:20
BinbinZhou-NOAA added a commit to BinbinZhou-NOAA/EVS that referenced this pull request Feb 19, 2025
…A/EVS into feature/href_MPMD

* 'feature/href_MPMD' of https://github.com/BinbinZhou-NOAA/EVS:
  Update naming conventions per Executive Order 14172 (NOAA-EMC#666)
  GEFS -CHEM PLOTS MPMD BUGFIX - Feature/gef splotmpmd (NOAA-EMC#664)
BinbinZhou-NOAA added a commit to BinbinZhou-NOAA/EVS that referenced this pull request Feb 20, 2025
…AA/EVS into feature/narre_MPMD

* 'feature/narre_MPMD' of https://github.com/BinbinZhou-NOAA/EVS:
  Update naming conventions per Executive Order 14172 (NOAA-EMC#666)
  GEFS -CHEM PLOTS MPMD BUGFIX - Feature/gef splotmpmd (NOAA-EMC#664)
  bugfix/long_term_stats (NOAA-EMC#663)
  Global-det SST sea ice fix (NOAA-EMC#660)
  bugfix/dev_fix_ghrsst (NOAA-EMC#662)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

global_ens - chem: Address Bugzilla 1547 - MPMD processes share the same working directory
3 participants