For NCEP regtests, update to spack-stack 1.6, add option for gnu compiler and add Hercules #1145

Merged
merged 18 commits into from
Apr 2, 2024
Changes from 7 commits
4cfbd73
update regtests for ncep to remove login for shell and first intel loads
JessicaMeixner-NOAA Dec 7, 2023
da63450
updates to matrix_cmake_ncep for gnu and hercules
JessicaMeixner-NOAA Dec 8, 2023
6bc9379
updates for hercules
Dec 8, 2023
4af8357
update cmake on hercules
Dec 8, 2023
665b2fb
updated hera metis paths for intel and gnu
JessicaMeixner-NOAA Dec 11, 2023
183e86a
update orion intel metis path
JessicaMeixner-NOAA Dec 12, 2023
84b4a37
Merge remote-tracking branch 'EMC/develop' into feature/gnu
JessicaMeixner-NOAA Dec 13, 2023
95446b5
update orion env variables after testing, these are now the same as h…
JessicaMeixner-NOAA Jan 5, 2024
6f338f8
Merge branch 'NOAA-EMC:develop' into feature/gnu
JessicaMeixner-NOAA Jan 5, 2024
37e45ac
Merge branch 'NOAA-EMC:develop' into feature/gnu
JessicaMeixner-NOAA Mar 12, 2024
c197660
update 1.6.0 for all platforms and to rocky-8 on hera
JessicaMeixner-NOAA Mar 12, 2024
276823d
Merge branch 'NOAA-EMC:develop' into feature/gnu
JessicaMeixner-NOAA Mar 15, 2024
688f176
update hera modules
JessicaMeixner-NOAA Mar 18, 2024
a6d3762
remove extra module for 1.6.0 rocky on hera for gnu
JessicaMeixner-NOAA Mar 18, 2024
07b92a6
add orion-intel metis path
JessicaMeixner-NOAA Mar 20, 2024
894f9a4
update hercules
JessicaMeixner-NOAA Mar 20, 2024
15fe747
initialize array that the rocky8 hera intel transition exposed.
JessicaMeixner-NOAA Mar 21, 2024
5a8da61
move hercules paths to hercules section, revert hera
JessicaMeixner-NOAA Mar 27, 2024
101 changes: 81 additions & 20 deletions regtests/bin/matrix_cmake_ncep
@@ -22,21 +22,30 @@ usage ()
{
cat 2>&1 << EOF

Usage: $myname model_dir
Usage: $myname model_dir compiler
Required:
model_dir : path to model dir of WW3 source
Optional:
compiler : intel (default) or gnu
EOF
}


# Get required arguments
if [ ! $# = 0 ]
then
main_dir="$1" ; shift
if [ ! $# = 0 ]
then
compiler="$1"; shift
else
compiler='intel'
fi
else
usage
exit 1
fi



# Convert main_dir to absolute path
main_dir="`cd $main_dir 1>/dev/null 2>&1 && pwd`"
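
The hunk above makes `compiler` an optional second argument that falls back to `intel`. A minimal sketch of the same contract using a parameter-expansion default is shown here. The function name `parse_args` is hypothetical and does not exist in `matrix_cmake_ncep`. Note one behavioral difference: `${2:-intel}` also falls back to `intel` when the second argument is an empty string, which the nested `if` in the diff does not.

```shell
# Hypothetical helper (not part of matrix_cmake_ncep): model_dir is
# required, compiler is optional and defaults to intel.
parse_args () {
  if [ "$#" -eq 0 ]; then
    echo "Usage: parse_args model_dir [compiler]" >&2
    return 1
  fi
  main_dir="$1"
  # Use the second argument if it is set and non-empty, else intel.
  compiler="${2:-intel}"
}

parse_args /path/to/WW3 && echo "$main_dir $compiler"
parse_args /path/to/WW3 gnu && echo "$main_dir $compiler"
```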
@@ -58,23 +67,65 @@ EOF
# to define headers etc (default to original version if empty)
ishera=`hostname | grep hfe`
isorion=`hostname | grep Orion`
ishercules=`hostname | grep hercules`
if [ $ishera ]
then
# If no other h, assuming Hera
batchq='slurm'
spackstackpath='/scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.0/envs/unified-env-noavx512/install/modulefiles/Core'
modcomp='stack-intel/2021.5.0'
modmpi='stack-intel-oneapi-mpi/2021.5.1'
metispath='/scratch1/NCEPDEV/climate/Matthew.Masarik/waves/opt/spack-stack/1.5.0/parmetis-4.0.3/install'
modcmake='cmake/3.23.1'
if [ $compiler = "intel" ]
then
spackstackpath='/scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.0/envs/unified-env-noavx512/install/modulefiles/Core'
modcomp='stack-intel/2021.5.0'
modmpi='stack-intel-oneapi-mpi/2021.5.1'
metispath='/scratch1/NCEPDEV/climate/Matthew.Masarik/waves/opt/hera/intel/spack-stack/1.5.0/parmetis-4.0.3/install'
modcmake='cmake/3.23.1'
elif [ $compiler = "gnu" ]
then
spackstackpath='/scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.0/envs/unified-env-noavx512/install/modulefiles/Core'
modcomp='stack-gcc/9.2.0'
spackstackpath2='/scratch1/NCEPDEV/jcsda/jedipara/spack-stack/modulefiles'
modmpi='stack-openmpi/4.1.5'
metispath='/scratch1/NCEPDEV/climate/Matthew.Masarik/waves/opt/hera/gnu/spack-stack/1.5.0/parmetis-4.0.3/install'
modcmake='cmake/3.23.1'
else
echo "Compiler $compiler not supported on hera"
exit 1
fi
elif [ $isorion ]
then
if [ $compiler = "intel" ]
then
batchq='slurm'
spackstackpath='/work/noaa/epic/role-epic/spack-stack/orion/spack-stack-1.5.0/envs/unified-env/install/modulefiles/Core'
modcomp='stack-intel/2022.0.2'
modmpi='stack-intel-oneapi-mpi/2021.5.1'
metispath='/work/noaa/marine/Matthew.Masarik/waves/opt/orion/intel/spack-stack/1.5.0/parmetis-4.0.3/install'
modcmake='cmake/3.23.1'
else
echo "Compiler $compiler not supported on orion"
exit 1
fi
elif [ $ishercules ]
then
batchq='slurm'
spackstackpath='/work/noaa/epic/role-epic/spack-stack/orion/spack-stack-1.5.0/envs/unified-env/install/modulefiles/Core'
modcomp='stack-intel/2022.0.2'
modmpi='stack-intel-oneapi-mpi/2021.5.1'
metispath='/work/noaa/marine/Matthew.Masarik/waves/opt/spack-stack/1.5.0/parmetis-4.0.3/install'
modcmake='cmake/3.23.1'
if [ $compiler = "intel" ]
then
spackstackpath='/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.0/envs/unified-env/install/modulefiles/Core'
modcomp='stack-intel/2021.9.0'
modmpi='stack-intel-oneapi-mpi/2021.9.0'
metispath='/work/noaa/marine/Matthew.Masarik/waves/opt/hercules/intel/spack-stack/1.5.0/parmetis-4.0.3/install'
modcmake='cmake/3.23.1'
elif [ $compiler = "gnu" ]
then
spackstackpath='/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.0/envs/unified-env-mvap2/install/modulefiles/Core'
spackstackpath2='/work/noaa/epic/role-epic/spack-stack/hercules/modulefiles'
modcomp='stack-gcc/11.3.1'
modmpi='stack-mvapich2/2.3.7'
metispath='/work/noaa/marine/Matthew.Masarik/waves/opt/hercules/gnu/spack-stack/1.5.0/parmetis-4.0.3/install'
modcmake='cmake/3.23.1'
else
echo "Compiler $compiler not supported on hercules"
exit 1
fi
else
batchq=
fi
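
The platform selection above relies on unquoted tests such as `[ $ishera ]`, which depend on `grep` returning at most one word. As a rough alternative, the same substring checks (`hfe`, `Orion`, `hercules`) can be expressed with a `case` statement. The function `detect_platform` is hypothetical and only illustrates the idea; it is not what the PR implements.

```shell
# Hypothetical helper: maps a hostname to a platform name using the
# same substrings the script greps for (hfe, Orion, hercules).
detect_platform () {
  case "$1" in
    *hfe*)      echo hera ;;
    *Orion*)    echo orion ;;
    *hercules*) echo hercules ;;
    *)          echo unknown ;;
  esac
}

# Prints the platform for the current host, or "unknown".
detect_platform "$(hostname)"
```

With this shape, an unrecognized host yields an explicit `unknown` rather than an empty `batchq`, and the per-compiler settings could then branch on the returned name.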
@@ -109,6 +160,19 @@ EOF
echo 'export KMP_STACKSIZE=2G' >> matrix.head
echo 'export FI_OFI_RXM_BUFFER_SIZE=128000' >> matrix.head
echo 'export FI_OFI_RXM_RX_SIZE=64000' >> matrix.head
Collaborator

@JessicaMeixner-NOAA these are retained for orion but not present for hercules. Also, their position for orion has moved above the module loads. Are both of these intentional?

Collaborator Author

I actually don't see these being used in the ufs-weather-model submit scripts. I'm not sure why they are here, although ulimit -s unlimited is usually needed.

Collaborator

Those three were originally recommended on WCOSS2 when I was testing with SCOTCH a while ago. They were then also added to orion's job card when there were issues updating modules at a previous time, and the issues seemed to go away. I suspect we may need to include them on WCOSS2 when that update happens. We have had all the exports together after the module loads. I don't know whether ordering matters between modules and exports, but I'd vote to choose one or the other to keep all job cards consistent between platforms.

I find it convenient to have the cd to the regtests directory right after the #SBATCH lines. Could we keep that location?
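
To make the ordering question concrete, here is a sketch of one fixed layout that every platform's job card could share: Slurm directives, then the `cd`, then module setup, then limits and exports together. The function `write_head`, its arguments, and the example paths are hypothetical. This reflects the ordering described in this comment, not necessarily what the PR settled on.

```shell
# Hypothetical sketch: one fixed order for every platform's job card.
# write_head and its arguments are illustrative only.
write_head () {
  out="$1"; np="$2"; regtests_dir="$3"
  {
    echo "#SBATCH -n ${np}"
    echo '#SBATCH -J ww3_regtest'
    echo '#SBATCH -o matrix.out'
    echo ' '
    echo "  cd ${regtests_dir}"        # directly after the Slurm directives
    echo '  module purge'              # module setup next
    echo '  ulimit -s unlimited'       # limits and exports last, together
    echo '  export KMP_STACKSIZE=2G'
  } > "$out"
}

example_head="${TMPDIR:-/tmp}/matrix.head.example"
write_head "$example_head" 24 /path/to/WW3/regtests
cat "$example_head"
```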

Collaborator Author

I'll look into this tomorrow.

Collaborator Author

@MatthewMasarik-NOAA in my comparisons of the develop branch on orion against this branch, the location of the #SBATCH lines and the "cd regtests" is the same. Is this different behavior for you?

Since the hercules tests are working fine without extra environment variables beyond the ulimit -s, I prefer not to add anything. We could add the environment variable to remove the one warning, but I'm okay leaving things as is. I'll re-run a test on hercules to see if there are any system changes causing errors for me after my successful tests earlier this week/late last week.

Collaborator

@JessicaMeixner-NOAA, to tie up a loose end regarding the "cd regtests": checking the job card for hera now, I see it has neither the extra exports nor those ulimit lines, so the "cd regtests" does come directly after the Slurm directives, as I had in mind. For orion, however, these lines are currently above the "cd regtests", so I was wrong on that. Since nothing changed, no fix is needed.

Collaborator Author

Thanks @MatthewMasarik-NOAA

I have tested that we can remove the extra orion variables for our regression tests. Orion is a bit unstable right now so we should re-test that when we have a resolution to #1152 and are ready to re-test this PR.

elif [ $batchq = "slurm" ] && [ $ishercules ]
then
echo "#SBATCH -n ${np}" >> matrix.head
echo "##SBATCH --cpus-per-task=${nth}" >> matrix.head
echo '#SBATCH -q batch' >> matrix.head
echo '#SBATCH -t 08:00:00' >> matrix.head
echo '#SBATCH -A marine-cpu' >> matrix.head
echo '#SBATCH -J ww3_regtest' >> matrix.head
echo '#SBATCH -o matrix.out' >> matrix.head
echo '#SBATCH -p hercules' >> matrix.head
echo '#SBATCH --exclusive' >> matrix.head
echo ' ' >> matrix.head
echo 'ulimit -s unlimited' >> matrix.head
elif [ $batchq = "slurm" ]
then
echo "#SBATCH -n ${np}" >> matrix.head
@@ -133,13 +197,10 @@ EOF

# Netcdf, Parmetis and SCOTCH modules & variables
echo " module purge" >> matrix.head
if [ ! -z $basemodcomp ]; then
echo " module load $basemodcomp" >> matrix.head
fi
if [ ! -z $basemodmpi ]; then
echo " module load $basemodmpi" >> matrix.head
fi
echo " module use $spackstackpath" >> matrix.head
echo " module use $spackstackpath" >> matrix.head
if [ ! -z $spackstackpath2 ]; then
echo " module use $spackstackpath2" >> matrix.head
fi
echo " module load $modcomp" >> matrix.head
echo " module load $modmpi" >> matrix.head
echo " module load $modcmake" >> matrix.head
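
The new `spackstackpath2` guard uses an unquoted `[ ! -z $spackstackpath2 ]`. When the variable is empty this collapses to `[ ! -z ]`, which happens to evaluate as false, so the guard works, but only incidentally. A quoted `-n` test states the intent directly. The helper `emit_module_use` below is hypothetical and only demonstrates the quoting; the real script appends these lines to `matrix.head`.

```shell
# Hypothetical helper: prints the "module use" lines for one required
# and one optional spack-stack module path.
emit_module_use () {
  echo "  module use $1"
  # Only emit the second line when a non-empty second path is given.
  if [ -n "${2:-}" ]; then
    echo "  module use $2"
  fi
}

emit_module_use /path/to/Core
emit_module_use /path/to/Core /path/to/extra/modulefiles
```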