Fb smc gpuport #39

Open
wants to merge 28 commits into base: develop

Conversation

UKMO-lsampson commented Apr 13, 2023

Pull Request Summary

This PR brings one section of the Met Office's manual GPU porting work, the SMC spatial propagation, into the WW3 operational framework so that the compatibility and flexibility of OpenACC parallelism can be tested and understood.

Description

The PR includes the successfully merged port of the SMC propagation routines, which is enabled via a new GPU switch. The switch activates the OpenACC directives, together with the array hoisting and subroutine inlining that are required for performant GPU acceleration. Regular CPU compilations are unaffected: unless the code is compiled on a GPU architecture (such as Isambard) with the correct compiler flags, it performs as standard WW3.

We have used a separate build system on Isambard so that the GPUs can be targeted and tested as required. It is currently available under the /projects/metoffice/WW3_Isambard directory structure on Isambard.

Commit Message

Integration of the SMC propagation from manual GPU porting efforts at the Met Office into the WW3 operational framework for testing and understanding the compatibility and flexibility of using OpenACC parallelism.

Check list

  • Branch is up to date with the authoritative repository (ukmo-waves) develop branch.
  • Relevant regression tests have been run, including additional GPU regression tests run on Isambard and XCE.

Testing

  • How were these changes tested? ww3_tp2.10, ww3_tp2.16.
  • Are the changes covered by regression tests? CPU changes are covered; GPU switches have been added to the current regression tests.
  • Have the matrix regression tests been run (if yes, please note HPC and compiler)? N
  • Please indicate the expected changes in the regression test output, (Note the list of known non-identical tests.) No expected changes.
  • Please provide the summary output of matrix.comp (matrix.Diff.txt, matrixCompFull.txt and matrixCompSummary.txt): matrix.comp reports identical output for the original regression tests. For the new regression tests, the netCDF output produced with and without the GPU switch is identical according to nccmp (on XCE).

UKMO-lsampson added the enhancement (New feature or request) label Apr 13, 2023

UKMO-lsampson (Author)

The performance of the GPU version of the code has not been fully analysed, as there are known areas in which we expect to see drastic improvement, e.g. making more of the code GPU-resident (source term, intra-spectral routines) and data optimisations.

This has also only been explicitly tested with managed memory. Compiling the code with explicit data transfers is more complex and will contain more intrusive OpenACC directives.

Regression tests run:

./bin/run_cmake_test -o both -S ../model -w work_SHRD_SMC ww3_tp2.10
./bin/run_cmake_test -o both -S -w work_SHRD ../model ww3_tp2.16
./bin/run_cmake_test -o both -S ../model -s GPU -w work_SHRD_SMC_GPU ww3_tp2.10
./bin/run_cmake_test -o both -S -s GPU -w work_SHRD_GPU ../model ww3_tp2.16
./bin/run_cmake_test -o both -S -s MPI_GPU -w work_GPU -f -p mpiexec -n 16 ../model ww3_tp2.10
./bin/run_cmake_test -o both -S -s MPI_GPU -w work_GPU -f -p mpiexec -n 16 ../model ww3_tp2.16
./bin/run_cmake_test -o both -S -s MPI -w work_MPI -f -p mpiexec -n 16 ../model ww3_tp2.10
./bin/run_cmake_test -o both -S -s MPI -w work_MPI -f -p mpiexec -n 16 ../model ww3_tp2.16

These tests have been run on both Isambard and XCE. We ran comparisons using matrix.comp against develop for the pre-existing regression tests, and nccmp -d comparisons (SMC vs SMC_GPU) for the new regression tests. All comparisons produced identical output, so the regression tests are considered to have passed.
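The nccmp -d step described above could be scripted roughly as follows. This is only a minimal sketch, not the script used for this PR: the function name `compare_outputs`, the `CMP_CMD` override, and the assumption that both work directories hold identically named `.nc` files are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: compare every NetCDF file in a reference work directory against the
# file of the same name in a candidate work directory.
# CMP_CMD is a hypothetical override; by default the data comparison is
# done with "nccmp -d", as in the comparisons described above.
compare_outputs() {
    local ref_dir="$1" test_dir="$2"
    local cmp_cmd="${CMP_CMD:-nccmp -d}"
    local rc=0 found=0 ref name

    for ref in "$ref_dir"/*.nc; do
        [ -e "$ref" ] || continue          # the glob matched nothing
        found=1
        name="$(basename "$ref")"
        if [ ! -f "$test_dir/$name" ]; then
            echo "MISSING   $name"
            rc=1
        elif $cmp_cmd "$ref" "$test_dir/$name" >/dev/null 2>&1; then
            echo "IDENTICAL $name"
        else
            echo "DIFFERENT $name"
            rc=1
        fi
    done

    if [ "$found" -eq 0 ]; then
        echo "no .nc files found in $ref_dir"
        rc=1
    fi
    return "$rc"
}

# Example call (directory layout is an assumption, adjust to the actual run):
#   compare_outputs ww3_tp2.10/work_SHRD_SMC ww3_tp2.10/work_SHRD_SMC_GPU
```

The function returns non-zero if any file is missing or differs, so it can gate a pass/fail decision in a wrapper script.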

ukmo-ccbunney (Member) left a comment

This is looking fine, thanks Lewis.
I will leave this as an open PR for now, until we decide how and where we will merge it.

* origin/develop:
  Enable doxygen documentation in the cmake build system (NOAA-EMC#1281)
  Simplify MPI ifdefs in subroutine W3MPIO (NOAA-EMC#1266)
  Add depth scaling value to SMC regression tests. (NOAA-EMC#1264)
  Updates to NCEP regtests for Orion Rocky9 OS(NOAA-EMC#1263)
  Fix code stability issue in ww3_outp (NOAA-EMC#1258)
  Fix GNU regtest CI failure (NOAA-EMC#1253)
Labels
enhancement New feature or request
Projects
Status: Todo
2 participants