Fb reg gpuport #40

UKMO-lsampson · 2023-04-13T13:39:22Z

Pull Request Summary

A section (regular grid spatial propagation) of the manual GPU porting efforts at the Met Office has been brought into the WW3 operational framework for testing and understanding the compatibility and flexibility of using OpenACC parallelism.

Description

The PR includes the successfully merged port of the regular grid propagation routines, which are enabled via a new GPU switch. The switch activates the OpenACC directives that are required for performant GPU acceleration. For regular CPU compilations this has no effect: unless the code is compiled on a GPU architecture (such as Isambard) with the correct compiler flags, it will perform as standard WW3.

We have used a separate build system on Isambard so that the GPUs can be targeted and tested as required. These are currently available under the /projects/metoffice/WW3_Isambard directory structure on Isambard.

Commit Message

Integration of the regular grid propagation from manual GPU porting efforts at the Met Office into the WW3 operational framework for testing and understanding the compatibility and flexibility of using OpenACC parallelism.

Check list

Branch is up to date with the authoritative repository (ukmo-waves) develop branch.
Relevant regression tests have been run, including additional GPU regression tests run on Isambard and XCE.

Testing

How were these changes tested? ...
Are the changes covered by regression tests? CPU changes are covered; GPU switches have been added to the current regression tests.
Have the matrix regression tests been run (if yes, please note HPC and compiler)? N
Please indicate the expected changes in the regression test output. (Note the list of known non-identical tests.) No expected changes.
Please provide the summary output of matrix.comp (matrix.Diff.txt, matrixCompFull.txt and matrixCompSummary.txt): [TBC] matrix.comp provides identical output for the original regression tests, and the new regression tests produce netCDF output that is identical between the original and GPU-switch runs, according to nccmp (on XCE).

UKMO-lsampson · 2023-04-13T14:43:46Z

The performance of the GPU version of the code has not been fully analysed, as there are known areas in which we expect to see drastic improvement, e.g. more GPU-resident code (source term and intra-spectral routines) and data optimisations.

This has also only been explicitly tested with managed memory; compiling the code with explicit data transfers is more complex and will require more intrusive OpenACC directives.

Regression tests run:

./bin/run_cmake_test -o both -S -s PR2_UNO_MPI -w work_PR2_UNO_MPI -f -p mpiexec -n 16 ../model ww3_tp2.1
./bin/run_cmake_test -o both -S -s PR2_UNO_MPI -w work_PR2_UNO_MPI -f -p mpiexec -n 16 ../model ww3_tp2.2
./bin/run_cmake_test -o both -S -s PR2_UNO_MPI -w work_PR2_UNO_MPI -f -p mpiexec -n 16 ../model ww3_tp2.3
./bin/run_cmake_test -o both -S -s PR2_UNO_MPI -w work_PR2_UNO_MPI -f -p mpiexec -n 16 ../model ww3_tp2.4
./bin/run_cmake_test -o both -S -s PR2_UNO_MPI_GPU -w work_PR2_UNO_MPI_GPU -f -p mpiexec -n 16 ../model ww3_tp2.1
./bin/run_cmake_test -o both -S -s PR2_UNO_MPI_GPU -w work_PR2_UNO_MPI_GPU -f -p mpiexec -n 16 ../model ww3_tp2.2
./bin/run_cmake_test -o both -S -s PR2_UNO_MPI_GPU -w work_PR2_UNO_MPI_GPU -f -p mpiexec -n 16 ../model ww3_tp2.3
./bin/run_cmake_test -o both -S -s PR2_UNO_MPI_GPU -w work_PR2_UNO_MPI_GPU -f -p mpiexec -n 16 ../model ww3_tp2.4
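
The eight invocations above differ only in the switch name and the test name, so they can be generated with a short loop. The following is a convenience sketch rather than part of this PR: the function name `regtest_commands` is hypothetical, and it only prints the commands (a dry run) so that they can be checked before anything is executed.

```shell
# Hypothetical helper (not part of WW3): print, without executing, the
# eight regression test commands listed above, one per line.
regtest_commands() {
    for switch in PR2_UNO_MPI PR2_UNO_MPI_GPU; do
        for tp in ww3_tp2.1 ww3_tp2.2 ww3_tp2.3 ww3_tp2.4; do
            printf '%s\n' "./bin/run_cmake_test -o both -S -s ${switch} -w work_${switch} -f -p mpiexec -n 16 ../model ${tp}"
        done
    done
}

# Dry run: show the commands in the same order as the list above.
regtest_commands
```

To actually run the tests, the printed lines could be piped to `sh` from the same directory in which the commands above were issued.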

These have been run on both Isambard and XCE. We ran comparisons using matrix.comp against develop for the pre-existing regression tests, and nccmp -d comparisons (original vs GPU) for the new regression tests. All of these produced identical output, which we consider a pass for the regression tests.

The port we have included is not optimised for the performance of the regression tests; instead, the aim has been to mimic the GPU acceleration that our manual port uses. This means several sections that could be accelerated are being run sequentially, and there are some data optimisations that we have not included; however, making these functional will take further research. We have chosen specific regression tests because the regular grid propagation typically activates a very large group of tests that are not necessary for our current experiments.
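
The nccmp -d comparison of original and GPU output could be scripted roughly as follows. This is a minimal sketch under stated assumptions, not the procedure actually used for this PR: the function name `compare_nc_dirs` and the `COMPARE` override are hypothetical, and it assumes both runs write same-named `.nc` files into their respective work directories.

```shell
# Hypothetical helper (not part of WW3): compare every *.nc file in
# directory $1 with the same-named file in directory $2.
# Returns 0 if all pairs match, 1 if any differ or are missing,
# 2 if there is nothing to compare or the comparison tool is absent.
# COMPARE defaults to "nccmp -d" (compare variable data).
compare_nc_dirs() {
    ref_dir=$1
    gpu_dir=$2
    tool=${COMPARE:-nccmp -d}
    status=0
    command -v "${tool%% *}" >/dev/null 2>&1 || {
        echo "comparison tool not found: ${tool%% *}" >&2
        return 2
    }
    for ref in "$ref_dir"/*.nc; do
        # An unmatched glob stays literal, so check the file exists.
        [ -e "$ref" ] || { echo "no .nc files in $ref_dir" >&2; return 2; }
        name=$(basename "$ref")
        if [ ! -e "$gpu_dir/$name" ]; then
            echo "MISSING $name"
            status=1
        elif $tool "$ref" "$gpu_dir/$name" >/dev/null 2>&1; then
            echo "IDENTICAL $name"
        else
            echo "DIFFERS $name"
            status=1
        fi
    done
    return $status
}
```

A call might look like `compare_nc_dirs ww3_tp2.1/work_PR2_UNO_MPI ww3_tp2.1/work_PR2_UNO_MPI_GPU`, where the directory layout is an assumption based on the `-w` arguments in the commands above.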

ukmo-ccbunney

This is looking fine, thanks Lewis.
As with the SMC PR, I will leave this open for now until we decide how and where we will merge it.

* origin/develop:
  Enable doxygen documentation in the cmake build system (NOAA-EMC#1281)
  Simplify MPI ifdefs in subroutine W3MPIO (NOAA-EMC#1266)
  Add depth scaling value to SMC regression tests. (NOAA-EMC#1264)
  Updates to NCEP regtests for Orion Rocky9 OS (NOAA-EMC#1263)
  Fix code stability issue in ww3_outp (NOAA-EMC#1258)
  Fix GNU regtest CI failure (NOAA-EMC#1253)

UKMO-lsampson added 4 commits April 13, 2023 09:32

Initial regular grid GPU porting effort.

178e48d

Added switches for GPU regression tests

35455e6

Added forth regression test switch file

e455978

Updated matrix.base for regular grid propagation GPU tests

736b679

UKMO-lsampson requested a review from ukmo-ccbunney April 14, 2023 07:19

UKMO-lsampson added the enhancement label Apr 14, 2023

ukmo-ccbunney reviewed Apr 19, 2023

ukmo-ccbunney and others added 11 commits May 2, 2023 09:34

Merge branch 'NOAA-EMC:develop' into fb_reg_gpuport

29e9315

Merge branch 'NOAA-EMC:develop' into fb_reg_gpuport

d1ca545

Merge branch 'NOAA-EMC:develop' into fb_reg_gpuport

5c536c3

Merge branch 'NOAA-EMC:develop' into fb_reg_gpuport

fea89a1

Merge branch 'NOAA-EMC:develop' into fb_reg_gpuport

90b4afa

Merge branch 'NOAA-EMC:develop' into fb_reg_gpuport

9ebcd5a

Merge remote-tracking branch 'origin/develop' into fb_reg_gpuport

56fc31c

Merge branch 'develop' into fb_reg_gpuport

d4f17e9

Merge branch 'develop' into fb_reg_gpuport

7b8e1c4

Merge branch 'develop' into fb_reg_gpuport

abe1a4b
