Fb smc gpuport #39

Open: wants to merge 28 commits into base: develop

Commits (28)
918f14a
Initial preparations for GPU ported SMC grid configuration.
UKMO-lsampson Mar 23, 2023
33f9341
Testing changes to compiler and make
UKMO-lsampson Mar 24, 2023
4ff8b46
Merge branch 'develop' into fb_smc_gpuport
UKMO-lsampson Mar 24, 2023
4b5d7e0
Fixed #elif condition missing W3_GPU
UKMO-lsampson Mar 24, 2023
db2d4ab
Optimised the SMC propagation component. Producing valid model output
UKMO-lsampson Mar 30, 2023
6be6b7f
Added comments for the addition of GPU switches.
UKMO-lsampson Mar 31, 2023
b8543e2
Updating OpenACC directives to be more optimal and avoid some USE sta…
UKMO-lsampson Apr 6, 2023
3d8043f
Added SMC propagation GPU port comments to the manual.
UKMO-lsampson Apr 12, 2023
e381413
Fixed invalid pointer dereference.
UKMO-lsampson Apr 12, 2023
d6ade1b
Updated manual comments
UKMO-lsampson Apr 12, 2023
fb0ef44
Merge branch 'fb_smc_gpuport' of vld240:/home/h05/lsampson/WW3 into f…
UKMO-lsampson Apr 12, 2023
45754ab
Cleaned up miscellaneous code changes before final PR.
UKMO-lsampson Apr 12, 2023
d4879f2
Whitespace
UKMO-lsampson Apr 12, 2023
9a663ee
w3_make is no longer part of the regular build process.
UKMO-lsampson Apr 12, 2023
12b8f9f
Removal of duplicate lines
UKMO-lsampson Apr 12, 2023
d7bf35a
Added in the GPU switches for SMC regression tests.
UKMO-lsampson Apr 12, 2023
ab26d12
Corrected issue with w3initmd USE statements for ACC.
UKMO-lsampson Apr 13, 2023
3d34d7e
Merge branch 'NOAA-EMC:develop' into fb_smc_gpuport
ukmo-ccbunney May 2, 2023
cce026b
Merge branch 'NOAA-EMC:develop' into fb_smc_gpuport
ukmo-ccbunney May 11, 2023
e0dc022
Merge branch 'NOAA-EMC:develop' into fb_smc_gpuport
ukmo-ccbunney Jun 16, 2023
40169af
Merge branch 'NOAA-EMC:develop' into fb_smc_gpuport
ukmo-ccbunney Jul 28, 2023
268128b
Merge branch 'NOAA-EMC:develop' into fb_smc_gpuport
ukmo-ccbunney Nov 10, 2023
dd27155
Merge branch 'NOAA-EMC:develop' into fb_smc_gpuport
ukmo-ccbunney Nov 29, 2023
5f26f71
Merge remote-tracking branch 'origin/develop' into fb_smc_gpuport
ukmo-ccbunney Jan 5, 2024
2296559
Merge branch 'develop' into fb_smc_gpuport
ukmo-ccbunney Apr 12, 2024
0278793
Merge branch 'develop' into fb_smc_gpuport
ukmo-ccbunney May 13, 2024
0ec57ae
Merge branch 'develop' into fb_smc_gpuport
ukmo-ccbunney Jun 18, 2024
5519a58
Merge remote-tracking branch 'origin/develop' into fb_smc_gpuport
ukmo-ccbunney Sep 4, 2024
29 changes: 29 additions & 0 deletions manual/num/space_SMC.tex
@@ -196,6 +196,35 @@ \subsubsection{~Spherical Multiple-Cell (SMC) grid} \label{sub:num_space_SMC}
combined hybrid and multi-grid parallelization may extend the computer usage
to over 100 nodes for the 3 Great Lake sub-grids in \emph{mww3\_test\_09}.

Following ongoing manual porting efforts at the Met Office, a \emph{GPU} switch
has been added that enables an initial OpenACC parallelisation of the SMC grid
propagation. The switch adapts the w3psmcmd.F90 module and the calls into it so
that they can target a GPU for acceleration. So far it has mainly been used,
successfully, with the nvfortran compiler, built on Isambard using the
\emph{cmake\_build\_isambard.sh} script with Met Office specifications. Contact
Chris Bunney (\url{[email protected]}) for details of the Isambard
implementation.

For optimal performance on the GPU, the switch controls a range of changes to
function calls, array declarations and nested subroutine calls. Because the GPU
deallocates arrays once they leave scope, local arrays are hoisted to the
module scope of w3wavemd.F90 so that they remain resident on the GPU for
longer. OpenACC data transfers and parallelism specifications are applied
implicitly where possible. For ease of application, the SMC propagation
subroutines are inlined so that these implicit optimisations are applied
correctly and model output remains valid without more intrusive code changes.

The current implementation of the GPU switch has some limitations. It only
targets multi-resolution grids and has not yet been adapted to the case of
\emph{NRLv .EQ. 1}. This extension would not be difficult, but it requires
further inlining and similar changes to the code that have not yet been
properly tested. It is also worth noting that the current implementation does
not reach its full performance potential, because most of the code still runs
on the CPU and only a small section has been ported to the GPU. We view the
progress so far as a proof of concept and an initial step towards determining
the best way to integrate GPU acceleration into the existing parallelisation
options.

It is recommended to read the smc\_docs/SMC\_Grid\_Guide.pdf or the
conference paper \citep{tol:LiS17} at conference web page:
http://www.waveworkshop.org/15thWaves/
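
The manual text in this hunk says local arrays are hoisted to module scope so that they stay resident on the GPU between calls. A minimal sketch of that general OpenACC pattern follows. Every name in it (`hoist_demo`, `work`, `demo_init`, `demo_step`) is hypothetical and chosen for illustration; it is not code from this PR or from WW3, and it assumes a compiler with OpenACC support such as nvfortran. Without OpenACC enabled, the directives are treated as comments and the program runs on the CPU.

```fortran
! Hypothetical module and names, for illustration only; not taken from WW3.
MODULE hoist_demo
  IMPLICIT NONE
  ! Work array held at module scope instead of as a local variable of
  ! demo_step, so that its device copy can outlive a single call.
  REAL, ALLOCATABLE :: work(:)
CONTAINS
  SUBROUTINE demo_init(n)
    INTEGER, INTENT(IN) :: n
    ALLOCATE(work(n))
    work = 0.0
    ! Create the device copy once; it stays resident until EXIT DATA.
    !$ACC ENTER DATA COPYIN(work)
  END SUBROUTINE demo_init

  SUBROUTINE demo_step(n, dt)
    INTEGER, INTENT(IN) :: n
    REAL, INTENT(IN)    :: dt
    INTEGER :: i
    ! PRESENT states that work is already on the device: no transfer here.
    !$ACC PARALLEL LOOP PRESENT(work)
    DO i = 1, n
      work(i) = work(i) + dt
    END DO
  END SUBROUTINE demo_step
END MODULE hoist_demo

PROGRAM hoist_main
  USE hoist_demo
  IMPLICIT NONE
  INTEGER :: k
  CALL demo_init(8)
  DO k = 1, 3
    CALL demo_step(8, 0.5)   ! repeated calls reuse the resident array
  END DO
  ! Copy the result back to the host before reading it there.
  !$ACC UPDATE HOST(work)
  PRINT *, work(1)
  ! Release the device copy, then the host allocation.
  !$ACC EXIT DATA DELETE(work)
  DEALLOCATE(work)
END PROGRAM hoist_main
```

Had `work` been a local array of `demo_step`, it would need to be created on the device, and possibly transferred, on every call. Keeping it at module scope moves that cost to initialisation.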
1 change: 1 addition & 0 deletions model/bin/switch_UKMO_GPU
@@ -0,0 +1 @@
SHRD SMC UNO PR2 RTD FLX0 LN1 ST0 NL1 BT1 IC0 IS0 REF0 DB1 TR0 BS0 WNT1 WNX1 CRT1 CRX1 RWND NOGRB GPU
32 changes: 32 additions & 0 deletions model/src/w3initmd.F90
@@ -447,6 +447,21 @@ SUBROUTINE W3INIT ( IMOD, IsMulti, FEXT, MDS, MTRACE, ODAT, FLGRD, FLGR2, FLGD,
#endif
#ifdef W3_UOST
USE W3UOSTMD, ONLY: UOST_SETGRID
#endif
#ifdef W3_GPU
USE W3GDATMD, ONLY: NK, NTH, DTH, XFR, ESIN, ECOS, SIG, NX, NY, &
NSEA, SX, SY, MAPSF, FUNO3, FVERG, &
IJKCel, IJKUFc, IJKVFc, NCel, NUFc, NVFc, &
IJKCel3, IJKCel4, &
IJKVFc5, IJKVFc6,IJKUFc5,IJKUFc6, &
NLvCel, NLvUFc, NLvVFc, NRLv, MRFct, &
DTCFL, CLATS, DTMS, CTRNX, CTRNY
USE W3GDATMD, ONLY: NGLO, ANGARC, ARCTC, CLATF
USE W3ADATMD, ONLY: CG, WN, U10, CX, CY, ATRNX, ATRNY, ITIME
!
USE W3IDATMD, ONLY: FLCUR
USE W3ODATMD, ONLY: NDSE, NDST, FLBPI, NBI, TBPI0, TBPIN, &
ISBPI, BBPI0, BBPIN
#endif
!/
#ifdef W3_MPI
@@ -1515,6 +1530,23 @@ SUBROUTINE W3INIT ( IMOD, IsMulti, FEXT, MDS, MTRACE, ODAT, FLGRD, FLGR2, FLGD,
#endif
!
! 8. Final MPI set up ----------------------------------------------- /
#ifdef W3_GPU
!/LS SMC Grid - GDAT
!$ACC ENTER DATA COPYIN(NK, NTH, DTH, XFR, ESIN, ECOS, SIG, NX, NY) &
!$ACC COPYIN(NSEA, SX, SY, MAPSF, FUNO3, FVERG, IJKCel ) &
!$ACC COPYIN(IJKUFc, IJKVFc, NCel, NUFc, NVFc, IJKCel3 ) &
!$ACC COPYIN(IJKCel4, IJKVFc5, IJKVFc6,IJKUFc5,IJKUFc6 ) &
!$ACC COPYIN(NLvCel, NLvUFc, NLvVFc, NRLv, MRFct, DTCFL) &
!$ACC COPYIN(CLATS, DTMS, CTRNX, CTRNY, NGLO, ANGARC ) &
!$ACC COPYIN(ARCTC, CLATF)
!/LS SMC Grid - ADAT
!$ACC ENTER DATA COPYIN(CG, WN, U10, CX, CY, ATRNX, ATRNY, ITIME)
!/LS SMC Grid - IDAT
!$ACC ENTER DATA COPYIN(FLCUR)
!/LS SMC Grid - ODAT
!$ACC ENTER DATA COPYIN(NDSE, NDST, FLBPI, NBI, TBPI0, TBPIN)&
!$ACC COPYIN(ISBPI, BBPI0, BBPIN)
#endif
!
#ifdef W3_MPI
CALL W3MPII ( IMOD )
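
The `ENTER DATA COPYIN` directives in this hunk copy host values to the device at the point where the directive executes. As a general OpenACC note, and not a statement about how this PR handles any particular field, a device copy made this way does not follow later host-side changes unless it is refreshed explicitly. The sketch below illustrates this with hypothetical names (`cx_demo`, `out_demo`). It also uses the same continuation style as the diff, with a trailing `&` followed by a new `!$ACC` sentinel. In the PR the directives are wrapped in `#ifdef W3_GPU`; the guard is omitted here to keep the example self-contained.

```fortran
! Hypothetical names throughout; a generic OpenACC sketch, not WW3 code.
PROGRAM enter_data_demo
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 4
  REAL, ALLOCATABLE  :: cx_demo(:), out_demo(:)
  INTEGER :: i

  ALLOCATE(cx_demo(n), out_demo(n))
  cx_demo  = 1.0
  out_demo = 0.0

  ! One directive continued over two lines: '&' ends the first line and
  ! the continuation line starts with the !$ACC sentinel again.
  !$ACC ENTER DATA COPYIN(cx_demo) &
  !$ACC            CREATE(out_demo)

  ! The host value changes after the initial copy...
  cx_demo = 2.0
  ! ...so the device copy is refreshed before it is used.
  !$ACC UPDATE DEVICE(cx_demo)

  !$ACC PARALLEL LOOP PRESENT(cx_demo, out_demo)
  DO i = 1, n
    out_demo(i) = 0.5 * cx_demo(i)
  END DO

  ! Bring the result back, then release the device copies.
  !$ACC UPDATE HOST(out_demo)
  PRINT *, out_demo(1)
  !$ACC EXIT DATA DELETE(cx_demo, out_demo)
  DEALLOCATE(cx_demo, out_demo)
END PROGRAM enter_data_demo
```

Without the `UPDATE DEVICE` line, a build with OpenACC enabled would compute from the value copied at `ENTER DATA` (1.0) and not from the later host value (2.0).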