A Singularity container of the LFRic software stack built with an included Intel oneAPI compiler.
It is based on Fedora and includes all of the software package dependencies and tools in the standard LFRic build environment, but compiled with the Intel Fortran compiler rather than gfortran and gcc.
A compiler is not required on the build and run machine where the container is deployed; all compilation of LFRic is done with the containerised compilers.
LFRic components are built using a shell within the container and the shell automatically sets up the build environment when invoked.
The LFRic source code is not containerised; it is retrieved as usual via Subversion from within the container shell, so there is no need to rebuild the container when the LFRic code is updated.
The container is compatible with Slurm, and the compiled executable can be run in batch using the local MPI libraries, provided the host system has an MPICH ABI compatible MPI.
A pre-built container is available from Sylabs Cloud.
lfric_env.def is the Singularity definition file.
archer2_lfric.sub is an example ARCHER2 submission script and dirac_lfric.sub an example DiRAC HPC submission script.
- Singularity 3.0+ (3.7 preferred)
- Access to the Met Office Science Repository Service (MOSRS)
- sudo access if changes to the container are required. This can be on a different system from the LFRic build and run machine.
- An MPICH compatible MPI on the deployment system for use of the local MPI libraries.
either:
- (Recommended) Download the latest version of the Singularity container from Sylabs Cloud Library.
singularity pull [--disable-cache] lfric_env.sif library://simonwncas/default/lfric_env:latest
Note: --disable-cache is required if using ARCHER2.
or:
- Build the container using lfric_env.def.
sudo singularity build lfric_env.sif lfric_env.def
Note: sudo access is required.
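Either route should leave an lfric_env.sif image in the working directory. As a quick sanity check (a minimal sketch; it assumes the containerised Intel compiler is on the default PATH set up by the image), the image metadata and compiler version can be inspected:
singularity inspect lfric_env.sif
singularity exec lfric_env.sif ifort --version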
One time only: edit (or create) ~/.subversion/servers and add the following:
[groups]
metofficesharedrepos = code*.metoffice.gov.uk
[metofficesharedrepos]
# Specify your Science Repository Service user name here
username = myusername
store-plaintext-passwords = no
Remember to replace myusername with your MOSRS username.
On the deployment machine:
singularity shell lfric_env.sif
Now, using the shell inside the container:
. mosrs-setup-gpg-agent
and enter your password when instructed. You may be asked twice.
fcm co https://code.metoffice.gov.uk/svn/lfric/LFRic/trunk trunk
fcm co https://code.metoffice.gov.uk/svn/lfric/GPL-utilities/trunk rose-picker
Due to licensing concerns, the rose-picker part of the LFRic configuration system is held as a separate project.
export ROSE_PICKER=$PWD/rose-picker
export PYTHONPATH=$ROSE_PICKER/lib/python:$PYTHONPATH
export PATH=$ROSE_PICKER/bin:$PATH
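These exports only last for the current container shell. To avoid retyping them each time a new shell is started, they can be placed in a small helper script and sourced (a minimal sketch; the name setup_rose_picker.sh is illustrative and it assumes rose-picker was checked out into the current directory):
# setup_rose_picker.sh -- source this from the directory containing the rose-picker checkout
export ROSE_PICKER=$PWD/rose-picker
export PYTHONPATH=$ROSE_PICKER/lib/python:$PYTHONPATH
export PATH=$ROSE_PICKER/bin:$PATH
Then, from within the container shell:
. setup_rose_picker.sh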
There are issues with MPICH, the Intel compiler and the Fortran 2008 C bindings; please see https://code.metoffice.gov.uk/trac/lfric/ticket/2273 for more information. Edit ifort.mk:
vi trunk/infrastructure/build/fortran/ifort.mk
and change the line
FFLAGS_WARNINGS = -warn all -warn errors
to
FFLAGS_WARNINGS = -warn all,noexternal -warn errors
Note: nano is also available in the container environment.
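If preferred, the same edit can be made non-interactively (a minimal sketch; it assumes the trunk working copy layout used above and that the flags appear exactly as shown):
sed -i 's/-warn all -warn errors/-warn all,noexternal -warn errors/' trunk/infrastructure/build/fortran/ifort.mk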
cd trunk/gungho
make build [-j nproc]
cd trunk/lfric_atm
make build [-j nproc]
The executables are built using the Intel compiler and associated software stack within the container and written to the local filesystem.
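For example, after both builds the binaries should be present under each component's bin directory (a quick check, run from the directory containing the trunk working copy):
ls trunk/gungho/bin trunk/lfric_atm/bin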
cd example
mpiexec -np 6 ../bin/gungho configuration.nml
Note: This uses the MPI runtime libraries built into the container.
cd example
../bin/lfric_atm configuration.nml
Note: This is the "single column" test version of lfric_atm. Running a global model requires the use of rose-stem
which is currently beyond the scope of this document.
If the host machine has an MPICH-based MPI (MPICH, Intel MPI, Cray MPT, MVAPICH2), then the MPICH ABI can be used to access the local MPI, and therefore the fast interconnects, when running the executable via the container. See the section below for full details. Two example Slurm gungho submission scripts are provided, one for ARCHER2 and one for DiRAC.
Note: These scripts are submitted on the command line as usual and not from within the container.
This approach is a variation on the Singularity MPI Bind model. The compiled model executable is run within the container with suitable options to allow access to the local MPI installation. At runtime the executable uses the containerised libraries, apart from the MPI libraries, which come from the local installation. OpenMPI will not work with this method.
Note: this only applies when a model is run; the executable is compiled using the method above, without any reference to local libraries.
An MPICH ABI compatible MPI is required. These have MPI libraries named libmpifort.so.12 and libmpi.so.12. The location of these libraries varies from system to system. When logged directly onto the system, which mpif90 should show where the MPI binaries are located, and the MPI libraries will be in a ../lib directory relative to this. On Cray systems the cray-mpich-abi libraries are needed, which are in /opt/cray/pe/mpich/8.0.16/ofi/gnu/9.1/lib-abi-mpich or similar.
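On a system where the MPI compiler wrappers are on the PATH, something like the following can confirm that ABI compatible libraries are present (a sketch only; library locations vary between systems):
# Locate the MPI installation from the mpif90 wrapper and look for the ABI libraries
MPI_LIB=$(dirname "$(which mpif90)")/../lib
ls $MPI_LIB/libmpi.so.12 $MPI_LIB/libmpifort.so.12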
The local MPI libraries need to be made available to the container. Bind points are required so that containerised processes can access the local directories which contain the MPI libraries. The LD_LIBRARY_PATH inside the container also needs updating to reflect the path to the local libraries. This method has been tested with Slurm, but should work with other job control systems.
For example, assuming the system MPI libraries are in /opt/mpich/lib, set the bind directory with
export BIND_OPT="-B /opt/mpich"
then for Singularity versions <3.7
export SINGULARITYENV_LOCAL_LD_LIBRARY_PATH=/opt/mpich/lib
for Singularity v3.7 and over
export LOCAL_LD_LIBRARY_PATH="/opt/mpich/lib:\$LD_LIBRARY_PATH"
The entries in BIND_OPT are comma separated, while those in [SINGULARITYENV_]LOCAL_LD_LIBRARY_PATH are colon separated.
For Singularity versions <3.7, the command to run gungho within MPI is now
singularity exec $BIND_OPT <sif-dir>/lfric_env.sif ../bin/gungho configuration.nml
for Singularity v3.7 and over
singularity exec $BIND_OPT --env=LD_LIBRARY_PATH=$LOCAL_LD_LIBRARY_PATH <sif-dir>/lfric_env.sif ../bin/gungho configuration.nml
Running with mpirun/Slurm is straightforward; just use the standard command for running MPI jobs, e.g.:
mpirun -n <NUMBER_OF_RANKS> singularity exec $BIND_OPT lfric_env.sif ../bin/gungho configuration.nml
or, on ARCHER2:
srun --cpu-bind=cores singularity exec $BIND_OPT lfric_env.sif ../bin/gungho configuration.nml
If running with Slurm, /var/spool/slurmd should be appended to BIND_OPT, separated with a comma.
It is possible that the local MPI libraries have other dependencies which are in other system directories. In this case BIND_OPT and [SINGULARITYENV_]LOCAL_LD_LIBRARY_PATH have to be updated to reflect these. For example, on ARCHER2 these are
export BIND_OPT="-B /opt/cray,/usr/lib64:/usr/lib/host,/var/spool/slurmd"
and
export SINGULARITYENV_LOCAL_LD_LIBRARY_PATH=/opt/cray/pe/mpich/8.0.16/ofi/gnu/9.1/lib-abi-mpich:/opt/cray/libfabric/1.11.0.0.233/lib64:/opt/cray/pe/pmi/6.0.7/lib
Discovering the missing dependencies is a process of trial and error: the executable is run via the container, and any missing libraries will cause an error and be reported. A suitable bind point and library path is then added to the above environment variables, and the process repeated.
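One way to shorten this iteration (a sketch, not part of the provided scripts) is to run ldd on the executable from inside the container and look for unresolved libraries, here using the pre-3.7 environment variable set as above:
singularity exec $BIND_OPT <sif-dir>/lfric_env.sif ldd ../bin/gungho | grep "not found"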
/usr/lib/host is at the end of LD_LIBRARY_PATH in the container, so this bind point can be used to provide any remaining system library dependencies in standard locations. In the above example there are extra dependencies in /usr/lib64, so /usr/lib64:/usr/lib/host in BIND_OPT mounts that directory as /usr/lib/host inside the container, and /usr/lib64 is therefore effectively appended to the container's LD_LIBRARY_PATH.
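Putting the ARCHER2 settings above together, a batch script along the lines of the provided archer2_lfric.sub might look like the following. This is a minimal sketch only: the budget, partition and QOS values are placeholders, <sif-dir> should point at the image, and the real script in this repository should be preferred.
#!/bin/bash
#SBATCH --job-name=gungho
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=6
#SBATCH --time=00:20:00
#SBATCH --account=<budget>       # placeholder ARCHER2 budget code
#SBATCH --partition=standard     # placeholder partition
#SBATCH --qos=standard           # placeholder QOS

# Bind the Cray MPICH ABI, system library and slurm directories into the container
export BIND_OPT="-B /opt/cray,/usr/lib64:/usr/lib/host,/var/spool/slurmd"
export SINGULARITYENV_LOCAL_LD_LIBRARY_PATH=/opt/cray/pe/mpich/8.0.16/ofi/gnu/9.1/lib-abi-mpich:/opt/cray/libfabric/1.11.0.0.233/lib64:/opt/cray/pe/pmi/6.0.7/lib

cd example
srun --cpu-bind=cores singularity exec $BIND_OPT <sif-dir>/lfric_env.sif ../bin/gungho configuration.nml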