Merge pull request #317 from mbareford/mbareford/mbareford-e1000-update
E-1000 updates
xguo-epcc authored Mar 11, 2024
2 parents 5b85ff1 + f9d00af commit 2fd9e6f
Showing 6 changed files with 78 additions and 120 deletions.
26 changes: 13 additions & 13 deletions docs/software-tools/ddt.md
@@ -1,13 +1,13 @@
# Debugging using Arm DDT
# Debugging using Linaro DDT

The Arm Forge tool suite is installed on Cirrus. This includes DDT,
The Linaro Forge tool suite is installed on Cirrus. This includes DDT,
which is a debugging tool for scalar, multi-threaded and large-scale
parallel applications. To compile your code for debugging you will
usually want to specify the `-O0` option to turn off all code
optimisation (as this can produce a mismatch between source code line
numbers and debugging information) and `-g` to include debugging
information in the compiled executable. To use this package you will
need to log in to Cirrus with X11-forwarding enabled, load the Arm Forge
need to log in to Cirrus with X11-forwarding enabled, load the Linaro Forge
module and execute `forge`:

module load forge
@@ -37,7 +37,7 @@ tick the *MPI* box -- when running on the compute nodes, you must set
the MPI implementation to *Slurm (generic)*. You must also tick the
*Submit to Queue* box. Clicking the *Configure* button in this section,
you must now choose the submission template. One is provided for you at
`/mnt/lustre/indy2lfs/sw/arm/forge/latest/templates/cirrus.qtf` which
`/work/y07/shared/cirrus-software/forge/latest/templates/cirrus.qtf` which
you should copy and modify to suit your needs. You will need to load any
modules required for your code and perform any other necessary setup,
such as providing extra sbatch options, i.e., whatever is needed for
@@ -47,7 +47,7 @@ your code to run in a normal batch job.
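For example, you might take a copy of the supplied template before editing it (the destination filename here is only an illustration):

    cp /work/y07/shared/cirrus-software/forge/latest/templates/cirrus.qtf ~/cirrus-custom.qtf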

!!! Note

The current Arm Forge licence permits use on the Cirrus CPU nodes only.
The current Linaro Forge licence permits use on the Cirrus CPU nodes only.
The licence does not permit use of DDT/MAP for codes that run on the
Cirrus GPUs.

@@ -77,18 +77,18 @@ libraries should be found without further arguments.

## Remote Client

Arm Forge can connect to remote systems using SSH so you can run the
Linaro Forge can connect to remote systems using SSH so you can run the
user interface on your desktop or laptop machine without the need for X
forwarding. Native remote clients are available for Windows, macOS and
Linux. You can download the remote clients from the [Arm
website](https://developer.arm.com/downloads/-/arm-forge). No licence
Linux. You can download the remote clients from the [Linaro Forge
website](https://www.linaroforge.com/downloadForge/). No licence
file is required by a remote client.



!!! Note

The same versions of Arm Forge must be installed on the local and remote
The same versions of Linaro Forge must be installed on the local and remote
systems in order to use DDT remotely.


@@ -98,7 +98,7 @@ click on the *Remote Launch* drop-down box and click on *Configure*. In
the new window, click *Add* to create a new login profile. For the
hostname you should provide `[email protected]` where
`username` is your login username. For *Remote Installation Directory*
enter `/mnt/lustre/indy2lfs/sw/arm/forge/latest`. To ensure your SSH
enter `/work/y07/shared/cirrus-software/forge/latest`. To ensure your SSH
private key can be used to connect, the SSH agent on your local machine
should be configured to provide it. You can ensure this by running
`ssh-add ~/.ssh/id_rsa_cirrus` before using the Forge client where you
@@ -127,11 +127,11 @@ usual login password the connection to Cirrus will be established and
you will be able to start debugging.

You can find more detailed information
[here](https://developer.arm.com/documentation/101136/2011/Arm-Forge/Connecting-to-a-remote-system).
[here](https://docs.linaroforge.com/23.1.1/html/forge/forge/connecting_to_a_remote_system/connecting_remotely.html).

## Getting further help on DDT

- [DDT
website](https://www.arm.com/products/development-tools/server-and-hpc/forge/ddt)
website](https://www.linaroforge.com/linaroDdt/)
- [DDT user
guide](https://developer.arm.com/documentation/101136/22-1-3/DDT?lang=en)
guide](https://docs.linaroforge.com/23.1.1/html/forge/ddt/index.html)
95 changes: 28 additions & 67 deletions docs/user-guide/development.md
@@ -42,10 +42,10 @@ You can list all the modules of a particular type by providing an
argument to the `module avail` command. For example, to list all
available versions of the Intel Compiler type:

[user@cirrus-login0 ~]$ module avail intel-compilers
[user@cirrus-login0 ~]$ module avail intel-*/compilers

--------------------------------- /mnt/lustre/indy2lfs/sw/modulefiles --------------------------------
intel-compilers-18/18.05.274 intel-compilers-19/19.0.0.117
--------------------------------- /work/y07/shared/cirrus-modulefiles --------------------------------
intel-19.5/compilers intel-20.4/compilers

If you want more info on any of the modules, you can use the
`module help` command:
@@ -66,46 +66,37 @@ their versions you have presently loaded in your environment, e.g.:

[user@cirrus-login0 ~]$ module list
Currently Loaded Modulefiles:
1) git/2.35.1(default) 6) gcc/8.2.0(default)
2) singularity/3.7.2(default) 7) intel-cc-18/18.0.5.274
3) epcc/utils 8) intel-fc-18/18.0.5.274
4) /mnt/lustre/indy2lfs/sw/modulefiles/epcc/setup-env 9) intel-compilers-18/18.05.274
5) intel-license 10) mpt/2.25
1) git/2.35.1(default)
2) epcc/utils
3) /mnt/lustre/e1000/home/y07/shared/cirrus-modulefiles/epcc/setup-env

### Loading, unloading and swapping modules

To load a module, use `module add` or `module load`. For example, to
load the intel-compilers-18 into the development environment:
load the Intel 20.4 compilers into the development environment:

module load intel-compilers-18
module load intel-20.4/compilers

This will load the default version of the intel compilers. If you need a
specific version of the module, you can add more information:

module load intel-compilers-18/18.0.5.274

will load version 18.0.2.274 for you, regardless of the default.
This will load the default version of the Intel compilers.

If a module loading file cannot be accessed within 10 seconds, a warning
message will appear: `Warning: Module system not loaded`.

If you want to clean up, `module remove` will remove a loaded module:

module remove intel-compilers-18
module remove intel-20.4/compilers

(or `module rm intel-compilers-18` or
`module unload intel-compilers-18`) will unload what ever version of
intel-compilers-18 (even if it is not the default) you might have
loaded. There are many situations in which you might want to change the
You could also run `module rm intel-20.4/compilers` or `module unload intel-20.4/compilers`.
There are many situations in which you might want to change the
presently loaded version to a different one, such as trying the latest
version which is not yet the default or using a legacy version to keep
compatibility with old data. This can be achieved most easily by using
"module swap oldmodule newmodule".

Suppose you have loaded version 18 of the Intel compilers; the following
command will change to version 19:
Suppose you have loaded version 19 of the Intel compilers; the following
command will change to version 20:

module swap intel-compilers-18 intel-compilers-19
module swap intel-19.5/compilers intel-20.4/compilers

## Available Compiler Suites

@@ -119,11 +110,11 @@ command will change to version 19:

### Intel Compiler Suite

The Intel compiler suite is accessed by loading the `intel-compilers-*`
and `intel-*/compilers` modules, where `*` references the version. For
example, to load the 2019 release, you would run:
The Intel compiler suite is accessed by loading the `intel-*/compilers`
module, where `*` references the version. For example, to load the v20
release, you would run:

module load intel-compilers-19
module load intel-20.4/compilers

Once you have loaded the module, the compilers are available as:

@@ -137,10 +128,10 @@ compiler versions and tools.
### GCC Compiler Suite

The GCC compiler suite is accessed by loading the `gcc/*` modules, where
`*` again is the version. For example, to load version 8.2.0 you would
`*` again is the version. For example, to load version 10.2.0 you would
run:

module load gcc/8.2.0
module load gcc/10.2.0

Once you have loaded the module, the compilers are available as:

@@ -193,9 +184,9 @@ use to compile your code.
#### Using Intel Compilers and HPE MPT

Once you have loaded the MPT module you should next load the Intel
compilers module you intend to use (e.g. `intel-compilers-19`):
compilers module you intend to use (e.g. `intel-20.4/compilers`):

module load intel-compilers-19
module load intel-20.4/compilers

The compiler wrappers are then available as

@@ -243,9 +234,9 @@ Compilers are then available as
Although HPE MPT remains the default MPI library and we recommend that
first attempts at building code follow that route, you may also choose
to use Intel MPI if you wish. To use these, load the appropriate
`intel-mpi` module, for example `intel-mpi-19`:
MPI module, for example `intel-20.4/mpi`:

module load intel-mpi-19
module load intel-20.4/mpi

Please note that the name of the wrappers to use when compiling with
Intel MPI depends on whether you are using the Intel compilers or GCC.
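As an illustrative sketch only (the wrapper names shown are the standard Intel MPI ones and the source files are hypothetical):

    # With the Intel compilers loaded
    mpiicc -o my_prog my_prog.c        # C
    mpiifort -o my_prog my_prog.f90    # Fortran

    # With GCC loaded
    mpicc -o my_prog my_prog.c         # C
    mpif90 -o my_prog my_prog.f90      # Fortran
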
@@ -260,40 +251,12 @@ building software.





!!! Note


Using Intel MPI 18 can cause warnings in your output similar to
`no hfi units are available` or
`The /dev/hfi1_0 device failed to appear`. These warnings can be safely
ignored, or, if you would prefer to prevent them, you may add the line

export I_MPI_FABRICS=shm:ofa

to your job scripts after loading the Intel MPI 18 module.





!!! Note



When using Intel MPI 18, you should always launch MPI tasks with `srun`,
the supported method on Cirrus. Launches with `mpirun` or `mpiexec` will
likely fail.



#### Using Intel Compilers and Intel MPI

After first loading Intel MPI, you should next load the appropriate
`intel-compilers` module (e.g. `intel-compilers-19`):
Intel compilers module (e.g. `intel-20.4/compilers`):

module load intel-compilers-19
module load intel-20.4/compilers

You may then use the following MPI compiler wrappers:

@@ -328,7 +291,7 @@ specific CUDA version, one that is fully compatible with the underlying
NVIDIA GPU device driver. See the link below for an example of how an OpenMPI
build is configured.

[Build instructions for OpenMPI 4.1.5 on Cirrus](https://github.com/hpc-uk/build-instructions/blob/main/libs/openmpi/build_openmpi_4.1.5_cirrus_gcc8.md)
[Build instructions for OpenMPI 4.1.6 on Cirrus](https://github.com/hpc-uk/build-instructions/blob/main/libs/openmpi/build_openmpi_4.1.6_cirrus_gcc10.md)

All this means we can build OpenMPI such that it supports direct GPU-to-GPU communications
using the NVLink intra-node GPU comm links (and inter-node GPU comms are direct to Infiniband
@@ -442,8 +405,6 @@ A full list is available via `module avail intel`.

The different available compiler versions are:

- `intel-*/18.0.5.274` Intel 2018 Update 4
- `intel-*/19.0.0.117` Intel 2019 Initial release
- `intel-19.5/*` Intel 2019 Update 5
- `intel-20.4/*` Intel 2020 Update 4

26 changes: 12 additions & 14 deletions docs/user-guide/gpu.md
@@ -44,10 +44,8 @@ are particular reasons to use earlier versions. The default version is
therefore the latest module version present on the system.

Each release of the NVIDIA HPC SDK may include several different
versions of the CUDA toolchain. For example, the `nvidia/nvhpc/21.2`
module comes with CUDA 10.2, 11.0 and 11.2. Only one of these CUDA
toolchains can be active at any one time and for `nvhpc/22.11` this is
CUDA 11.8.
versions of the CUDA toolchain. Only one of these CUDA toolchains
can be active at any one time and for `nvhpc/22.11` this is CUDA 11.8.
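For example, to load that release you would run something like the following; confirm the exact module name with `module avail nvidia`:

    module load nvidia/nvhpc/22.11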

Here is a list of available HPC SDK versions, and the corresponding
version of CUDA:
@@ -88,11 +86,11 @@ Compile your source code in the usual way.

#### Using CUDA with Intel compilers

You can load either the Intel 18 or Intel 19 compilers to use with
You can load either the Intel 19 or Intel 20 compilers to use with
`nvcc`.

module unload gcc
module load intel-compilers-19
module load intel-20.4/compilers

You can now use `nvcc -ccbin icpc` to compile your source code with the
Intel C++ compiler `icpc`.
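For example, to build a hypothetical source file `saxpy.cu` with `icpc` as the host compiler:

    nvcc -ccbin icpc -o saxpy saxpy.cu
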
@@ -401,8 +399,8 @@ via the links below.

<https://docs.nvidia.com/nsight-systems/UserGuide/index.html>

If your code was compiled with the tools provided by `nvidia/nvhpc/21.2`
you should download and install Nsight Systems v2020.5.1.85.
If your code was compiled with the tools provided by `nvidia/nvhpc/22.2`
you should download and install Nsight Systems v2023.4.1.97.

### Using Nsight Compute

@@ -437,10 +435,10 @@ Consult the NVIDIA documentation for further details.

<https://developer.nvidia.com/nsight-compute>

<https://docs.nvidia.com/nsight-compute/2021.2/index.html>
<https://docs.nvidia.com/nsight-compute/2023.3/index.html>

Nsight Compute v2021.3.1.0 has been found to work for codes compiled
using `nvhpc` versions 21.2 and 21.9.
Nsight Compute v2023.3.1.0 has been found to work for codes compiled
using `nvhpc` versions 22.2 and 22.11.


## Monitoring the GPU Power Usage
Expand Down Expand Up @@ -526,8 +524,8 @@ bandwidth.
Versions of OpenMPI with both CUDA-aware MPI support and SLURM support
are available; you should load the following modules:

module load openmpi/4.1.4-cuda-11.8
module load nvidia/nvhpc-nompi/22.11
module load openmpi/4.1.6-cuda-11.6
module load nvidia/nvhpc-nompi/22.2

The command you use to compile depends on whether you are compiling
C/C++ or Fortran.
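A minimal sketch, assuming the usual OpenMPI wrapper names and hypothetical source files; your code may need additional include or link options:

    mpicc  -o my_app my_app.c     # C
    mpicxx -o my_app my_app.cpp   # C++
    mpif90 -o my_app my_app.f90   # Fortran
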
@@ -563,7 +561,7 @@ A batch script to use such an executable might be:
#SBATCH --gres=gpu:4

# Load the appropriate modules, e.g.,
module load openmpi/4.1.4-cuda-11.8
module load openmpi/4.1.6-cuda-11.6
module load nvidia/nvhpc-nompi/22.2

export OMP_NUM_THREADS=1