Cornelis Omni Path Express build instructions

Thomas Huber edited this page Jul 10, 2024 · 3 revisions

Building and Installing Sandia-OpenSHMEM (SOS) on Cornelis Omni-Path Express Fabric:

OFI libfabric

Make sure to build OFI libfabric with the OPX provider enabled:

$ ./configure --prefix=<libfabric_install_dir> --enable-opx --disable-psm2 --disable-sockets --disable-verbs --disable-usnic --disable-udp --disable-rxm --disable-rxd

where <libfabric_install_dir> is an appropriate installation path. The --disable-* flags are optional, but may help ensure that SOS chooses the OPX provider.
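After building and installing libfabric, it can be worth confirming that the OPX provider is actually present before moving on to SOS. The following is a minimal sketch, not part of the official instructions. It assumes the `fi_info` utility was installed under `<libfabric_install_dir>/bin` and that it exits non-zero when no matching provider is found. `LIBFABRIC_DIR` and `has_opx_provider` are hypothetical names used only for illustration.

```shell
# Hypothetical helper: succeeds only if the given fi_info binary reports the opx provider.
has_opx_provider() {
    fi_info_bin="$1"
    [ -x "$fi_info_bin" ] || return 2      # no fi_info executable at this path
    "$fi_info_bin" -p opx >/dev/null 2>&1  # -p restricts the query to one provider
}

# Assumed stand-in for <libfabric_install_dir>; override it in your environment.
LIBFABRIC_DIR="${LIBFABRIC_DIR:-/opt/libfabric}"

if has_opx_provider "$LIBFABRIC_DIR/bin/fi_info"; then
    echo "OPX provider available"
else
    echo "OPX provider not found (check the libfabric build and install path)"
fi
```

If the check fails on a node that has OPX hardware, re-running configure with --enable-opx and inspecting its summary output is a reasonable first step.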

Building SOS to use the OPX libfabric provider

Configure SOS with the following options to build it against the OPX libfabric provider:

$ ./configure --enable-pmi-simple --enable-av-map --enable-completion-polling --enable-thread-completion --enable-error-checking --enable-hard-polling --disable-rpath --prefix=<SOS_install_dir> --with-ofi=<libfabric_install_dir> --enable-manual-progress
$ make
$ make install

where <SOS_install_dir> is an appropriate installation path.

Compiling and running OpenSHMEM programs

The SOS build should be added to your path. If you used the build instructions posted here, this is done by running the following command:

$ export PATH=<SOS_install_dir>/bin:$PATH
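If the export lives in a shell startup file, a guarded version avoids adding the same directory again each time the file is sourced. This is a small sketch under stated assumptions: `SOS_DIR` is a hypothetical variable standing in for `<SOS_install_dir>`, and `/opt/sos` is only a placeholder default.

```shell
# Assumed stand-in for <SOS_install_dir>; override it in your environment.
SOS_DIR="${SOS_DIR:-/opt/sos}"

# Prepend only when the directory is not already on PATH.
case ":$PATH:" in
    *":$SOS_DIR/bin:"*) ;;                      # already present, nothing to do
    *) export PATH="$SOS_DIR/bin:$PATH" ;;
esac

# Report which oshcc, if any, the shell now resolves.
command -v oshcc || echo "oshcc not found under $SOS_DIR/bin"
```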

Set the following environment variables to run SOS with OPX:

$ FI_OPX_VERSION=1 FI_OPX_UUID=$RANDOM SHMEM_OFI_PROVIDER=opx SHMEM_OFI_STX_MAX=0 OMP_NUM_THREADS=1

OMP_NUM_THREADS can be adjusted, but be aware that SOS uses 2 contexts per endpoint it creates, so launching additional threads can easily cause resource exhaustion.
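The variables above can either prefix a single launch command or be exported once for the session. The sketch below takes the export approach. It uses only the variable names and values listed on this page. Keeping any `FI_OPX_UUID` or `OMP_NUM_THREADS` already set, and falling back to the shell PID where `$RANDOM` is unavailable, are illustrative choices rather than documented requirements.

```shell
# Export the settings from this page so that later launches inherit them.
export FI_OPX_VERSION=1
# Keep a UUID that is already set; otherwise use $RANDOM, or the shell PID
# in shells that do not provide $RANDOM.
export FI_OPX_UUID="${FI_OPX_UUID:-${RANDOM:-$$}}"
export SHMEM_OFI_PROVIDER=opx
export SHMEM_OFI_STX_MAX=0
# Default to a single thread; raise with care, given the context usage noted above.
export OMP_NUM_THREADS="${OMP_NUM_THREADS:-1}"

# Show the effective settings before launching.
env | grep -E '^(FI_OPX_|SHMEM_OFI_|OMP_NUM_THREADS=)' | sort
```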

Once SOS is in your path, you can use the compiler wrapper oshcc to compile your application and the launcher wrapper oshrun to run it. Please ensure you have a compatible launcher, such as Hydra 3.2 or newer.
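As an end-to-end illustration, the sketch below writes a minimal OpenSHMEM program, compiles it with oshcc, and launches two PEs with oshrun using the environment settings from this page. It is an example under assumptions rather than a prescribed workflow: the file name `hello.c`, the scratch directory, and the PE count are arbitrary, and the Hydra-style `-np` flag is assumed to be accepted by your launcher. The build and launch steps are skipped when the wrappers are not on your path.

```shell
# Work in a scratch directory so no files are left in the current one.
workdir="$(mktemp -d)"

# Minimal OpenSHMEM program: each PE prints its own index and the PE count.
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>
#include <shmem.h>

int main(void) {
    shmem_init();
    printf("Hello from PE %d of %d\n", shmem_my_pe(), shmem_n_pes());
    shmem_finalize();
    return 0;
}
EOF

if command -v oshcc >/dev/null 2>&1 && command -v oshrun >/dev/null 2>&1; then
    oshcc -o "$workdir/hello" "$workdir/hello.c"
    # Environment settings from this page, applied to this launch only.
    FI_OPX_VERSION=1 FI_OPX_UUID=$RANDOM SHMEM_OFI_PROVIDER=opx \
    SHMEM_OFI_STX_MAX=0 OMP_NUM_THREADS=1 \
        oshrun -np 2 "$workdir/hello"
else
    echo "oshcc/oshrun not found on PATH; skipping build and launch"
fi
```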

Troubleshooting

Please refer to the Troubleshooting wiki page if you encounter any issues. If your particular problem is not covered in the wiki, please submit an issue through GitHub. If you are having issues with the OPX libfabric component, please reach out to [email protected] for help.