[CSD3] Update configurations and Spack environments #247
Conversation
I haven't tested them myself, but the changes look good, thanks a lot!
Side note: I wish we could automate the generation of the input file, but at the moment we don't know the memory available on the compute nodes (it could be an extra custom property of partitions, if we wanted to add one), and there's also the problem that "looking up an ever-changing table on the Intel website" isn't exactly my idea of automation.
I agree that it would be nice to do this but don't think there's a good solution...
```diff
-  - spec: [email protected]%intel+mpi^intel-mpi
+  - spec: [email protected] +mpi^intel-mpi
```
I presume the point of `%intel` was to prevent using this together with other libraries built with gcc; I may have had trouble trying that combination.
Yeah, for some reason Spack refused to use this external package (when using the `intel` compiler and `intel-mpi`) until I made this change. I didn't really understand it, since it didn't seem to be a problem for any other external packages with compilers in their specs 🤷
Most of the Spack compilers and external packages are built for RHEL/Rocky Linux/Scientific Linux 7 (for the Cascade Lakes) and have issues on the Ice Lakes (running Rocky Linux 8), which is why this resets the environment to a clean slate.
Since we have two different OSs on CSD3, CentOS 7 and Rocky Linux 8, it is important to only use software built for the correct OS in order to avoid issues.
This adds some external Spack packages provided by the OS and specifies a preferred compiler and MPI implementation.
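The actual CSD3 entries aren't reproduced here, but a minimal sketch of what such a `packages.yaml` can look like is below (the package, version, and prefix are illustrative, not the real CSD3 values):

```yaml
packages:
  all:
    # prefer the Intel compiler and Intel MPI when concretizing
    compiler: [intel]
    providers:
      mpi: [intel-mpi]
  openssl:
    # hypothetical OS-provided package registered as an external
    externals:
    - spec: openssl@1.0.2k
      prefix: /usr
    buildable: false
```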
Note that there is currently a common software stack for the Ice Lakes and Sapphire Rapids and both partitions use the same OS (Rocky Linux 8).
Also allow the concretizer to target `sapphirerapids` when building on the `icelake` login nodes.
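Assuming a Spack version that supports the `concretizer:targets:host_compatible` option, this can be expressed in `concretizer.yaml` along these lines (a sketch):

```yaml
concretizer:
  targets:
    # allow concretizing for microarchitectures that are not
    # compatible with the host running the solver, e.g. targeting
    # sapphirerapids from an icelake login node
    host_compatible: false
```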
Set the environment variables explicitly rather than using the `rhel?/default-*` modules, so as to avoid any environment pollution from default modules.
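In `compilers.yaml` terms, this means filling in the `environment` section of a compiler entry instead of listing modules; a sketch with hypothetical versions and paths:

```yaml
compilers:
- compiler:
    spec: intel@2021.6.0            # hypothetical version
    paths:
      cc: /opt/intel/bin/icc        # hypothetical install prefix
      cxx: /opt/intel/bin/icpc
      f77: /opt/intel/bin/ifort
      fc: /opt/intel/bin/ifort
    operating_system: rocky8
    modules: []                     # no default modules loaded
    environment:
      prepend_path:                 # set the environment explicitly
        LD_LIBRARY_PATH: /opt/intel/lib/intel64
```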
Always request all of the memory on a node as `--exclusive` does not imply this according to the current SLURM documentation (https://slurm.schedmd.com/sbatch.html#OPT_exclusive). This can make a difference to benchmark results when not using all of the cores on a node due to the increased memory bandwidth available. Also increase the job_submit_timeout as the SLURM controller can be a bit slow on CSD3.
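In job-script terms this pairs `--exclusive` with an explicit memory request; `--mem=0` is how Slurm requests all of the memory on each node (a sketch, with a hypothetical executable):

```bash
#!/bin/bash
#SBATCH --exclusive  # exclusive node access; does not imply all memory
#SBATCH --mem=0      # explicitly request all of the memory on the node
srun ./benchmark     # hypothetical benchmark executable
```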
Note that these values apply when it is built with the Intel Classic Compiler, which is the preferred compiler in the Spack environments.
Force-pushed from 7c18005 to 28b7173.
I've rebased the changes onto …
Can you please sort the packages in the Spack configurations alphabetically? That makes it easier to see what's in there.
Also remove [email protected] from the `cascadelake` environment, as it is deprecated.
It would be nice if `spack external find` did this automatically.
Though this can improve performance, it can also lead to problems when submitting some jobs, so leave it up to the user to add this if they want.
Just a comment about indentation, but the rest looks good, thanks again!
> It would be nice if `spack external find` did this automatically.

Agreed!
Co-authored-by: Mosè Giordano <[email protected]>
This PR updates the ReFrame configurations for CSD3 and the corresponding Spack environments. It also adds the new Sapphire Rapids partition.
Some key changes are explained below.
Operating System
On CSD3, we have two different OSs: the Cascade Lake partition runs on CentOS 7, and all other partitions (i.e. the Ice Lakes and Sapphire Rapids) run on Rocky Linux 8. In order to avoid bugs and library incompatibilities, it is important to use the software stack built for the correct OS. One of the main problems with the current `csd3-icelake` Spack environment is that it contains lots of external packages built for CentOS 7/RHEL 7, and as a result I experienced segfaults on the example Sombrero benchmark with the default Spack spec. I have therefore completely recreated it from scratch and only included packages built for Rocky Linux 8.

To make the different OSs clear to users of this repo on CSD3, I have renamed the existing systems as follows:

- `csd3-cascadelake/compute-node` -> `csd3-centos7/cascadelake`
- `csd3-icelake/compute-node` -> `csd3-rocky8/icelake`

Note that there are two sets of login nodes: `login-p-[0-4]`, which run CentOS 7, and `login-q-[0-4]`, which run Rocky Linux 8.

Sapphire Rapids Partition
This currently uses the same software stack and OS as the Ice Lakes, hence the common `compilers.yaml` and `packages.yaml` for the Spack environment. The only difference in the Spack environments is the specified target. The Sapphire Rapids partition has been added as a new partition (in the ReFrame sense) under the `csd3-rocky8` system.
Preferred compilers and MPI implementation

I have set the preferred compiler to `intel` for all Spack environments and the preferred MPI implementation to `intel-mpi` (for CentOS 7) or `intel-oneapi-mpi` (for Rocky Linux 8). These are both in the default modules that are loaded on CSD3, and we generally observe better performance than with `gcc`/`openmpi`.

Reference values
I have updated some of the existing reference values for the benchmarks.
I have tested this PR with the following benchmarks:
There are still a few minor changes I wish to make before asking for a review, which is why I'm making this a draft PR for now.