Build CUDA under host_injections and make EESSI aware of host CUDA drivers #368

Merged

Conversation

ocaisa
Member

@ocaisa ocaisa commented Oct 18, 2023

No description provided.

@eessi-bot

eessi-bot bot commented Oct 18, 2023

Instance eessi-bot-mc-aws is configured to build:

  • arch x86_64/generic for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/generic for repo eessi-hpc.org-2023.06-software
  • arch x86_64/intel/haswell for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi-hpc.org-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi-hpc.org-2023.06-software
  • arch aarch64/generic for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/generic for repo eessi-hpc.org-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi-hpc.org-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi-hpc.org-2023.06-software

@ocaisa
Member Author

ocaisa commented Oct 19, 2023

This can be tested by checking out the branch, starting the build container from the PR and running, e.g.,

source /cvmfs/pilot.eessi-hpc.org/versions/2023.06/init/bash 
./gpu_support/nvidia/install_cuda_host_injections.sh 12.1.1

@ocaisa
Member Author

ocaisa commented Oct 19, 2023

But I need to transition the PR to (also?) use eessi_container.sh

Review thread on build_container.sh (outdated, resolved)
@ocaisa ocaisa mentioned this pull request Oct 27, 2023
5 tasks
@ocaisa
Member Author

ocaisa commented Oct 31, 2023

This PR can be tested without GPUs. Use

./eessi_container.sh -m shell --nvidia install -v

to start the build container with the necessary settings, then you can install CUDA (under host_injections) with

source /cvmfs/pilot.eessi-hpc.org/versions/2023.06/init/bash 
./gpu_support/nvidia/install_cuda_host_injections.sh 12.1.1

@trz42 There are some sizable changes to the build container in the PR so you may wish to take it for a test drive.

@ocaisa
Member Author

ocaisa commented Nov 9, 2023

@trz42 I need some advice here. I probably want to always enable the ability to install CUDA, but I only want to trigger the host_injections build of CUDA when it is actually required (or we always do it but reuse the same space, so that it only actually gets done once).
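
One way to get the "only actually gets done once" behaviour is a marker-file guard around the installer. The following is a minimal sketch, not what the PR implements: the function name `install_once` and the marker file name `.install_complete` are hypothetical, and the installer command is passed in as arguments rather than hard-coded.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): run an install command for a
# given prefix only if a marker file from an earlier successful run is absent.
install_once() {
    local prefix="$1"; shift
    local marker="${prefix}/.install_complete"   # hypothetical marker name
    if [ -f "${marker}" ]; then
        echo "skip: ${prefix} already populated"
        return 0
    fi
    mkdir -p "${prefix}"
    # Run the installer given as the remaining arguments; the marker is
    # only written if that command succeeds.
    "$@" && touch "${marker}" && echo "installed: ${prefix}"
}
```

Under these assumptions, the build could always call something like `install_once "<prefix>" ./gpu_support/nvidia/install_cuda_host_injections.sh 12.1.1`, and repeated calls that reuse the same prefix would skip the actual installation.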

@boegel boegel changed the base branch from 2023.06 to pilot.eessi-hpc.org-2023.06 November 21, 2023 21:19
Review threads (all resolved):
gpu_support/nvidia/install_cuda_host_injections.sh (5 threads, outdated)
scripts/utils.sh (2 threads, 1 outdated)
eessi_container.sh (1 thread)
@ocaisa
Member Author

ocaisa commented Nov 30, 2023

@casparvl I've addressed your review as much as I can and I've added better command line options, including a critical one for accepting the CUDA EULA. This should be ready for showtime now.
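
As an illustration of requiring explicit EULA acceptance on the command line, here is a minimal sketch. It is an assumption about the general pattern, not a copy of the script in this PR: the function name `check_eula` and the flag name `--accept-cuda-eula` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical option check (not taken from the PR): refuse to continue
# unless the caller explicitly passes an EULA acceptance flag.
check_eula() {
    local accepted=0
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --accept-cuda-eula) accepted=1 ;;   # hypothetical flag name
            *) ;;                               # ignore other arguments in this sketch
        esac
        shift
    done
    if [ "${accepted}" -ne 1 ]; then
        echo "error: CUDA EULA not accepted" >&2
        return 1
    fi
    echo "EULA accepted"
}
```

Making acceptance opt-in rather than a default means the installation cannot proceed silently on a system where nobody has agreed to the licence.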

@ocaisa
Member Author

ocaisa commented Nov 30, 2023

I think I'll fold the script for linking to the host drivers in here as well... working on it now

@ocaisa ocaisa changed the title Build CUDA under host_injections Build CUDA under host_injections and make EESSI aware of host CUDA drivers Nov 30, 2023
Collaborator

@casparvl casparvl left a comment

Lgtm! Awaiting rerun of a final CI test that crapped out because of rate limiting.

@casparvl casparvl merged commit 4e59a22 into EESSI:2023.06-pilot.eessi-hpc.org Dec 1, 2023
34 checks passed
@ocaisa ocaisa deleted the host_injections_cuda branch December 1, 2023 14:48
@boegel boegel added the 2023.06-software.eessi.io 2023.06 version of software.eessi.io label Jan 11, 2024
trz42 pushed a commit to trz42/software-layer that referenced this pull request Jun 11, 2024
…4.1-gompi/2023a

{2023.06}[gompi/2023a] BLAST+ V2.14.1
Labels
2023.06-software.eessi.io 2023.06 version of software.eessi.io accel:nvidia
3 participants