install CUDA under host_injections #410

Merged

Conversation

casparvl
Collaborator

@casparvl casparvl commented Dec 1, 2023

Replicates #368 for the 2023.06-pilot repo


eessi-bot bot commented Dec 1, 2023

Instance eessi-bot-mc-aws is configured to build:

  • arch x86_64/generic for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/generic for repo eessi-hpc.org-2023.06-software
  • arch x86_64/generic for repo eessi.io-2023.06-compat
  • arch x86_64/generic for repo eessi.io-2023.06-software
  • arch x86_64/intel/haswell for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi-hpc.org-2023.06-software
  • arch x86_64/intel/haswell for repo eessi.io-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi.io-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi.io-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi.io-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi.io-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi.io-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi.io-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi.io-2023.06-software
  • arch aarch64/generic for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/generic for repo eessi-hpc.org-2023.06-software
  • arch aarch64/generic for repo eessi.io-2023.06-compat
  • arch aarch64/generic for repo eessi.io-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi-hpc.org-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi.io-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi.io-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi-hpc.org-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi.io-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi.io-2023.06-software

@casparvl
Collaborator Author

casparvl commented Dec 1, 2023

I believe all the changes for software.eessi.io have been made in this PR, but I still need to test them.

@casparvl casparvl changed the title Replicated changes that were initially targetted at 2023.06-pilot repo Build Cuda under host injections. Replicated changes that were initially targetted at 2023.06-pilot repo Dec 1, 2023
@casparvl casparvl changed the title Build Cuda under host injections. Replicated changes that were initially targetted at 2023.06-pilot repo Build Cuda under host injections. Replicates changes that were initially targetted at 2023.06-pilot repo Dec 1, 2023
…the required GPU components in host_injections. The EESSI-install-software.sh script has been modified to run install_scripts.sh early on and then run the installed scripts to install a full CUDA SDK and drivers. This should enable building and using CUDA software anywhere further down the line in the same environment.
@ocaisa
Member

ocaisa commented Dec 19, 2023

This still needs to be deployed, but the deployment is coming in #434, so I think it's OK to merge as is.

@ocaisa ocaisa merged commit cc7d0e4 into EESSI:2023.06-software.eessi.io Dec 19, 2023
33 checks passed
@@ -187,6 +187,22 @@ fi
# assume there's only one diff file that corresponds to the PR patch file
pr_diff=$(ls [0-9]*.diff | head -1)

# install any additional required scripts
# order is important: these are needed to install a full CUDA SDK in host_injections
install_scripts_changed=$(cat ${pr_diff} | grep '^+++' | cut -f2 -d' ' | sed 's@^[a-z]/@@g' | grep '^install_scripts.sh$' > /dev/null; echo $?)
Member

@casparvl I messed up a little by merging this: the file checked here will not be updated in the other PR, and as a result the scripts will not get installed. It doesn't matter too much, as I think the use of $pr_diff is overkill here. I think we should always run the installer script and use cp -u instead of cp; that could be implemented as part of #434.

Also, this was a little wrong anyway, as it only checks whether install_scripts.sh itself has changed, not whether the scripts installed by the installer have changed.
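
To make the `cp -u` suggestion concrete, here is a minimal sketch of an "always run, only copy what is newer" step. The function name `copy_if_newer` and its arguments are hypothetical and are not taken from install_scripts.sh; the sketch also assumes GNU coreutils, where `cp -u` skips the copy when the destination is newer than or as new as the source.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): copy each named file from a
# source directory into a destination directory, but only when the source
# is newer than the existing destination file (GNU cp -u).
copy_if_newer() {
    local src_dir="$1" dest_dir="$2"
    shift 2
    mkdir -p "${dest_dir}"
    local file
    # Copy files one by one rather than recursively, as the diff does.
    for file in "$@"; do
        cp -u "${src_dir}/${file}" "${dest_dir}/${file}"
    done
}
```

Because the copy itself is skipped for unchanged files, a step like this can run on every build and does not need to inspect the PR diff first.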

# Install full CUDA SDK in host_injections
# Hardcode this for now, see if it works
# TODO: We should make a nice yaml and loop over all CUDA versions in that yaml to figure out what to install
${EESSI_CVMFS_REPO}/gpu_support/nvidia/install_cuda_host_injections.sh 12.1.1
Member

@ocaisa ocaisa Dec 19, 2023

Wow, I really messed up here...the arguments are incorrect!
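
For the TODO about looping over CUDA versions from a YAML file, a rough sketch could look like the following. Everything here is an assumption for illustration: the flat YAML layout, the function names `list_cuda_versions` and `for_each_cuda_version`, and the idea of passing the install command in as arguments. The real arguments of install_cuda_host_injections.sh are deliberately not assumed, so the caller supplies the command.

```shell
#!/usr/bin/env bash
# Hypothetical: print the version strings from a flat YAML list such as
#   cuda_versions:
#     - 12.1.1
#     - 11.8.0
list_cuda_versions() {
    local yaml_file="$1"
    # Keep only "- <version>" list items and strip the dash and whitespace.
    sed -n 's/^[[:space:]]*-[[:space:]]*\([0-9][0-9.]*\)[[:space:]]*$/\1/p' "${yaml_file}"
}

# Hypothetical: run a caller-supplied command once per listed version,
# appending the version as the last argument.
for_each_cuda_version() {
    local yaml_file="$1"
    shift
    local version
    # Read the list on fd 3 so the command can still use stdin.
    while IFS= read -r version <&3; do
        "$@" "${version}"
    done 3< <(list_cuda_versions "${yaml_file}")
}
```

A dry run such as `for_each_cuda_version cuda_versions.yml echo "would install CUDA"` prints one line per version, which is a cheap way to check the list before wiring in the real installer call.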


# Copy files from this directory into the prefix
# To be on the safe side, we don't do recursive copies, but we explicitly copy each individual file we want to add
for file in install_cuda_host_injections.sh link_nvidia_host_injections.sh; do
Member

Script name is wrong here

@boegel boegel changed the title Build Cuda under host injections. Replicates changes that were initially targetted at 2023.06-pilot repo install CUDA under host_injections Dec 20, 2023
@boegel boegel mentioned this pull request Dec 20, 2023
5 tasks
@boegel boegel added the 2023.06-software.eessi.io 2023.06 version of software.eessi.io label Jan 11, 2024
TopRichard added a commit to TopRichard/bot-software-layer1 that referenced this pull request Jun 17, 2024
…mory_per_node_option_1

Use default memory option during reframe tests
Labels
2023.06-software.eessi.io 2023.06 version of software.eessi.io accel:nvidia
3 participants