
cudaPackages: enable cross-compilation (take two) #279952

Closed

Conversation

@ConnorBaker ConnorBaker commented Jan 10, 2024

Important

This PR includes changes which have not yet been merged into master. It should not be merged prior to them:

Description of changes

#275560 brings in a bunch of additional work that I originally thought needed to be coupled with support for cross-compilation, but over the break I decided to revisit that assumption. Maybe it doesn't need to be coupled -- this is a second attempt at enabling cross-compilation for cudaPackages.

Of note, both cuda-modules/setup-hooks/extension.nix and cuda-modules/cuda/overrides.nix have been refactored. They are now attribute sets of functions which are invoked with callPackage to create setup hooks and package overrides, respectively. The move to using final.callPackage rather than retrieving required packages directly from final allows us to use the __spliced attribute on derivations when cross-compiling to ensure we're selecting the correct version of a package.
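
A hedged sketch of that shape (the hook name is real, the body is illustrative): each attribute becomes a function which final.callPackage instantiates, so its arguments arrive pre-spliced instead of being read off final directly.

final: _prev: {
  # Each hook is now a callPackage-able function; callPackage supplies
  # its arguments (here makeSetupHook) from the spliced package set.
  markForCudatoolkitRootHook = final.callPackage (
    { makeSetupHook }:
    makeSetupHook { name = "mark-for-cudatoolkit-root-hook"; }
      ./mark-for-cudatoolkit-root-hook.sh
  ) { };
}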

When adding packages to nativeBuildInputs or buildInputs, splicing (largely) happens automatically: there is no need to manually specify which splice a package should be drawn from. Additionally, drawing the callPackage arguments from the default package set, pkgs (a.k.a. pkgsHostTarget), helps minimize breakage. However, there are places where we do need to specify the splice ourselves (a sketch follows the list):

  • Specific package outputs: the splice must be selected manually before taking a particular output of a package; otherwise the output is drawn from the splice corresponding to pkgs (that is, pkgsHostTarget).
  • Phases: any package supplied by nativeBuildInputs and used in pre/post phases should be spliced so the correct variant is available at build time.
  • Flags: any package supplied by nativeBuildInputs and referenced in a *Flags-style argument should be spliced, again to ensure the correct variant is used rather than the default (pkgs) splice.
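
A minimal sketch of those three cases, assuming an illustrative consumer of cuda_nvcc; the __spliced.buildHost or fallback matters because __spliced is absent on native (non-cross) builds.

{ lib, backendStdenv, cuda_nvcc }:
let
  # Fall back to the unspliced package on native builds, where derivations
  # carry no __spliced attribute.
  nvcc' = cuda_nvcc.__spliced.buildHost or cuda_nvcc;
in
backendStdenv.mkDerivation {
  pname = "splice-example";
  version = "0.0.0";
  # Automatic: entries in nativeBuildInputs are spliced for us.
  nativeBuildInputs = [ cuda_nvcc ];
  # Outputs/flags: interpolated store paths bypass automatic splicing, so
  # select the buildHost splice explicitly (lib.getExe' also picks the bin
  # output) to get an nvcc that runs on the build platform.
  cmakeFlags = [ "-DCMAKE_CUDA_COMPILER=${lib.getExe' nvcc' "nvcc"}" ];
  # Phases: the same applies to pre/post phase snippets.
  postConfigure = ''
    ${lib.getExe' nvcc' "nvcc"} --version
  '';
}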

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
    • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
    • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@ConnorBaker ConnorBaker added the 6.topic: cuda Parallel computing platform and API label Jan 10, 2024
@ConnorBaker ConnorBaker self-assigned this Jan 10, 2024
@ofborg ofborg bot added the 6.topic: cross-compilation Building packages on a different platform than they will be used on label Jan 10, 2024
@ConnorBaker ConnorBaker changed the title cudaPackages: enable cross-compilation cudaPackages: enable cross-compilation (take two) Jan 10, 2024
@ConnorBaker ConnorBaker force-pushed the feat/cudaPackages-cross-compilation branch 2 times, most recently from 7b5683f to 5a9d2c8 on January 23, 2024 03:36
@github-actions github-actions bot added the 6.topic: stdenv Standard environment label Jan 23, 2024
@ConnorBaker ConnorBaker force-pushed the feat/cudaPackages-cross-compilation branch from 5a9d2c8 to ecb9e96 on January 23, 2024 03:43
--replace \
'$(TOP)/$(_TARGET_DIR_)/include' \
"''${!outputDev}/include"
# TODO(@connorbaker): We should specify the spliced version of backendStdenv and cuda_cudart to use here.

Comment on lines 31 to 46
substitutions = {
  # Required in addition to ccRoot as otherwise bin/gcc is looked up
  # when building CMakeCUDACompilerId.cu
  ccFullPath = "${backendStdenv.cc}/bin/${backendStdenv.cc.targetPrefix}c++";
  # Point NVCC at a compatible compiler
  ccRoot = "${backendStdenv.cc}";
  setupCudaHook = placeholder "out";
};

TODO @ConnorBaker

Are these correct? Do we need to use a spliced version? It's unclear whether makeSetupHook does anything behind the scenes to draw these from buildPackages (pkgsBuildHost) or whether it uses the default pkgs (pkgsHostTarget).

Contributor

  1. Thinking locally: yes, they should be spliced. The hook is going to reside in the target derivation's nativeBuildInputs, so with splicing I expect the hook to be taken from buildPackages, and I expect the backendStdenv from that splice to contain a compiler for build->build instead of build->host.
  2. Thinking more globally: there are just so many places where we already hard-code the cc at evaluation/rendering time: setupCudaHook, nvcc, backendStdenv. This is wrong, redundant, and complex, and we had better choose just one place.

Contributor

We should use the current host's compiler. Then we take the hook from the buildHost splice, so its "host" is really our "build".
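
A sketch of that arrangement, assuming the hook simply keeps using its own host's backendStdenv and consumers rely on automatic splicing:

# The hook is rendered against its *own* host compiler...
setupCudaHook = final.callPackage (
  { makeSetupHook, backendStdenv }:
  makeSetupHook {
    name = "setup-cuda-hook";
    substitutions.ccRoot = "${backendStdenv.cc}";
  } ./setup-cuda-hook.sh
) { };
# ...and a consumer listing the hook in nativeBuildInputs receives the
# buildHost splice automatically, whose "host" is the consumer's build
# platform -- exactly the compiler that can run there.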

@ConnorBaker

@yannham current blockers are these:

  • Building cuda_nvcc fails because the order of the outputs matters: lib should not come after static, as that moves all the libraries out of the static output and into the lib output, leaving an empty static output and causing a build failure: #279952 (comment)
  • Fixing that error (either by commenting it out temporarily or by inserting lib in the correct position) leads to a new build error when compiling legacyPackages.x86_64-linux.pkgsCross.aarch64-multiplatform.cudaPackages.saxpy -- cuda_nvcc from the Jetson tarball (linux-aarch64 is the NVIDIA redist arch name) being built on x86_64-linux fails to link against the libgcc and libc++ libraries from the buildPackages compiler (as expected). However, I have no idea why it's trying to do that.
  • It's unclear whether we're sourcing compilers from the correct package sets in #279952 (comment) and #279952 (comment).

@SomeoneSerge

> the order of the outputs matters, and lib should not come after static as it will move all the libraries out of the static output and into the lib output

That's a regression after the cuda12 fixes then, and we should PR a fix separately. I thought we'd just update the manifest generator and remove these outputs = ... ++ ... hacks.
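
For reference, a sketch of the constraint being discussed (only the relative order of lib and static matters here):

# Per the discussion: lib must precede static, otherwise the move of
# libraries into $lib runs after the static split and leaves $static
# empty, failing the build.
outputs = [ "out" "dev" "lib" "static" ];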

@@ -46,7 +46,7 @@ let
   # redistArch :: String
   # The redistArch is the name of the architecture for which the redistributable is built.
   # It is `"unsupported"` if the redistributable is not supported on the target platform.
-  redistArch = flags.getRedistArch hostPlatform.system;
+  redistArch = flags.getRedistArch targetPlatform.system;
Contributor

Why? I think we generally want the host system's binaries, except when we handle lib/ and nvvm/lib/ in nvcc (then we do want target).

@ConnorBaker ConnorBaker force-pushed the feat/cudaPackages-cross-compilation branch from 2dbe0a5 to 2525205 on March 19, 2024 21:25
@ConnorBaker

TODO(@ConnorBaker): NVCC, as well as any dependency which finds itself in nativeBuildInputs, cannot have a dependency on cuda_compat (usually introduced by way of one of our setup hooks) when we are cross-compiling.

Likely for ease of implementation, NVCC is used as a way to inject dependencies on setup hooks. If that's the case, it falls apart when cross-compiling, as our setup hooks need to be able to run on the build platform rather than the host/target.
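
A hedged sketch of the pattern being described, if that reading is right (the override shape is illustrative):

# If cuda_nvcc propagates the setup hooks (and, on Jetson, cuda_compat),
# every consumer that puts nvcc in nativeBuildInputs inherits them -- but
# the hooks must run on the build platform, and cuda_compat is a
# host-platform compatibility shim, so the propagation breaks as soon as
# build != host.
cuda_nvcc = prev.cuda_nvcc.overrideAttrs (prevAttrs: {
  propagatedBuildInputs = (prevAttrs.propagatedBuildInputs or [ ]) ++ [
    setupCudaHook
    cuda_compat
  ];
});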

Connor Baker added 18 commits March 26, 2024 13:58
Since, even under cross-compilation, we evaluate this flag on multiple platforms, it makes more sense to move the platform check out of the throw condition and into the boolean return value. The alternative is to restrict all uses of this value to locations which guard evaluation so that it does not occur when the host platform is still x86_64.
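
A minimal sketch of that change, assuming a flag along the lines of flags.isJetsonBuild (the name and predicate are illustrative):

# Before: merely evaluating the flag on a non-aarch64 host throws.
isJetsonBuild =
  if hostPlatform.system != "aarch64-linux"
  then throw "Jetson builds require an aarch64-linux host platform"
  else jetsonTargets != [ ];

# After: the platform check is folded into the returned boolean, so the
# flag can be evaluated anywhere (e.g. on x86_64 during cross-eval)
# without guarding.
isJetsonBuild = hostPlatform.system == "aarch64-linux" && jetsonTargets != [ ];
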
@ConnorBaker ConnorBaker force-pushed the feat/cudaPackages-cross-compilation branch from 1a9f28b to 1ac7621 on March 26, 2024 13:58
@ConnorBaker

Closing as this PR is largely superseded by #301416.

@ConnorBaker ConnorBaker closed this Apr 4, 2024
@cole-h cole-h removed the ofborg-internal-error Ofborg encountered an error label Apr 8, 2024