
cudaPackages: enable cross-compilation #275560

Conversation

@ConnorBaker (Contributor) commented Dec 20, 2023

Description of changes

  • Switches to makeScopeWithSplicing' in an attempt to support cross-compilation (a minimal sketch follows this list)
  • Pushes additional configuration data (like gpus.nix and nvcc-compatibilities.nix) into the module system, allowing us to define their structure
  • The module system handles the data loading and transformation necessary to produce package sets from manifests or releases.nix expressions
  • Allows users to override certain configuration options through the module system (see the module evaluation performed in pkgs/top-level/cuda-packages.nix)
  • The result of the module-system evaluation is available as the config attribute in each CUDA package set
  • Instead of evaluating the modules once for each group of packages (CUDA redistributables, CuTensor, CuDNN, TensorRT, etc.) and each CUDA version, the modules are now evaluated once and re-used by every CUDA package set
  • Recurses into all CUDA package sets by default
  • Additional cleanups and refactoring
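
A minimal sketch of the splicing-aware scope construction follows; the file path and attribute names here are hypothetical, and the real pkgs/top-level/cuda-packages.nix does considerably more:

# Illustrative sketch only -- not the actual implementation
{ makeScopeWithSplicing'
, generateSplicesForMkScope
}:

makeScopeWithSplicing' {
  # otherSplices supplies the buildBuild/buildHost/etc. variants of this
  # scope so the splicing-aware callPackage can resolve each dependency to
  # the variant matching its role during cross-compilation.
  otherSplices = generateSplicesForMkScope "cudaPackages";
  f = final: {
    # The shared module-system evaluation would be exposed as `config`
    # here, with the CUDA packages defined alongside it.
    config = { };
  };
}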

To-do:

  • Ensure cross-compilation works as expected
  • If one squints, the genericManifest wrapper in pkgs/top-level/cuda-packages.nix is very similar to the multiplexed builder -- investigate further
  • Better describe dynamic platforms and our use of badPlatforms as a way to work around being unable to express combinations of target CUDA version and GPU capabilities as distinct platforms (see the sketch after this list)
  • Update documentation to describe what belongs in the module system (raw data like manifest files, releases.nix expressions, paths to fixup functions or shims) and what belongs in pkgs/top-level/cuda-packages.nix (actual package set construction)
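
A sketch of the badPlatforms workaround mentioned above; isUnsupportedCombination is a hypothetical predicate standing in for whatever logic rules out a given pairing of target CUDA version and GPU capability:

# Illustrative sketch only
{ lib, stdenv, isUnsupportedCombination ? false }:

{
  meta = {
    platforms = [ "x86_64-linux" "aarch64-linux" ];
    # Listing the current host platform in badPlatforms marks the
    # derivation unsupported there at evaluation time, even though the
    # platform is otherwise in meta.platforms.
    badPlatforms = lib.optionals isUnsupportedCombination [ stdenv.hostPlatform.system ];
  };
}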

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@ConnorBaker added the 6.topic: cuda label Dec 20, 2023
@ConnorBaker self-assigned this Dec 20, 2023
@ofborg added the 6.topic: cross-compilation label Dec 20, 2023
@SomeoneSerge (Contributor) commented:

Are the extra moves, renames, and changes in the manifests (including releases.nix &c) required?

@SomeoneSerge (Contributor) commented:

The part reducing the number of evalModules calls can probably be moved out into its own PR, and I'd imagine it'll have fewer blockers than splicing or modifications to the top-level config (well, I guess, technically, this PR doesn't declare anything in there, but still)

@ConnorBaker (Contributor, Author) commented:

> Are the extra moves, renames, and changes in the manifests (including releases.nix &c) required?

The files can stay where they are, but the changes to the releases.nix expressions are needed (the un-nesting of the attribute sets).

I moved them because I felt it made more sense for them to be in the same directory as the thing which is reading and processing them (same reason why gpus.nix and nvcc-compatibilities.nix were moved).

> The part reducing the number of evalModules calls can probably be moved out into its own PR, and I'd imagine it'll have fewer blockers than splicing or modifications to the top-level config (well, I guess, technically, this PR doesn't declare anything in there, but still)

I'm actually not sure it can be without larger changes -- part of the reason I'm able to evaluate it only once is that all-packages.nix now uses callPackage once on cuda-packages.nix and then inherits the different versions of the CUDA package sets from it.

That's only possible now because cuda-packages.nix generates all of the different versions of the package set. Without those changes, we'd still be using callPackage to create each version of the CUDA package set, which would mean evaluating the modules once for each version we create.

That, and without the other changes in this PR, the modules are still evaluated by each group of packages (that is, the CUDA redistributables, CuDNN, CuTensor, and TensorRT each evaluate the modules on their own).
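
A minimal sketch of that single call site, with hypothetical attribute names:

# all-packages.nix -- illustrative sketch only
{ callPackage }:

rec {
  # cuda-packages.nix is called exactly once; it evaluates the modules once
  # and builds every versioned CUDA package set from that shared result.
  cudaPackagesScopes = callPackage ../top-level/cuda-packages.nix { };

  inherit (cudaPackagesScopes)
    cudaPackages_11_8
    cudaPackages_12_2
    cudaPackages; # alias for the default version
}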

@SomeoneSerge (Contributor) commented Dec 21, 2023

> That's only possible now because cuda-packages.nix generates all of the different versions of the package set. Without those changes, we'd still be using callPackage to create each version of the CUDA package set, which would mean evaluating the modules once for each version we create.

I probably like your solution better, but I thought (now I'm unsure) that Nix would just cache the evalModules result and reuse it between the different cudaPackages sets, as long as it was invoked with the exact same static arguments? E.g.

# cuda-packages.nix:
# ...
  final: {
    extraModules = [ ];
    # all-modules.nix doesn't depend on final or on cudaVersion,
    # although we do create a thunk until we've computed final.extraModules
    metadata = (evalModules { modules = [ ../pkgs/cuda-modules/all-modules.nix ] ++ final.extraModules; }).config;
    # ...
  }
# all-packages.nix:
  # now cudaPackages_11_8.metadata contains information about all of the known CUDA releases, not just 11.8;
  # that's probably weird, but does this affect performance?
  cudaPackages_11_8 = callPackage ...;

Introducing a top-level cudaVersions would also be consistent with boostVersions and the other versioned sets we've got.

> I moved them because I felt it made more sense for them to be in the same directory as the thing which is reading and processing them (same reason why gpus.nix and nvcc-compatibilities.nix were moved).

That's probably OK to do (I haven't looked in detail), and even to do now, but I'd ask you to submit it in a separate PR as soon as you think that particular change is ready: it's a refactoring that we risk conflating with other potentially breaking changes here.

A bit of reflection on the previous big change follows. Looking at the title, I expect this PR (in its final state) to be about enabling splicing. That sounds relatively contained, and probably shouldn't affect native builds at all. The current diff clearly contains more. I'd guess this means that enabling splicing actually depends on other tasks which involve moving things around and regenerating the manifests.

It's only natural to start by solving all of these tasks in the same branch, because you can't look into the future and see whether you're going to encounter more obstacles that would make you change your mind about earlier decisions. However, as we get closer to merging, we should assign more value to disentangling the individual changes, and the earlier the better. Solving multiple tasks at once, this patch now potentially affects everything, just like we had to assume for the multiplatform PR (we expected it would break something; we couldn't predict what).

If those tasks and refactors went in separate PRs, there'd be a chance that each individual one would end up "small enough" (the new manifests affect every package, but in a homogeneous way, because it's a change of the same "type" everywhere). We could also merge those dependencies earlier, which means we could "test" them in the unstable channels before you finish work on the actual cross-compilation feature. To conclude, I'd absolutely hate to block you or interrupt your flow, but I think there are benefits to having things split up by the time of merging, and to reviewing and merging the dependencies earlier.

@wegank added the 2.status: merge conflict label Mar 20, 2024
@ConnorBaker (Contributor, Author) commented:

Closing as superseded by #301416.

Anything outside the scope of cross-compilation lives on in my out-of-tree copy of cuda-modules, where I am working on moving more evaluation into the module system to avoid infinite recursion errors by essentially forcing two stages of evaluation: https://github.com/ConnorBaker/cuda-modules.

@ConnorBaker closed this Apr 4, 2024