Shim packages #61

Open
1 task done
jaimergp opened this issue Jul 8, 2021 · 13 comments

Comments

@jaimergp
Member

jaimergp commented Jul 8, 2021

We all know that cudatoolkit packages are huge and cause a lot of unneeded traffic, especially in GPU-less CI settings where they wouldn't work anyway, or on HPC systems that provide their own CUDA installation. I'd say it would be beneficial to have cudatoolkit shim packages that are empty and contain only the required metadata. They would be deprioritized with respect to the real package, much along the lines of conda-forge/mpich-feedstock#44.

Interested users would specify cudatoolkit=*=external or similar in their package list. Alternatively (or additionally), the empty packages could be uploaded under a different label, and interested users would add that label with higher priority.
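For illustration, here is a minimal sketch of how a user might opt in, assuming a hypothetical "external" build string and a hypothetical "external" label (neither exists today; cupy is just an illustrative downstream package):

```bash
# Sketch only: the "external" build string and the "external" label are hypothetical.
# Option 1: pin the shim build string in the package list / environment file.
conda install "cudatoolkit=11.2=*external*" cupy

# Option 2: pull the shim from a dedicated label with higher channel priority.
conda install -c conda-forge/label/external -c conda-forge "cudatoolkit=11.2" cupy
```

Either way, the solver would only prefer the shim when the user explicitly asks for it.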

Would this be something we'd agree on?

@jakirkham
Member

I'm not sure.

In issue ( conda-forge/conda-forge.github.io#687 ) we agreed with the broader community on aligning around the cudatoolkit packages and having CUDA versions selected through them. This is something all parties agreed to and do today. If we were to change that, I would like to see the same parties weigh in.

While the issue of download and decompression pain does come up periodically, my hope would be that better split-up packages ( #62 ) will cut down on this pain. Additionally, I would hope that generally moving to the newer .conda packages will cut down on the download/decompression pain (by using zip files that allow random access and newer compressors like Zstd, which work well with filesystems like SquashFS). Also, some of this pain can simply be due to using the wrong filesystem on a cluster (like NFS), which just requires better engagement with system admins (though maybe we can document this better to help people realize this potential issue sooner).

If even after all of that we still wanted to do something like this, I think the question becomes where can we reliably expect external libraries to live on users' systems? If there is no reliable place, how do we specify it? Then following that, how do we support this functionality in packages using CUDA libraries without needing to add another build for external vs. internal?

@jaimergp
Member Author

> how do we support this functionality in packages using CUDA libraries without needing to add another build for external vs. internal?

Is an extra build needed? I thought adding cudatoolkit=*=*external to the environment file or package list would suffice for the solver to pick the shim version. After that, it's on the user if they deliberately choose to opt out of the full cudatoolkit download.

@jakirkham
Member

If we can do it without an extra build, I agree that is more useful. I would much rather people make a runtime decision. That said, I don't know if people will generally configure this correctly, so it might need to be an optimization they apply themselves.

@jaimergp
Member Author

I envision this as an advanced piece of knowledge that users in CI and HPC centers will reach for only after realizing they don't like the full download; they will Google this issue and find out that they can use the *external trick (or find it mentioned in the documentation).

With CUDA 11.3+ it will be less of a problem, I guess, but some libraries are still well into the hundreds of megabytes anyway. Could we add this only for <=11.2?

@dicta

dicta commented Jul 28, 2021

One of the build problems this will cause: when building downstream packages with CMake, the CMAKE_FIND_ROOT_PATH variable is used in conda-build environments to restrict header and library lookups to those in the environment. Other build systems may do something similar; I'm not nearly as familiar with them, so I can't comment directly.
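To make that restriction concrete, here is a rough sketch of a typical recipe-style CMake invocation (illustrative, not taken from an actual feedstock): with lookups rooted at the environment prefixes, a toolkit that only exists on the host system would not be found.

```bash
#!/bin/bash
# Typical conda-forge-style CMake invocation: header/library lookups are rooted
# at the environment prefixes, so an "external" cudatoolkit living only on the
# system (e.g. under /usr/local/cuda) would be invisible to find_package/find_library.
cmake ${CMAKE_ARGS} \
      -DCMAKE_FIND_ROOT_PATH="${PREFIX};${BUILD_PREFIX}" \
      -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=ONLY \
      -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
      -DCMAKE_INSTALL_PREFIX="${PREFIX}" \
      .
cmake --build . --target install
```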

So there are some real compatibility problems to be solved so that we don't break conda-build infrastructure for anyone who has installed a cudatoolkit=*=external package.

IMO this should be thought about at the global conda-forge policy level -- and not just for CUDA as a special case -- so that we consistently deal with any environment that may need to depend on software provided by the system. My personal preference for cases like this is for any generated "external" metapackage to do something like the following:

  1. Generate symlinks from the system locations into the environment for each file that's part of the external package. This may involve variants of the metapackage for different OS distributions, but this cannot be helped and it's better to do that here, in one place, than to have every downstream package that depends on us have to deal with it.

  2. Test at environment activation time (likely via an activation script) whether all the targets of these symlinks exist, so the user has some assurance that their environment will actually work (a rough sketch follows below).
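A rough bash sketch of what such an activation-time check could look like (everything here is hypothetical: the file list, the assumed system location, and the script itself do not reflect any existing package):

```bash
#!/bin/bash
# Hypothetical activation script for an "external" metapackage: verify that the
# symlinks created at install time still resolve to real files on the system.
CUDA_SYS_ROOT="${CUDA_SYS_ROOT:-/usr/local/cuda}"     # assumed system location

for lib in libcudart.so libcublas.so libcufft.so; do  # illustrative file list
    link="${CONDA_PREFIX}/lib/${lib}"
    # -e dereferences symlinks, so this catches both missing links and dangling targets.
    if [ ! -e "${link}" ]; then
        echo "WARNING: ${link} is missing or dangling;" \
             "expected a target under ${CUDA_SYS_ROOT}/lib64." >&2
    fi
done
```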

@jaimergp
Member Author

AFAIK this variant would only be relevant at runtime, not during the build. Actually, our nvcc wrapper points to the global CUDA installation at /usr/local/cuda, so I don't see how the shims would affect this process (but I am not exactly an expert here, and I am sure I am missing something).
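For context, the wrapper essentially defers to a toolkit outside the environment; a minimal sketch of the idea, assuming the usual /usr/local/cuda default (the real script in the nvcc_linux-64 package does more, e.g. setting the host compiler):

```bash
#!/bin/bash
# Simplified idea of the nvcc wrapper: use a CUDA toolkit installed outside the
# conda environment. The real wrapper also wires up the host compiler and flags.
export CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"
exec "${CUDA_HOME}/bin/nvcc" "$@"
```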

The intention here is to provide empty packages that only carry the metadata. The effect would be the same as letting conda install cudatoolkit and then force-removing it (which some people have to do in their setups now for various reasons, mainly file size), but without having to download anything.
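In other words, the net effect would roughly match what some users already do by hand:

```bash
# What some users do today: install to satisfy the metadata, then drop the payload.
conda install cudatoolkit
conda remove --force cudatoolkit   # removes the files but leaves dependents installed

# A shim build would provide the same metadata without downloading the payload at all.
```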

@jakirkham
Member

> With CUDA 11.3+ it will be less of a problem, I guess, but some libraries are still well into the hundreds of megabytes anyway. Could we add this only for <=11.2?

The move to 11.3+ already seems pretty fraught. I would rather not add one more thing to the list.

@jakirkham
Member

More generally, I am interested in getting the new packages squared away ( #62 ). This will help us solve a lot of things, including adding new CUDA versions, better splitting, and providing access to build tooling. We can also take that opportunity to leverage CUDA enhanced compatibility and lighten our build matrices.

@jakirkham
Member

I should add that, for lightening deployments, we can look at using nvprune to target things more carefully and cut down on size. This is a little tricky to do in the general case (like how we build in conda-forge), but maybe we can envision final build steps that do this kind of splitting either at the end of a CI job or on a user's machine. There are pros & cons to this, though, that we would want to evaluate carefully.
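For reference, a hedged sketch of what an nvprune-based trimming step might look like (the architecture and file names are purely illustrative; nvprune only operates on files with embedded device code, such as static libraries and relocatable objects):

```bash
# Illustrative only: keep the device code for a single target architecture
# to shrink a static CUDA library before (re)packaging it.
nvprune -arch sm_80 libcufft_static.a -o libcufft_static_sm80.a
```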

@ngam

ngam commented Jun 8, 2022

> We all know that cudatoolkit packages are huge and cause a lot of unneeded traffic, especially in GPU-less CI settings where they wouldn't work anyway, or on HPC systems that provide their own CUDA installation. I'd say it would be beneficial to have cudatoolkit shim packages that are empty and contain only the required metadata.

While I think this is true in general, and such a shim package could indeed be beneficial, I would like to add one point in support of the current solution as described here:

> In issue ( conda-forge/conda-forge.github.io#687 ) we agreed with the broader community on aligning around the cudatoolkit packages and having CUDA versions selected through them. This is something all parties agreed to and do today. If we were to change that, I would like to see the same parties weigh in.

There are potential downsides to having users choose between a conda-forge cudatoolkit and a system cudatoolkit. From my experience with HPC admins, most of them know little about CUDA and don't care to learn about it. For one, they won't update it unless asked (cudatoolkit and co. are not "core" items to them). Even crazier, a lot of these system packages are incomplete: let me just say that, in 2022, I had to explicitly request that the sysadmins include ptxas in their cudatoolkit installation (11.6). I also had access to another HPC system that had cudatoolkit 11.4 but no cudnn whatsoever (not 8.x, not 7.x). So the bottom line is that, with such a weird package (can we even call it that?!) as cudatoolkit, I really think that, for the sake of the community and ecosystem, we should try to keep things simpler, not more complicated.

A power/pro user can of course choose to customize stuff as they wish, and they don't need conda-forge to figure it out for them.

Another aspect of all of this is performance. For tensorflow, I consistently find that the conda-forge installation outperforms any other installation (e.g., it outperforms PyPI by a wide margin, and even outperforms the official NGC containers). Could we guarantee that performance would not suffer if we allowed an external cudatoolkit? I suspect we cannot.

@wesfloyd

To add one more benefit in support of this issue (enabling an "openmm install without cuda"): Docker images with OpenMM grow in size quickly due to the large cuda package size (~800 MB).

@jakirkham
Member

There is a separate issue ( #48 ) around splitting into smaller packages, which can be depended on separately. Also issue ( #62 ) discusses a newer package structure that implements this.

@daveminh

I see this is labeled as complete, but can somebody clarify how to use this?
