
[CI] Use rapids-dependency-file-generator to generate pyproject.toml #10803

Open
hcho3 opened this issue Sep 4, 2024 · 4 comments


@hcho3
Collaborator

hcho3 commented Sep 4, 2024

Currently, we use a patch to remove nvidia-nccl-cu12 from the dependencies when building the CPU wheel:

diff --git python-package/pyproject.toml python-package/pyproject.toml
index 20d3f9974..953087ff4 100644
--- python-package/pyproject.toml
+++ python-package/pyproject.toml
@@ -30,7 +30,6 @@ classifiers = [
dependencies = [
"numpy",
"scipy",
- "nvidia-nccl-cu12 ; platform_system == 'Linux' and platform_machine != 'aarch64'",
]
[project.urls]

This approach has several disadvantages:

  • Fragility: future changes to the original file may break the patch.
  • Inflexibility: it cannot accommodate multiple CUDA versions. The patch bakes in the assumption that we only build wheels for CPU and CUDA 12 targets. What happens when we also want a wheel for CUDA 11?

Proposal. Use rapids-dependency-file-generator to modify the dependencies section in pyproject.toml.
The tool is standalone and lightweight, and can be easily integrated in our CI/CD pipelines.

The following configuration will let us generate three variants (CPU, CUDA 11, CUDA 12):

# dependencies.yaml

files:
  pyproject:
    output: pyproject
    pyproject_dir: python-package
    matrix:
      cuda: ["none", "11.8", "12.4"]
    includes:
      - deps
    extras:
      table: project
dependencies:
  deps:
    common:
      - output_types: pyproject
        packages:
          - numpy
          - scipy
    specific:
      - output_types: pyproject
        matrices:
          - matrix:
              cuda: "11.8"
            packages:
              - nvidia-nccl-cu11 ; platform_system == 'Linux' and platform_machine != 'aarch64'
          - matrix:
              cuda: "12.4"
            packages:
              - nvidia-nccl-cu12 ; platform_system == 'Linux' and platform_machine != 'aarch64'
          # Don't add NCCL dep for CPU package
          - matrix:
            packages:

Command: rapids-dependency-file-generator --file-key pyproject --output pyproject --matrix cuda=12.4
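For illustration, with the configuration above and --matrix cuda=12.4, I'd expect the tool to rewrite the [project] table's dependencies roughly like this (a sketch of the expected output, not verified against the tool):

```toml
# python-package/pyproject.toml (generated fragment, sketch only)
[project]
dependencies = [
    "numpy",
    "scipy",
    "nvidia-nccl-cu12 ; platform_system == 'Linux' and platform_machine != 'aarch64'",
]
```

With --matrix cuda=none, the same command would emit only numpy and scipy, since the fallback matrix entry has no packages.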

cc @jameslamb

@hcho3
Collaborator Author

hcho3 commented Sep 4, 2024

How about renaming the package? (xgboost-cpu, xgboost-cu11, xgboost-cu12) I'm not sure whether adopting rapids-build-backend makes sense. We can probably update the existing packaging logic in packager/ to handle the renaming.

@jameslamb
Contributor

Thanks for the @.

To replace the patch, before involving rapids-dependency-file-generator, I might explore having some Python code in packager/ handle any CUDA-specific dependencies, maybe in the get_requires_for_*() methods here or similar:

get_requires_for_build_sdist = hatchling.build.get_requires_for_build_sdist
get_requires_for_build_wheel = hatchling.build.get_requires_for_build_wheel
get_requires_for_build_editable = hatchling.build.get_requires_for_build_editable

A bit of Python code there could optionally add the appropriate nccl project to [project].dependencies, based on a value passed through --config-settings or an environment variable or something.
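To make the idea concrete, here's a minimal sketch of such a helper. The function name, the XGBOOST_CUDA_VERSION variable, and the version-to-wheel map are all hypothetical; the real hook would live in packager/ and delegate to hatchling:

```python
import os

# Map CUDA major versions to the matching NCCL wheel requirement.
# The environment marker mirrors the one currently patched out of pyproject.toml.
_NCCL_DEPS = {
    "11": "nvidia-nccl-cu11 ; platform_system == 'Linux' and platform_machine != 'aarch64'",
    "12": "nvidia-nccl-cu12 ; platform_system == 'Linux' and platform_machine != 'aarch64'",
}


def cuda_specific_requires(env=None):
    """Return extra requirements for the selected CUDA variant.

    XGBOOST_CUDA_VERSION is a hypothetical variable name used only for this
    sketch; unset (or "none") means the CPU wheel, which gets no NCCL dep.
    """
    env = os.environ if env is None else env
    dep = _NCCL_DEPS.get(env.get("XGBOOST_CUDA_VERSION", "none"))
    return [dep] if dep is not None else []
```

A wrapped get_requires_for_build_wheel() could then return something like hatchling.build.get_requires_for_build_wheel(config_settings) plus cuda_specific_requires(), keeping everything else delegated to hatchling.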

That's similar to what rapids-build-backend does (code link).

rapids-build-backend has to be more generic (and therefore more complex) because it supports many different projects. But here in xgboost, with the packager code only for xgboost, it could be much much simpler and hard-coded to only the exact things xgboost needs.

How about re-naming of the package? (xgboost-cpu, xgboost-cu11, xgboost-cu12)

I think this should be avoided for as long as possible.

There is significant usage of the name xgboost out there already in many other libraries, build scripts, dockerfiles, etc. It'd be a lot to ask projects like optuna, shap, pycaret, etc. to make similar changes and to have to confront this in how they manage their XGBoost dependency. And even if other library maintainers were ok with it, it'd be something new that has to be taught to XGBoost's many users (who have their own scripts and habits assuming that pip install xgboost will continue to work as they're used to).

Before making any decision, I recommend reading through these discussions:

I am still working through them myself... but there are a lot of ideas about these topics in there.


I hope that was helpful, sorry that it's a lot 😅 . @ me any time, happy to try to help more with this.

@hcho3
Collaborator Author

hcho3 commented Sep 6, 2024

@jameslamb Can you look at my proposal at #10807? I tried to make it as minimally disruptive as possible.

@trivialfis
Member

I think this should be avoided for as long as possible.

I agree.

With a bit of Python code that optionally adds the appropriate nccl project to the

Looks like a great idea.

Thank you for the detailed explanation. I will look into this as well. Going through the discussion, I think we can wait a little bit and see what PyPI will do. For now, conda-forge might be a better place to provide build variants.
