
PyTorch ecosystem: identical packages with different names + differing package versions with same name #11012

tmm1 opened this issue Jan 28, 2025 · 6 comments

tmm1 commented Jan 28, 2025

Question

Following up on #10694 with some more real-world examples.

triton vs pytorch-triton

Many PyTorch ecosystem packages depend on triton, so pyproject/uv.lock will often contain triton == 3.1.0.

However, when switching to PyTorch nightly, the equivalent nightly package for triton is named pytorch-triton: https://github.com/pytorch/pytorch/blob/main/RELEASE.md#triton-dependency-for-the-release

So the equivalent entry to use nightly is something like pytorch-triton == 3.2.0+git0d4682f0. But adding this to your project doesn't remove the transitive triton == 3.1.0, so it seems both get installed:

❯ ls -d .venv/lib/python3.11/site-packages/*triton*
.venv/lib/python3.11/site-packages/pytorch_triton-3.2.0+git0d4682f0.dist-info 
.venv/lib/python3.11/site-packages/triton-3.1.0.dist-info
.venv/lib/python3.11/site-packages/triton 

and since they both write files to site-packages/triton, it's not entirely clear which version ends up getting used.
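
For reference, a minimal pyproject.toml that reproduces this setup might look something like the following (the index URL and the package that pulls in triton transitively are assumptions, not my exact project):

[project]
name = "repro"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "torch",                               # pulls in triton == 3.1.0 transitively (assumption)
    "pytorch-triton == 3.2.0+git0d4682f0"  # the nightly-named equivalent, added explicitly
]

[[tool.uv.index]]
name = "pytorch-nightly"
url = "https://download.pytorch.org/whl/nightly/cu124"   # assumed CUDA variant

[tool.uv.sources]
pytorch-triton = { index = "pytorch-nightly" }

Since the resolver treats triton and pytorch-triton as unrelated names, both end up in the lock file and both get installed.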

flash-attn v2 vs v3

There are two versions of flash-attn available, both named flash-attn but with different version numbers. In a regular Python pip environment, it seems possible to install both:

$ pip list | grep flash
flash-attn                  2.7.2.post1
flash-attn                  3.0.0b1

In a uv environment, installing one replaces the other:

$ uv pip install git+https://github.com/Dao-AILab/flash-attention#subdirectory=hopper --no-build-isolation

$ uv pip install git+https://github.com/Dao-AILab/flash-attention --no-build-isolation
Resolved 25 packages in 146ms
Uninstalled 1 package in 39ms
Installed 1 package in 197ms
 - flash-attn==3.0.0b1 (from git+https://github.com/Dao-AILab/flash-attention@6b1d059eda21c1bd421f3d352786fca2cab61954#subdirectory=hopper)
 + flash-attn==2.7.3 (from git+https://github.com/Dao-AILab/flash-attention@6b1d059eda21c1bd421f3d352786fca2cab61954)

The replacement behavior is probably what makes the most sense, but currently it is necessary to have both installed to actually use FA3: Dao-AILab/flash-attention#1467

Platform

Linux amd64

Version

uv 0.5.22

tmm1 commented Jan 28, 2025

tl;dr my questions are:

  • is it possible to install two versions of the same package
  • is it possible to replace one package with another package of a different name

charliermarsh (Member) commented
> The replacement behavior is probably what makes the most sense, but currently it is necessary to have both installed to actually use FA3.

Unfortunately, no. There's no support in the Python runtime for having multiple versions of a single package installed. It's not possible. (You might be interested in Armin's multiversion from a decade ago as a guide on what breaks.)
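
A quick way to see the constraint (a sketch, assuming flash-attn is installed):

import importlib.metadata
import flash_attn

# Both distributions claim the name "flash-attn", but the import system resolves
# a single site-packages/flash_attn directory, and metadata lookup returns exactly
# one version string -- whichever install won.
print(flash_attn.__file__)
print(importlib.metadata.version("flash-attn"))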

tmm1 commented Jan 29, 2025

In this case the fa2 and fa3 packages are completely separate and have no overlapping files. Installing them side-by-side is expected and necessary to use tools like tritonbench or even to experiment with both in transformers.

Is there any way to force uv to do so? Or to alias names so it doesn't think they're the same package?

Similarly for the triton package, I'm wondering if there's a way to mask it out, i.e. to specify that it should not be installed at all (because the pytorch-triton nightly build is newer and does have overlapping files).

I also run into this issue where it isn't available on newer Python versions:

error: Distribution `triton==3.1.0 @ registry+https://pypi.org/simple` can't be installed because it doesn't have a source distribution or wheel for the current platform

hint: You're using CPython 3.13 (`cp313`), but `triton` (v3.1.0) only has wheels with the following Python ABI tags: `cp311`, `cp312`

tmm1 commented Jan 29, 2025

> if there's a way to mask it out

found #9174 (comment) and #4422 (comment)
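
Sketching the kind of thing those point at (whether this exact incantation is what's suggested there is an assumption on my part): uv's dependency overrides can replace a requirement with one whose marker never matches, which effectively masks the package out:

[tool.uv]
# assumption: override every triton requirement with a marker that can never be
# true, so the resolver drops it and only pytorch-triton gets installed
override-dependencies = [
    "triton; sys_platform == 'never'",
]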

charliermarsh (Member) commented
Is there anything you can link me to, to understand why you're supposed to install two flash-attention versions at the same time, or even that it's a recommended approach?

tmm1 commented Jan 29, 2025

I don't think it was necessarily designed that way, but to practically use these packages right now, it is necessary.

Basically, a lot of the sources in the flash-attn repo were copied into a ./hopper/ subdirectory, the version was bumped to 3.0.0b1, all the Python modules were removed, and then the CUDA kernels were modified.

So this 3.0 beta package just contains new CUDA code for newer GPUs and some basic Python interface code.

To effectively use the CUDA code, you need the rest of the supporting library, which is still only published in the flash-attn 2.x packages. So, for example, if you need `from flash_attn.bert_padding import pad_input` to prepare your inputs for the fa3 kernel, it doesn't work in the uv venv (because installing the 3.0 beta removed flash-attn 2.x).
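
Concretely, the usage pattern needs both distributions importable at once, roughly like this (the fa3 module name flash_attn_interface is what the hopper build installs, if I'm reading it right):

# helpers that only ship in the flash-attn 2.x distribution
from flash_attn.bert_padding import pad_input, unpad_input
# the new kernel interface from the hopper / 3.0.0b1 build (module name assumed)
from flash_attn_interface import flash_attn_varlen_func

Both imports only resolve if both distributions are present in the same environment.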

Another example is torchtitan, which benchmarks both approaches. It calls pip install -e . in both the main repo and the hopper subdirectory.

EDIT: related Dao-AILab/flash-attention#1457


As a workaround, I guess the simplest thing would be to fork Dao-AILab/flash-attention, change hopper/setup.py so the package name is unique, and then use that new git URL as my dependency.
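
Roughly like this (hypothetical name; the rest of the upstream setup() arguments left as-is):

# hopper/setup.py in the fork -- only the distribution name changes
from setuptools import setup

setup(
    name="flash-attn-hopper",  # hypothetical unique name; upstream publishes as "flash-attn"
    version="3.0.0b1",
    # ... remaining upstream arguments unchanged
)

The project dependency would then point at the fork's git URL with #subdirectory=hopper, and uv would see two distinct package names.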
