
flash-attn v2.6.3 + TORCH_CUDA_ARCH_LIST=8.0;8.6;8.9;9.0+PTX #10

Merged · 32 commits · Oct 16, 2024

Conversation

@regro-cf-autotick-bot (Contributor) commented Jul 24, 2024

It is very likely that the current package version for this feedstock is out of date.

Checklist before merging this PR:

  • Dependencies have been updated if changed: see upstream
  • Tests have passed
  • Updated license if changed and license_file is packaged

Information about this PR:

  1. Feel free to push to the bot's branch to update this PR if needed.
  2. The bot will almost always only open one PR per version.
  3. The bot will stop issuing PRs if more than 3 version bump PRs generated by the bot are open. If you don't want to package a particular version please close the PR.
  4. If you want these PRs to be merged automatically, make an issue with `@conda-forge-admin, please add bot automerge` in the title and merge the resulting PR. This command will add our bot automerge feature to your feedstock.
  5. If this PR was opened in error or needs to be updated please add the bot-rerun label to this PR. The bot will close this PR and schedule another one. If you do not have permissions to add this label, you can use the phrase @conda-forge-admin, please rerun bot in a PR comment to have the conda-forge-admin add it for you.

Pending Dependency Version Updates

Here is a list of all the pending dependency version updates for this repo. Please double check all dependencies before merging.

Name: pytorch-cpu (upstream and current versions were shown as Anaconda badges that did not render here)

Dependency Analysis

We couldn't run dependency analysis due to an internal error in the bot, depfinder, or grayskull. :/ Help is very welcome!

This PR was created by the regro-cf-autotick-bot. The regro-cf-autotick-bot is a service to automatically track the dependency graph, migrate packages, and propose package version updates for conda-forge. Feel free to drop us a line if there are any issues! This PR was generated by - please use this URL for debugging.

Closes #11
Closes #17

@conda-forge-webservices (bot, Contributor) commented Jul 24, 2024

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

@carterbox (Member)

Looks like we are exceeding the 6-hour limit even when building for only a single arch, so we need to finish the process of getting onto the Quansight GPU server.

@weiji14, could you please agree to the terms of use for the Quansight GPU servers by opening a pull request here: https://github.com/Quansight/open-gpu-server

Then we can open a request to have the bot enable the Quansight GPU server for this feedstock using this template: https://github.com/conda-forge/admin-requests/blob/main/examples/example-open-gpu-server.yml

Ref: https://conda-forge.org/docs/maintainer/knowledge_base/#packages-that-require-a-gpu-or-long-running-builds

@weiji14 (Member) commented Jul 25, 2024

Sure, let me work on that later. Recent builds have been too close to the 6-hour mark and I've had to restart CI for the past two releases already; it would also be good to handle more CUDA archs (xref #4). I'm not sure we actually need a GPU-enabled server though, maybe just a longer timeout (plus more CPU/RAM)?
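For context, the arch list in the PR title is the mechanism for covering more CUDA archs: PyTorch's extension build machinery reads TORCH_CUDA_ARCH_LIST to decide which SM targets to compile. A minimal sketch of how a build script might set it (the commented pip invocation is illustrative, not this feedstock's exact build.sh):

```shell
# Compile flash-attn kernels for Ampere (8.0/8.6), Ada (8.9), and Hopper (9.0);
# "+PTX" on the last entry also embeds PTX so newer GPUs can JIT at load time.
export TORCH_CUDA_ARCH_LIST="8.0;8.6;8.9;9.0+PTX"
echo "building for: ${TORCH_CUDA_ARCH_LIST}"
# python -m pip install . --no-deps --no-build-isolation -vv  # actual build step
```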

@carterbox (Member)

> Not sure if we actually need a GPU-enabled server though, maybe just a longer timeout (+more CPU/RAM)?

The only way to run builds longer than 6 hours is to use Quansight's servers. I'm not sure what options Quansight provides for servers with/without GPUs, but this is a CUDA-enabled package, so we should just use the GPU server list.

@carterbox (Member) commented Jul 29, 2024

TODO:

@carterbox (Member)

On the openstack runners, the jobs are crashing during the first compilation phase. Trying to reduce the number of concurrent compilations to 2 in order to prevent this crash (from a max of 4).
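MAX_JOBS is the knob being tuned here: PyTorch's cpp_extension/ninja build honors it as the number of parallel compiler jobs, and each flash-attn nvcc job can peak at several GB of RAM. A hypothetical sketch of sizing it from available memory instead of hard-coding a value (the 8 GB-per-job figure is an illustrative assumption, not a measurement from this feedstock):

```shell
# Derive MAX_JOBS from total system memory rather than CPU count, since
# flash-attn compiles tend to be memory-bound; clamp to at least one job.
mem_gb=$(awk '/MemTotal/ {printf "%d", $2 / 1024 / 1024}' /proc/meminfo)
export MAX_JOBS=$(( mem_gb / 8 ))   # assume ~8 GB peak RSS per nvcc job
[ "${MAX_JOBS}" -lt 1 ] && MAX_JOBS=1
echo "MAX_JOBS=${MAX_JOBS}"
```

On a 32 GB runner this heuristic would land near the MAX_JOBS=4 used later in the thread.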

@carterbox (Member)

Maybe they're not crashing? Maybe the jobs are just being killed at 30mins?

@carterbox carterbox added the automerge Merge the PR when CI passes label Jul 29, 2024
@carterbox (Member)

@conda-forge-admin, please rerender

@carterbox carterbox added automerge Merge the PR when CI passes and removed automerge Merge the PR when CI passes labels Jul 30, 2024
@weiji14 weiji14 changed the title flash-attn v2.6.3 + TORCH_CUDA_ARCH_LIST=8.0;8.6;8.9;9.0+PTX flash-attn v2.6.3 + python 3.13 + TORCH_CUDA_ARCH_LIST=8.0;8.6;8.9;9.0+PTX Oct 13, 2024
@conda-forge-admin (Contributor)

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

@weiji14 (Member) commented Oct 13, 2024

Thanks @jakirkham for the merge tips 🙏

The previous test build with CUDA 11.8 / Python 3.12 seems to have completed successfully on cirun-openstack-cpu-xlarge (32GB RAM) with MAX_JOBS=4 at commit 5878e8b in 8h6m. See logs at https://github.com/conda-forge/flash-attn-feedstock/actions/runs/11246603140/job/31367673505

I'm now testing CUDA 12.0 / Python 3.13 (cherry-picking from #17) at commit e4e877b, and if this build completes within the time limit, I think we should have enough info on what resources will be needed for the full matrix builds.

@weiji14 (Member) left a comment

Ok, are we ready for the full matrix builds?

Review threads on recipe/meta.yaml and .github/workflows/conda-build.yml (outdated, resolved)
@carterbox (Member) commented Oct 15, 2024

Sounds good to me! 8 hours per build seems reasonable.

@carterbox (Member)

@conda-forge-admin, please rerender

@carterbox (Member) left a comment

Wait. Setuptools is not a runtime dependency of this package. PyTorch needs to fix its missing/extra dependency, not us.

conda-forge-webservices[bot] and others added 2 commits October 15, 2024 16:19
@carterbox (Member)

@conda-forge-admin, please rerender

conda-forge-webservices[bot] and others added 2 commits October 15, 2024 16:25
@carterbox (Member)

@conda-forge-admin, please rerender

@carterbox carterbox changed the title flash-attn v2.6.3 + python 3.13 + TORCH_CUDA_ARCH_LIST=8.0;8.6;8.9;9.0+PTX flash-attn v2.6.3 + TORCH_CUDA_ARCH_LIST=8.0;8.6;8.9;9.0+PTX Oct 15, 2024
@jakirkham (Member)

Thanks Daniel! 🙏

Looks like cirun doesn't like the bot commit. So may need to push another commit

Unsure why rerender at 3051209 reverted to 2160 mins.
@weiji14 (Member) commented Oct 15, 2024

Ok, re-rendered manually using conda smithy rerender -c auto. I've set the timeout back to 9 hours.
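One possible explanation for the earlier revert: conda smithy rerender regenerates .github/workflows/ from the feedstock's conda-forge.yml, so a timeout edited directly in the rendered workflow gets overwritten on the next rerender. The setting has to live in conda-forge.yml itself; a sketch of the relevant fragment (the key name is taken from conda-smithy's conda-forge.yml schema and should be treated as an assumption to verify against the docs):

```yaml
# conda-forge.yml (feedstock root): sketch, not the actual file contents.
github_actions:
  timeout_minutes: 540  # 9 hours; values set here survive rerendering
```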

@weiji14 (Member) left a comment

Builds all pass now 🎉 Longest one was CUDA 11.8 / Python 3.12 that took 8h 26m 41s (under the 9h time limit).

Shall we wait for conda-forge/pytorch-cpu-feedstock#276, and then we can include the Python 3.13 builds here too?

skip: true # [cuda_compiler_version in (undefined, "None")]
skip: true # [not linux]
skip: true # [py==313] # Skip until pytorch dependency on setuptools is fixed
@weiji14 (Member) commented Oct 16, 2024

Remove after conda-forge/pytorch-cpu-feedstock#276 is merged (remember to re-render manually afterwards).

Suggested change (remove this line):
skip: true # [py==313] # Skip until pytorch dependency on setuptools is fixed

Reply from a Member:

We have to rebuild anyway to add those additional modules requested in #18; we can see whether PyTorch is ready for 3.13 then.

@carterbox carterbox merged commit bc32f15 into conda-forge:main Oct 16, 2024
10 checks passed
@regro-cf-autotick-bot regro-cf-autotick-bot deleted the 2.6.2_h62eaee branch October 16, 2024 04:27
@jakirkham (Member)

Hooray! 🥳

Excellent work 😄 Thank you both! 🙏
