Merge recent changes from ROCm xformers #1196
base: main
Conversation
- Avoid unused-const-variable warning
- … the fmd_bwd_kernel
- …_stride_dv parameters
- …erated reference headers
- [CK] Memory-efficient attention (Head Dimension = 512)
- Remove using splitkv kernel from fmha fwd training path
- …rs been suppressed
- Disable PagedAttn bias types and hdim-512 for test_logsumexp
- hotfix typo
- update to rocm 6.3 wheels
- Enable hdim=512 by default
- Further update to build hdim-512 by default
- Merge upstream into ROCM develop
Let's hold off a bit, I'm still working on merging the prior PR. Can we make sure all mem efficient tests are passing?
This reverts commit 84883b5.
I just pushed commit f858c so that the forward training path can still use the splitkv kernel. Without this, benchmark scripts that run #> pytest tests/test_mem_eff_attention.py::test_forward fail 14 bfloat16 cases even when …
Hi, is this ready to merge? We would like to do a new release soon (PT 2.6 is just out)
This PR updates
xformers/benchmarks/benchmark_attn_decoding.py
to make it work correctly for ck.FwOp. The following scripts are used to test/verify the changes.
The following script is used to benchmark/verify the performance of decoder with mqa/gqa using ck.FwOp
#> python xformers/benchmarks/benchmark_attn_decoding.py
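For reference, the computation being benchmarked is standard scaled-dot-product attention; the point of the CK changes above is supporting it at head dimension 512. The sketch below is a minimal NumPy reference of that computation, not the xformers/ck.FwOp API: the function name `attention_ref` and the shapes are illustrative assumptions.

```python
import numpy as np

def attention_ref(q, k, v):
    """Plain softmax attention over (batch, seq_len, num_heads, head_dim) tensors."""
    scale = 1.0 / np.sqrt(q.shape[-1])
    # Attention scores: (batch, num_heads, seq_q, seq_k)
    scores = np.einsum("bqhd,bkhd->bhqk", q, k) * scale
    # Subtract the row max before exponentiating for numerical stability
    scores -= scores.max(axis=-1, keepdims=True)
    probs = np.exp(scores)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Weighted sum of values, back to (batch, seq_len, num_heads, head_dim)
    return np.einsum("bhqk,bkhd->bqhd", probs, v)

rng = np.random.default_rng(0)
B, S, H, D = 2, 16, 4, 512  # D = 512 is the head dimension this PR enables
q, k, v = (rng.standard_normal((B, S, H, D)).astype(np.float32) for _ in range(3))
out = attention_ref(q, k, v)
print(out.shape)  # (2, 16, 4, 512)
```

The memory-efficient kernels compute the same result without materializing the full `probs` matrix, which is what makes large head dimensions and long sequences tractable on-device.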