
Mi50 Support #29

Open

YehowshuaScaled opened this issue Dec 31, 2023 · 3 comments

@YehowshuaScaled
I was able to build flash-attention ROCm for both my Mi100 and Mi50 cards, but only got flash attention working on the Mi100 (very impressive performance, I might add).

Trying to run flash attention on the Mi50 delivered the following error:
RuntimeError: DeviceGroupedMultiheadAttentionForward_Xdl_CShuffle_V2<256, 128, 128, 32, 8, 8, 128, 128, 32, 2, Default, ASpecDefault, B0SpecDefault, B1SpecDefault, CSpecDefault, MaskUpperTriangleFromTopLeft> does not support this problem

How hard would it be to port FA to the Mi50? I'm happy to pay/hire for support on this, as I have a rather large stockpile of Mi50s.
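In the meantime, here's a rough workaround I've been considering (a sketch only, assuming flash-attn's public flash_attn_func API; the try/except fallback to PyTorch's built-in attention is my own idea, not anything the maintainers recommend):

```python
# Sketch only: fall back to plain PyTorch attention when the CK-based
# flash-attn kernel rejects the problem (as it does here on the Mi50 / gfx906).
# flash_attn_func is flash-attn's public API; the try/except fallback is my
# own workaround, not an officially supported path.
import torch.nn.functional as F
from flash_attn import flash_attn_func

def attention_with_fallback(q, k, v, causal=True):
    # q, k, v: (batch, seqlen, nheads, headdim) fp16/bf16 tensors on the GPU
    try:
        return flash_attn_func(q, k, v, causal=causal)
    except RuntimeError:
        # CK raises "... does not support this problem" on unsupported archs;
        # use the standard (memory-hungry) attention path instead.
        q_, k_, v_ = (t.transpose(1, 2) for t in (q, k, v))  # -> (b, nheads, seqlen, headdim)
        out = F.scaled_dot_product_attention(q_, k_, v_, is_causal=causal)
        return out.transpose(1, 2)  # back to (batch, seqlen, nheads, headdim)
```

Obviously that loses the memory and speed benefits of FA on the Mi50s, so a real port is still what I'm after.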

@jayz0123 commented Jan 23, 2024

Hi @YehowshuaScaled. I think it would be better to ask the CK team whether they plan to support the MI50. It won't be an issue if they have FA kernels running on the MI50.

RuntimeError: DeviceGroupedMultiheadAttentionForward_Xdl_CShuffle_V2<256, 128, 128, 32, 8, 8, 128, 128, 32, 2, Default, ASpecDefault, B0SpecDefault, B1SpecDefault, CSpecDefault, MaskUpperTriangleFromTopLeft> does not support this problem

This error is actually raised from the CK backend.

@differentprogramming

I noticed this line in setup.py:
allowed_archs = ["native", "gfx90a", "gfx908", "gfx940", "gfx941", "gfx942"]
I'm sad that gfx906 isn't there, since I have an MI50 as well.
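For anyone else checking their card, a quick sketch of how you can verify the architecture name (gcnArchName is what ROCm builds of PyTorch expose, as far as I know; the list below just mirrors the setup.py line quoted above):

```python
# Sketch only: check whether the local GPU's architecture matches the list
# that flash-attention's setup.py builds for. gcnArchName is exposed by ROCm
# builds of PyTorch; an Mi50 reports gfx906, which is not in the list, so no
# kernels get compiled for it even though the build itself succeeds.
import torch

allowed_archs = ["native", "gfx90a", "gfx908", "gfx940", "gfx941", "gfx942"]

if torch.cuda.is_available():
    arch = torch.cuda.get_device_properties(0).gcnArchName  # e.g. "gfx906:sramecc+:xnack-"
    base = arch.split(":")[0]  # drop feature flags, keep "gfx906"
    print(f"{base} supported by this build: {base in allowed_archs}")
```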

@linchen111

I was able to build flash-attention ROCm for both my Mi100 and Mi50 cards, but only got flash attention working on the Mi100 (very impressive performance, I might add).

Trying to run flash attention on the Mi50 delivered the following error:
RuntimeError: DeviceGroupedMultiheadAttentionForward_Xdl_CShuffle_V2<256, 128, 128, 32, 8, 8, 128, 128, 32, 2, Default, ASpecDefault, B0SpecDefault, B1SpecDefault, CSpecDefault, MaskUpperTriangleFromTopLeft> does not support this problem

How hard would it be to port FA to the Mi50? I'm happy to pay/hire for support on this, as I have a rather large stockpile of Mi50s.

Did you solve this?
