[AMD] [FA] Hoist convert_layout to dotOp for Q out of the loop #6017

Merged
merged 3 commits into from
Feb 26, 2025

@zhanglx13 zhanglx13 commented Feb 25, 2025

This PR adds a new AMD pass that hoists the convert_layout to the dotOperand layout for the Q tensor out of the loop. As a result, the Q tensor stays in registers instead of being loaded on every iteration of the loop.

This PR achieves the same goal as #4901. However, #4901 does not hoist the local_load for Q in the epilogue, so the Q tensor stays live in shared memory the whole time.
This PR, in contrast, does the hoisting before the stream-pipeline pass, so the liveness of the Q tensor in shared memory is limited to the prologue.
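
The idea of hoisting a loop-invariant conversion can be illustrated with a toy model in plain Python. This is only a sketch and is not Triton code or the pass itself: `convert_layout` here is a hypothetical stand-in that counts how often it is called, and `dot`, `scores_in_loop`, `scores_hoisted`, and `count_conversions` are made-up names for illustration. It shows that, because Q does not change across iterations, converting it once before the loop gives the same results with a single conversion.

```python
# Toy model of hoisting a loop-invariant conversion. Not Triton code.
calls = {"convert_layout": 0}


def convert_layout(q):
    """Hypothetical stand-in for converting Q to the dot-operand layout."""
    calls["convert_layout"] += 1
    return tuple(q)  # pretend the tuple is the register-resident form


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def scores_in_loop(q, k_blocks):
    # Before: Q is converted inside the loop, once per K block.
    return [dot(convert_layout(q), k) for k in k_blocks]


def scores_hoisted(q, k_blocks):
    # After: Q is loop-invariant, so it is converted once before the loop.
    q_op = convert_layout(q)
    return [dot(q_op, k) for k in k_blocks]


def count_conversions(fn, q, k_blocks):
    """Run fn and report its result plus the number of conversions it made."""
    calls["convert_layout"] = 0
    result = fn(q, k_blocks)
    return result, calls["convert_layout"]
```

With three K blocks, `count_conversions(scores_in_loop, ...)` reports three conversions while `count_conversions(scores_hoisted, ...)` reports one, and both return the same scores.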

@sjw36 sjw36 left a comment

Looks good and much simpler. Thanks!

@antiagainst antiagainst marked this pull request as ready for review February 26, 2025 16:00
@antiagainst antiagainst requested a review from ptillet as a code owner February 26, 2025 16:00
@zhanglx13 zhanglx13 merged commit e24d693 into triton-lang:main Feb 26, 2025
7 checks passed
@antiagainst antiagainst deleted the hoist_cvt branch February 28, 2025 19:06