[Backend] Codegen for ttg.warp_specialize #5968

Merged
Mogball merged 87 commits into main from mogball/ws4
Feb 27, 2025

Collaborator

@Mogball Mogball commented Feb 20, 2025

This PR primarily implements ConvertWarpSpecializeToLLVM, a pass that runs after all other LLVM conversions and rewrites each warp-specialized function, removing its ttg.warp_specialize ops. The pass generates synchronization by putting all other warp groups into a waiting loop, where each warp waits for a state ID to be populated into shared memory. The ID identifies the switch case the warp should branch to.
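
The waiting loop described above can be illustrated with a small CPU-side simulation. This is only a sketch under stated assumptions, not the generated LLVM: Python threads stand in for worker warps, a list stands in for the shared-memory state slots, and the barrier placement, the `EXIT` sentinel, and the names `run_kernel` and `regions` are all hypothetical.

```python
import threading

NUM_WORKER_WARPS = 4
EXIT = -1  # hypothetical sentinel meaning "kernel is done, leave the loop"


def run_kernel(regions):
    """Simulate the dispatch loop.

    `regions` has one entry per warp-specialized region; each entry maps a
    worker warp index to the state ID (switch case) that warp should run.
    """
    state = [EXIT] * NUM_WORKER_WARPS  # stands in for the shared-memory slots
    barrier = threading.Barrier(NUM_WORKER_WARPS + 1)  # workers + default group
    trace, trace_lock = [], threading.Lock()

    def worker(warp):
        while True:
            barrier.wait()  # wait until a state ID has been published
            state_id = state[warp]
            if state_id == EXIT:
                return
            # The "switch": a real kernel would branch to the partition body.
            with trace_lock:
                trace.append((warp, state_id))
            barrier.wait()  # report completion, then go back to waiting

    threads = [
        threading.Thread(target=worker, args=(w,)) for w in range(NUM_WORKER_WARPS)
    ]
    for t in threads:
        t.start()

    for region in regions:  # the default warp group drives the loop
        for warp in range(NUM_WORKER_WARPS):
            state[warp] = region[warp]
        barrier.wait()  # release the workers into their switch cases
        barrier.wait()  # join: every partition has finished
    for warp in range(NUM_WORKER_WARPS):
        state[warp] = EXIT
    barrier.wait()  # final release so the workers can exit
    for t in threads:
        t.join()
    return sorted(trace)
```

Calling `run_kernel([{0: 0, 1: 0, 2: 1, 3: 1}])` sends worker warps 0 and 1 to case 0 and warps 2 and 3 to case 1. The returned trace lists which case each warp executed.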

This PR also does a few other things:

  • Correctly plumbs ttg.total-num-warps through the compiler so that warp specialization works correctly in the frontend
  • Starts building a very basic model for storing scalar types into shared memory (tightly packed) that can be generalized as needed
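
To make "tightly packed" concrete, here is a minimal sketch of how byte offsets could be assigned when scalars are stored back to back with no alignment padding. The helper `pack_tightly` and the example fields are hypothetical. The 8-byte pointer size is an assumption that is consistent with the `size = 8` remark in this PR's test.

```python
def pack_tightly(fields):
    """Assign byte offsets to (name, size_in_bytes) pairs with no padding."""
    offsets, cursor = {}, 0
    for name, size in fields:
        offsets[name] = cursor  # each field starts where the previous one ended
        cursor += size
    return offsets, cursor


# Hypothetical values to store: a 1-byte flag, a 4-byte integer, and a pointer.
captures = [("flag", 1), ("count", 4), ("ptr", 8)]
offsets, total = pack_tightly(captures)
```

With these sizes the fields land at offsets 0, 1, and 5, for 13 bytes in total. A naturally aligned layout of the same fields would take 16 bytes.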

This is in preparation for warp specialization, which will turn the
number of warps into a scoped property of regions.

This PR just rearranges the API for looking up the number of warps. In
the next PR, the `"ttg.num-warps"` attribute will be moved to `tt.func`.
Warp specialization will cause these to become relative to the current
warpgroup, so funnel all the code through a set of common APIs.
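
One way to picture a scoped lookup is a walk up the parent chain that stops at the first scope defining a warp count. The sketch below is a toy model, not Triton's API: the `Op` class, `lookup_num_warps`, and the idea of a partition scope carrying its own `"ttg.num-warps"` entry are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Op:
    """Toy stand-in for an IR operation with attributes and a parent."""
    name: str
    attrs: dict = field(default_factory=dict)
    parent: Optional["Op"] = None


def lookup_num_warps(op):
    """Return the warp count of the nearest enclosing scope that defines one."""
    scope = op
    while scope is not None:
        if "ttg.num-warps" in scope.attrs:
            return scope.attrs["ttg.num-warps"]
        scope = scope.parent
    raise LookupError(f"no enclosing scope of {op.name!r} defines ttg.num-warps")


module = Op("module")
func = Op("tt.func", {"ttg.num-warps": 4}, module)
partition = Op("partition", {"ttg.num-warps": 2}, func)  # hypothetical scoped override
load_in_func = Op("tt.load", parent=func)
load_in_partition = Op("tt.load", parent=partition)
```

Here `lookup_num_warps(load_in_func)` resolves to the function's value, while `lookup_num_warps(load_in_partition)` resolves to the partition's. Funneling every query through one such helper lets the answer become relative to the current warp group without touching the callers.
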
@Mogball Mogball changed the title from "[WIP][DNR] Codegen for ttg.warp_specialize" to "[Backend] Codegen for ttg.warp_specialize" Feb 25, 2025
@Mogball Mogball requested a review from ThomasRaoux February 25, 2025 18:53
@Mogball Mogball marked this pull request as ready for review February 25, 2025 18:53
Base automatically changed from mogball/ws3 to main February 25, 2025 20:10
Collaborator

@ThomasRaoux ThomasRaoux left a comment

Looks great, just a few minor questions. I also wonder whether we really need the LLVM data layout, since the rules for shared memory allocation are generally pretty simple.

Contributor

@Jokeren Jokeren left a comment

Do you still use data layout in the code? If not, can you update your PR description?

@@ -875,4 +889,26 @@ tt.func @two_different_ws() {
tt.return
}

// expected-remark @below {{ptr_allocation_datalayout}}
// expected-remark @below {{size = 8}}
tt.func @ptr_allocation_datalayout(%arg0: !tt.ptr<i32>) {
Contributor

Do you still need this test?

Collaborator Author

Yes, but now it's just testing that PointerType can have its size queried (just not through llvm::DataLayout)
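
As a rough illustration of querying a type's size without going through a data layout object, the sketch below derives the byte size directly from the type's spelling. It is not the code in this PR: `scalar_size_in_bytes`, the string-based type syntax, the 64-bit pointer size, and the rounding of sub-byte types are all assumptions.

```python
import re

POINTER_BYTES = 8  # assumption: 64-bit pointers, consistent with `size = 8` in the test


def scalar_size_in_bytes(type_str):
    """Return the storage size of a scalar type written in MLIR-like syntax."""
    if type_str.startswith("!tt.ptr<"):
        return POINTER_BYTES
    match = re.fullmatch(r"[if](\d+)", type_str)
    if match is None:
        raise ValueError(f"unsupported scalar type: {type_str}")
    bits = int(match.group(1))
    return (bits + 7) // 8  # assumption: sub-byte types such as i1 occupy one byte
```

For example, `scalar_size_in_bytes("!tt.ptr<i32>")` gives 8 regardless of the pointee type, and `scalar_size_in_bytes("f16")` gives 2.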

Collaborator Author

Mogball commented Feb 26, 2025

> Do you still use data layout in the code? If not, can you update your PR description?

Ah yes, let me update that.

Collaborator

@ThomasRaoux ThomasRaoux left a comment

LGTM

@Mogball Mogball merged commit fa2a8f3 into main Feb 27, 2025
7 checks passed
@Mogball Mogball deleted the mogball/ws4 branch February 27, 2025 19:04