[Backend] Codegen for `ttg.warp_specialize` #5968

Mogball · 2025-02-20T00:54:00Z

This PR primarily implements `ConvertWarpSpecializeToLLVM`, a pass that runs after all other LLVM conversions and rewrites each warp-specialized function to remove its `ttg.warp_specialize` ops. The pass generates synchronization by putting all other warp groups into a waiting loop, in which each warp waits for a state ID to be written to shared memory. The ID identifies the switch case the warp should branch to.
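
The waiting loop described above can be modeled on the host as a rough sketch. Everything here is illustrative: the names `runWorkerWarp`, `kIdle`, and `kExit`, the sentinel values, and the idea of an explicit exit state are assumptions and are not taken from the pass itself. The scripted `observed` vector stands in for the values a worker warp would read from its shared-memory slot over time.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical state IDs; the real encoding is not given in the PR text.
constexpr int32_t kIdle = -1; // nothing published yet: keep waiting
constexpr int32_t kExit = -2; // assumed sentinel that ends the waiting loop

// Models one worker warp's waiting loop against a scripted sequence of values
// observed in its shared-memory slot, and returns a trace of what it ran.
std::vector<std::string> runWorkerWarp(const std::vector<int32_t> &observed) {
  std::vector<std::string> trace;
  for (int32_t state : observed) {
    if (state == kIdle)
      continue; // slot not populated yet: go around the loop again
    if (state == kExit) {
      trace.push_back("exit");
      break;
    }
    // The state ID selects the switch case the warp branches to.
    switch (state) {
    case 0:
      trace.push_back("partition0");
      break;
    case 1:
      trace.push_back("partition1");
      break;
    default:
      trace.push_back("unknown");
      break;
    }
  }
  return trace;
}
```

In this model a warp that observes `{kIdle, 1, kExit}` spins once, runs the second partition, and then leaves the loop. The real pass emits the equivalent control flow in LLVM IR together with the barriers that make the shared-memory read safe, which this sketch omits.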

This PR also does a few other things:

- Correctly plumbs `ttg.total-num-warps` through the compiler so that warp specialization works correctly in the frontend
- Starts building a very basic model for storing scalar types into shared memory (tightly packed) that can be generalized as needed
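
One way to picture the tightly packed scalar model is the sketch below. It assumes "tightly packed" means each scalar starts where the previous one ended, with no alignment padding. `PackedLayout` and `packScalars` are hypothetical names and are not the API added in this PR.

```cpp
#include <cstddef>
#include <vector>

// Result of packing a list of scalars into one shared-memory buffer.
struct PackedLayout {
  std::vector<std::size_t> offsets; // byte offset of each scalar
  std::size_t totalBytes = 0;       // size of the whole packed buffer
};

// Assumption: no padding is inserted, so every offset is the running sum of
// the sizes that came before it.
PackedLayout packScalars(const std::vector<std::size_t> &sizesInBytes) {
  PackedLayout layout;
  for (std::size_t size : sizesInBytes) {
    layout.offsets.push_back(layout.totalBytes);
    layout.totalBytes += size;
  }
  return layout;
}
```

For example, packing sizes `{4, 8, 1, 2}` (say an `i32`, an 8-byte pointer, an `i8`, and an `i16`) would place them at offsets 0, 4, 12, and 13 for a total of 15 bytes. The 8-byte pointer size matches the `size = 8` remark in the allocation test discussed in the review.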

This is in preparation for warp specialization, which will turn the number of warps into a scoped property of regions. This PR just rearranges the API for looking up the number of warps. In the next PR, the `"ttg.num-warps"` attribute will be moved to `tt.func`.
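
As a rough illustration of what "a scoped property of regions" could mean for lookups, the sketch below resolves the warp count by walking outward from the innermost scope until one defines it. The `Scope` struct and `lookupNumWarps` are hypothetical stand-ins, not the MLIR types or the lookup API that the PR rearranges.

```cpp
#include <optional>

// Hypothetical stand-in for an IR scope (region, function, or module).
struct Scope {
  const Scope *parent = nullptr;
  std::optional<int> numWarps; // set only where the scope defines a count
};

// Returns the warp count of the nearest enclosing scope that defines one,
// or std::nullopt if no scope on the path does.
std::optional<int> lookupNumWarps(const Scope *scope) {
  for (; scope; scope = scope->parent)
    if (scope->numWarps)
      return scope->numWarps;
  return std::nullopt;
}
```

Routing every query through a single function like this is what lets the attribute move (for example from the module to `tt.func`) without touching each call site.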

Warp specialization will cause thread, lane, and warp IDs to become relative to the current warp group, so all the code that creates them is funneled through a set of common APIs.
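
A minimal sketch of such a common helper for thread, lane, and warp IDs, assuming 32-thread warps (as on NVIDIA GPUs) and a warp group that occupies a contiguous range of warps. The names `relativeIds` and `firstWarp` are hypothetical and do not come from the PR.

```cpp
// Warps have 32 threads on NVIDIA GPUs; other targets may differ.
constexpr int kThreadsPerWarp = 32;

struct ThreadIds {
  int threadId; // thread index within the current warp group
  int warpId;   // warp index within the current warp group
  int laneId;   // lane index within the warp
};

// firstWarp: hypothetical index of the first warp of the current warp group.
// Subtracting the group's starting thread makes all three IDs group-relative.
ThreadIds relativeIds(int absoluteThreadId, int firstWarp) {
  int tid = absoluteThreadId - firstWarp * kThreadsPerWarp;
  return {tid, tid / kThreadsPerWarp, tid % kThreadsPerWarp};
}
```

With a group that starts at warp 4, absolute thread 200 maps to relative thread 72, which is lane 8 of the group's warp 2. Code that computes IDs directly from the hardware thread index would get 200, 6, and 8 instead, which is why a shared helper matters once warp groups exist.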

ThomasRaoux

Looks great, just a few minor questions. I also wonder whether we really need the LLVM data layout, since the rules for shared memory allocation are generally pretty simple.

lib/Analysis/AxisInfo.cpp

lib/Dialect/TritonGPU/IR/Types.cpp

third_party/nvidia/lib/TritonNVIDIAGPUToLLVM/ConvertWarpSpecializeToLLVM.cpp

test/Analysis/test-allocation.mlir

Jokeren

Do you still use data layout in the code? If not, can you update your PR description?

Jokeren · 2025-02-26T21:38:58Z

test/Analysis/test-allocation.mlir

@@ -875,4 +889,26 @@ tt.func @two_different_ws() {
  tt.return
 }

+// expected-remark @below {{ptr_allocation_datalayout}}
+// expected-remark @below {{size = 8}}
+tt.func @ptr_allocation_datalayout(%arg0: !tt.ptr<i32>) {


Do you still need this test?

Yes, but now it just tests that `PointerType` can have its size queried (just not through `llvm::DataLayout`).

Mogball · 2025-02-26T21:41:08Z

> Do you still use data layout in the code? If not, can you update your PR description?

Ah yes, let me update that.

ThomasRaoux

LGTM

Mogball added 30 commits February 11, 2025 15:48

wip num-warps

4e7b21b

fmt

9c72799

Merge remote-tracking branch 'origin/main' into mogball/ws

519edc8

clean up thread ID access

6455147

start refactoring

d77d445

[BACKEND] Refactor how thread/lane/warp IDs are created (NFC)

4434fe4

Warp specialization will cause these to become relative to the current warpgroup, so funnel all the code through a set of common APIs.

Merge remote-tracking branch 'origin/main' into mogball/ws1

d601651

start

0ce599b

remove emitHardwareTuple

a92a438

Merge branch 'mogball/ws1' into mogball/ws2

9b8e05d

x

f133abb

Merge remote-tracking branch 'origin/main' into mogball/ws1

84ae862

merge main

280f10e

Merge branch 'mogball/ws1' into mogball/ws2

91b9af6

add ttg.warp_specialize op

4583b53

clean up warp attrs

d65a38d

add tests for layout invariants

384b321

more tests

a86fe07

add op documentations

b7e2a8c

fix pass defs

6bb80ef

x

db724f8

Merge remote-tracking branch 'origin/main' into mogball/ws2

2e13070

Merge branch 'mogball/ws2' into mogball/ws3

15240b1

finish writing pass

90343df

add test

a409ef0

more refactoring

67fe223

relative threadid

cbfce0e

rewrite allocation tests

81700bd

x

db30ead

Mogball added 9 commits February 24, 2025 16:50

fix type conversion

70c84d9

integration tests

915fc2d

another integration test

124d6c9

Merge remote-tracking branch 'origin/main' into mogball/ws3

08a3cee

Merge branch 'mogball/ws3' into mogball/ws4

39cb456

fix things

4d20668

fmt

6b78ebd

document virtual block

3155489

Merge branch 'mogball/ws3' into mogball/ws4

a6c0ac7

Mogball changed the title ~~[WIP][DNR] Codegen for ttg.warp_specialize~~ [Backend] Codegen for ttg.warp_specialize Feb 25, 2025

Mogball requested a review from ThomasRaoux February 25, 2025 18:53

Mogball marked this pull request as ready for review February 25, 2025 18:53

Mogball requested review from antiagainst, zhanglx13, ptillet and Jokeren as code owners February 25, 2025 18:53

Mogball added 3 commits February 25, 2025 11:03

fudge some unifomity

c32e75e

skip test on AMD

20acb1d

run on A100 anyways

d9a27c9

Base automatically changed from mogball/ws3 to main February 25, 2025 20:10

Merge remote-tracking branch 'origin/main' into mogball/ws4

ab92c09

ThomasRaoux reviewed Feb 26, 2025


Mogball added 3 commits February 26, 2025 13:21

strip out DataLayout plumbing

095963d

add desc

db5df4a

fmt

e92ecff

Jokeren reviewed Feb 26, 2025

ThomasRaoux approved these changes Feb 27, 2025

Mogball merged commit fa2a8f3 into main Feb 27, 2025
7 checks passed

Mogball deleted the mogball/ws4 branch February 27, 2025 19:04
