-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SWP] Print recurring dependencies when reporting scheduling conflicts #5375
base: main
Are you sure you want to change the base?
Conversation
lib/Dialect/TritonGPU/Transforms/Pipeliner/PipelineExpander.cpp
Outdated
Show resolved
Hide resolved
// } | ||
// } | ||
|
||
DenseSet<Operation *> findRootDefiningOp(Operation *op, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought what we had discussed was to have this logic separate from the transformation code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Would it be acceptable to encapsulate this logic into a separate class and move it to a new source file? This way, we can maintain a clear separation between the transformation code and this logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what part of this file do you need? Is it just the check to know if the schedule is valide?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if I get your question, but this PR does not introduce more checks. This PR provides detailed error message when distance = 1, per discussions with Pawel. findRootDefiningOp
is to support the detailed error message.
I can get your point to separate checks from transformation. For this PR, how about I put related impl of printSchedulingError
into an separate file such as ErrorPrinter.cpp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to understand what is exactly needed here. Do we want a detailed error message for compiler developers to help them fix the issue, or just a message to the user that the compiler failed to fulfill a schedule?
For the former, we may need a message tied to the transformation and the current implementation looks helpful.
For the latter we might need more simplified messaging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point was that it would be good to do this outside of the expander rather. It doesn't look like it needs to reuse much so this could be implemented as a separate verification + diagnostic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that we should make this independent of the expander. Expander is supposed to stay close to the LLVM upstream implementation, and most likely replaced by it someday. The new error is more informative and actionable, which is great, but it would be great if we could make it independent from the expander. Perhaps a verification pass that we pass the schedule vector to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pawelszczerbuk I see. Passing the schedule vector to a separate pass makes sense to me. Do you want to refactor the existing verification part in the expander to a new pass as well? Right now I am only adding an extra error reporting to the verification. My PR lacks some comments - it looks complicated but they just collect more information for the error message. There's no extra verification.
Thank you all for the discussion! Anyhow, let me first improve my code quality: 1) adding more comments and 2) moving into a separate file. Then I will try to further separate them into a pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sfzhu93 ! Let's not modify the expander for now. If we'll do it, it should be separate PR that is in sync with change in the upstream LLVM expander. It can come later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pawelszczerbuk Thanks! I have cleaned up the code and I have minimum change to the expander. The PR is not yet finished - to make the error more informative, the error reporter will likely depend on the schedule of each Op as well.
I can further make it into a separate pass if you feel it is necessary.
lib/Dialect/TritonGPU/Transforms/Pipeliner/PipelineExpander.cpp
Outdated
Show resolved
Hide resolved
23fa565
to
1fb19f7
Compare
include/triton/Dialect/TritonGPU/Transforms/PipelineErrorReporter.h
Outdated
Show resolved
Hide resolved
lib/Dialect/TritonGPU/Transforms/Pipeliner/PipelineErrorReporter.cpp
Outdated
Show resolved
Hide resolved
49b9517
to
30ed20c
Compare
30ed20c
to
11930a8
Compare
@@ -264,6 +265,9 @@ bool LoopPipelinerInternal::verifySchedule() { | |||
diag.attachNote(producer->getLoc()) | |||
.append("operand defined here: ") | |||
.appendOp(*producer, OpPrintingFlags().printGenericOpForm()); | |||
PipelineErrorReporter errorReporter(forOp, maxStage + 1, stages); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pawelszczerbuk @Mogball I noticed your PRs #5726 and #5867. I left them unchanged in case that are related to your internal workload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I can check if MLIR_ENABLE_DIAGNOSTICS=operations
is set here to include the ops in the notes. Maybe do it in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am okay with this patch in general, since it is isolated. We can refine the logic later on as we hit more cases where SWP failed to work due to "operation scheduled before its operands".
This PR enhances error messaging for scheduling conflicts in software pipelining, providing clearer guidance for debugging issues. Users will now receive prompts to reposition the producer earlier in the loop body to facilitate pipelining.
Previously, error messages only indicated that a consumer was scheduled before its producer, which was insufficient for effective debugging. Additionally, the location of the producer's definition was unclear, especially when the actual producer is inside an SCF IfOp.
This PR improves error tracing by backtracking the data flow into nested MLIR blocks of IfOps to identify the root definition of the producer and the root user of the consumer. This is especially useful for persistent kernels, as demonstrated in #5172, which is now included as a unit test. Another example involving a fused persistent matmul is also included as a test. Many users adapt persistent kernels from the official tutorial for their implementations, and this PR is useful for debugging.
This PR currently tracks only IfOps. We leave other cases as future work.
New contributor declaration
I am not making a trivial change, such as fixing a typo in a comment.
I have written a PR description following these
rules.
I have run
pre-commit run --from-ref origin/main --to-ref HEAD
.Select one of the following.
/python/test
for end-to-end testsFILL THIS IN
.Select one of the following.
lit
tests.lit
tests I have added follow these best practices,including the "tests should be minimal" section. (Usually running Python code
and using the instructions it generates is not minimal.)