-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Handshake] Support for extra signals in operations #225
base: main
Are you sure you want to change the base?
Conversation
mlir::Type resultType = $_op->getResult(0).getType(); | ||
|
||
for (auto operand : operands) | ||
if (operand.getType() != resultType) | ||
return concreteOp.emitOpError("operand has type ") << operand.getType() | ||
<< ", but result has type " << resultType; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This type checking between operands and result is (no longer) unnecessary here.
For CMerge and Mux, the type is no longer the same between operands and result.
For MergeOp, SameOperandsAndResultType
is already constrained
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to check if the operands' data fields all have the same type? I guess we still want to avoid situations like channel<i32, [spec : i1]> and channel
Update: Now I see that you have some other constraints to check that
@lucas-rami (CC @murphe67 ) Thank you for reviewing this pull request. (1) New assembly format for mux / control_merge The extra signals of the inputs of these operations can now be uneven, so we need to explicitly specify all input types in the IR. For better readability, I propose the following format:
Let me know your thoughts on this. Once we agree, I’ll update the rest of the existing unit tests accordingly. (2) PredOpTraits like I noticed you’ve implemented OpInterfaces like
I'd like to hear your opinion on this approach. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the hard work Shun!
I would love a nice markdown file describing the type constraints on each operation we can add to the documentation folder?
Other than that I have a couple comments on function names, variables and comments. :)
22eefaf
to
575fd80
Compare
@murphe67 I added a document, so I'd be glad if you could review it as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I absolutely love the document Shun, thank you! I have some superficial opinions, as always ahahahaha.
@AyaElAkhras In the latest commit, I updated the newly created document for this PR to include the implementation description in Section 2. It's still a draft, so the structure or details might change, but the information is mostly finalized. Please take a look, and feel free to share any suggestions! Link: https://github.com/EPFL-LAP/dynamatic/blob/ops-with-extra-signals/docs/ExtraSignalsForOperations.md |
def StoreOp : Handshake_MemPortOp<"store", [ | ||
AllExtraSignalsMatch<["address", "data"]>, | ||
// In StoreOp, addressResult and dataResult are connected to a memory controller. | ||
IsSimpleHandshake<"addressResult">, | ||
IsSimpleHandshake<"dataResult"> | ||
], []> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized StoreOp is usually not inside a speculative region (or an out-of-order region I guess). Should I just constrain all the operands/results to be simple?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on if @AyaElAkhras has tagged stores?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my old implementation, I had a somehow pointless tagged store that takes extra tag ports associated with the address and data but does nothing with them internally because the MC and LSQ (so far) do not have support for tags. I only added them to make the interface consistent (when the components calculating the address and data are tagged).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we have two options: 1) enforce that tags cannot reach stores while we work on speculation, since it should not happen under our current implementation, and then you can edit the validation to what actually makes sense for you in the new implementation or 2) allow tags to reach stores now, and assume the same kind-of-pointless tagged stores will exist.
@shundroid, could we add a validation that specifically spec tags should not reach stores somehow? maybe in the algorithm that adds the spec tag to the IR, but not part of the core validation system?
On the other hand, if we add it to the core validation system, if anyone runs an optimization pass after speculation, they will know if they have broken speculation or not, which may be a good thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we add a validation that specifically spec tags should not reach stores somehow? maybe in the algorithm that adds the spec tag to the IR, but not part of the core validation system?
Yes, it's easy. I'll add it.
when the components calculating the address and data are tagged
@AyaElAkhras Does this happen when the store op is placed between the tagger and untagger but after the aligner (as described in the paper)?
If store ops must be in the "tagged" region, I'd go with Emmet's option 2), even if the tags are ineffective. But if the data and address operands can be easily made tagless, I'd prefer option 1).
On the other hand, if we add it to the core validation system, if anyone runs an optimization pass after speculation, they will know if they have broken speculation or not, which may be a good thing.
For this use case, maybe we want to add a trait like OnlyOutsideSpecRegion
in the future to ensure that none of the operands have spec tag (if we take option 12 for now).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think option 2 with the check inside the algorithm makes the most sense for now, but am pretty open. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry guys, I got swamped with a completely unexpected issue. I hope it gets resolved by tonight and I will study what I missed then and let you know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finally managed to process this (apologies again for the delay!). From the document, I guess you ended up with option 2)? If yes, it aligns with my old implementation, so I'm happy :D
@AyaElAkhras Does this happen when the store op is placed between the tagger and untagger but after the aligner
(as described in the paper)?
Yes.
Thanks a lot @shundroid! I will process it as soon as I can and get back to you. |
I realized this was confusing, so I updated the implementation to define only |
I noticed some operations in Since this PR already has a large diff, I'll update those operations and remove |
As far as my opinion matters, I think the doc file you provided is a great source of information! Thanks for the work :) |
Sorry guys, I got swamped with a completely unexpected issue. I hope it gets resolved by tonight and I will study what I missed then and let you know. |
Thanks a lot @shundroid and @murphe67 for the clear and nice presentation and for all the effort! I processed the document and left a few clarification questions and suggestions. Other than those, I think everything is perfect :) |
Thank you, @AyaElAkhras ! I can't see your comments, are they still in draft? Could you click the Submit Review button? |
|
||
- The data output includes an extra signal `A` if, and only if, at least one of the inputs carries the extra signal `A`. | ||
|
||
The selector input (for Mux) or the output (for CMerge) is kept simple, meaning it does not carry any extra signals. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also caught my attention. In FPGA'24, we have a tagged Mux and a tagged CMerge, and for those, the select carries a tag signal as well. If I understood it right, this will not currently work, right? How hard it is to make it work?
|
||
The data output has `spec: i1` and `tag: i8` because some inputs have them, and nothing else. | ||
|
||
The specification for the output extra signals implies that if an input is selected but lacks a specific extra signal present in other inputs, the Mux or CMerge must provide the value of the missing extra signal for the output. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure I understand what this means:
- If the output of the Mux has an additional
tag
signal andin0
of the Mux does not have this signal and is chosen to be passed to the output, then the Mux will have some default value to put in the outputtag
signal? - I guess in [TypeSystem] Converting untagged values to default tag #226 we were also speaking of Source nodes, does this apply to them as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes- exactly what values the mux/cmerge/source provides will be covered in the signal manager prs, like #274
|
||
### MuxOp and CMergeOp | ||
|
||
These operations may have different extra signals for each input because they typically reside at the boundary of a basic block, receiving inputs from various blocks. For instance, the extra signals on the inputs of a MuxOp might look like this: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that Muxes and CMerges are special in that sense, but Merges are also similar to them. We may one day want to have this flexibility for Merges as well— is there a reason not to extend it to Merges?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, these rules are super flexible/editable. We currently do not do it intentionally, so the validation system prevents it.
The key with this system is to have a clear document with all the rules and reasons, but the PR more contributes the system and a first set of rules.
|
||
More on operation arguments: https://mlir.llvm.org/docs/DefiningDialects/Operations/#operation-arguments | ||
|
||
Each operation also has **results**, which represent the outputs of the RTL here. For instance, `ConditionalBranchOp` has two results, corresponding to the "true" and "false" branches. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could support a variable number of results too (i.e., variadic results of MLIR), right? For instance, I may want to add a Branch with a variable number of outputs not just 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, fork is an example. I'll clarify this in the document tomorrow
- `MergingExtraSignals` – Validates extra signal consistency across the data inputs and data output. | ||
- `AllDataTypesMatchWithVariadic` – Ensures uniform data types across the data inputs and data output. | ||
|
||
Additionally, the `selector` port is of type `SimpleChannel`, as it does not carry extra signals. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But, this can be problematic (see my earlier question). Which type of constraint is enforced here? And, can we choose not to enforce it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SimpleChannel
is kind of a constraint here, to enforce the channel is without any extra signals.
Yes, it's easy to enforce/not enforce these constraints
docs/ExtraSignalsForOperations.md
Outdated
The following constraints ensure proper handling of extra signals: | ||
|
||
- `MergingExtraSignals` – Validates extra signal consistency across the data inputs and data output. | ||
- `AllDataTypesMatchWithVariadic` – Ensures uniform data types across the data inputs and data output. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How different is this from AllDataTypesMatch
? Is the former (AllDataTypesMatch
) not supporting variable number of inputs/outputs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you're correct. I'll emphasize that (across the data inputs and variadic data output), thanks!
|
||
Next, we’ll take a closer look at how these rules are implemented. We’ll begin by introducing some fundamental concepts. | ||
|
||
### Operations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not clear to me what kind of constraints do you impose on Forks? They may or may not fall under 'operations within a basic block'. Regardless, I may want a two-output Fork where the input consists of an i32 data signal along with an extra tag signal, and one output should contain only the tag signal, while the other should consist of the i32 data signal.
Can this work, or will it violate some constraint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently this would violate the constraint, since we don't have a way to generate that kind of a fork.
We have this exact situation in speculation, and we are currently planning to add conversion operations under the relevant fork outputs- your suggestion could be used instead, so I might ask Lana what she thinks about it.
(I guess my answer is again: we implemented rules which correspond to speculation, but the goal of this PR is really that whoever needs to change the rules is able to easily)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks @murphe67! :)
def StoreOp : Handshake_MemPortOp<"store", [ | ||
AllExtraSignalsMatch<["address", "data"]>, | ||
// In StoreOp, addressResult and dataResult are connected to a memory controller. | ||
IsSimpleHandshake<"addressResult">, | ||
IsSimpleHandshake<"dataResult"> | ||
], []> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finally managed to process this (apologies again for the delay!). From the document, I guess you ended up with option 2)? If yes, it aligns with my old implementation, so I'm happy :D
@AyaElAkhras Does this happen when the store op is placed between the tagger and untagger but after the aligner
(as described in the paper)?
Yes.
Opps, sorry, just submitted! |
We needed additional changes in handshake operations to fully utilize the extra signals in the circuit. The changes in this PR are general and not specific to speculation.
Changes:
ConstantOp
now receivesControlType
(possibly with extra signals) and propagates it to the result.MuxOp
andCMergeOp
accept variadic inputs with "uneven" extra signals.AllDataTypesMatch
,AllExtraSignalsMatch
,AllDataTypesMatchWithinVariadic
,MergingExtraSignals
, etc.MemPortOp
(LoadOp
,StoreOp
),ControlMergeOp
,MuxOp
,ConstantOp
CmpIOp
andCmpFOp
I think some implementation might be difficult to understand (especially MergingExtraSignals constraint), though I tried to add as many comments as I can.
Please feel free to let me know if something is unclear.