[AIE2P] Enable S20 Narrowing for FIFO Loads and Stores #333
base: aie-public
1e3c4ac to 54f625d
@@ -9,7 +9,7 @@
 //===----------------------------------------------------------------------===//
 include "llvm/Target/GlobalISel/Combine.td"

-// AIE-specific offset folding for G_GLOBAL_VALUE.
+// AIE-specific offset folding for G_GLOBALllvm/lib/Target/AIE/AIECombine.td_VALUE.
editor glitch?
case Intrinsic::aie2p_fifo_ld_pop_576_1d_bfp16:
case Intrinsic::aie2p_fifo_ld_pop_576_2d_bfp16:
case Intrinsic::aie2p_fifo_ld_pop_576_3d_bfp16:
case Intrinsic::aie2p_fifo_st_flush_2d:
We now have more intrinsics available; maybe double-check that they are all present.
I'm wondering whether all scalar operands of these instructions are accurately qualified as 'S20'. It at least seems likely that they don't consume more than 20 bits.
See:
%4 = tail call { ptr, <32 x i32>, i32, <64 x i8>, <8 x i8> } @llvm.aie2p.fifo.ld.pop.576.1d.bfp16(ptr %0, <32 x i32> %1, i32 %2, i20 %3)
I'm wondering whether all scalar operands of these instructions are accurately qualified as 'S20
The availability register is indeed an s32, and we should leave it as is.
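To illustrate why the operand width question above matters, here is a minimal sketch of what dropping a 32-bit value to 20 bits does. It is not taken from the PR: the helper names `fitsInS20` and `truncToS20` are hypothetical, and the sketch assumes the 20-bit value is read back as a signed two's-complement number. A value that needs more than 20 bits, as a genuine s32 operand may, does not survive the round trip.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if V is representable in 20 signed bits.
constexpr bool fitsInS20(int32_t V) {
  return V >= -(1 << 19) && V < (1 << 19);
}

// Hypothetical helper: keep the low 20 bits of V (as a truncation would) and
// sign-extend them back to 32 bits so the result can be compared with V.
constexpr int32_t truncToS20(int32_t V) {
  const uint32_t Low = static_cast<uint32_t>(V) & 0xFFFFFu;
  // XOR/subtract with the sign bit (bit 19) performs the sign extension.
  return static_cast<int32_t>(Low ^ 0x80000u) - 0x80000;
}

// Narrowing is lossless exactly when the value fits in 20 bits.
constexpr bool narrowingIsLossless(int32_t V) {
  return truncToS20(V) == V;
}

static_assert(narrowingIsLossless(-1), "small negative values survive");
static_assert(!narrowingIsLossless(1 << 19), "2^19 needs 21 signed bits");
```

Under these assumptions, `narrowingIsLossless` agrees with `fitsInS20` for every input, which is why an operand that is really 32 bits wide should not be treated as an S20 consumer.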
default:
case TargetOpcode::G_INTRINSIC_W_SIDE_EFFECTS: {
const unsigned IntrinsicID = cast<GIntrinsic>(Use).getIntrinsicID();
if (!isNativeS20ConsumerIntrinsic(IntrinsicID)) {
Nice you factored that out!
tracksRegLiveness: true
body: |
  bb.1:
    liveins: $p0, $p1, $p2
nit: $p2 is not used.
%8:_(s32) = G_PHI %2(s32), %bb.0, %9(s32), %bb.1
%10:_(s20) = G_TRUNC %6(s32)
%11:_(s20) = G_TRUNC %8(s32)
%12:_(p0), %13:_(<32 x s32>), %14:_(s32), %15:_(s20), %16:_(s20), %17:_(<64 x s8>), %18:_(<8 x s8>) = G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.aie2p.fifo.ld.pop.544.3d.bfp16), %0(p0), %5(<32 x s32>), %2(s32), %1(s20), %1(s20), %11(s20), %1(s20), %1(s20), %10(s20), %1(s20), %3(<64 x s8>), %4(<8 x s8>)
Can we have a test where %0(p0), %5(<32 x s32>), %2(s32) are also taken from phi nodes?
@@ -671,16 +694,14 @@ bool canNarrowUserTreeToS20(MachineRegisterInfo &MRI, InstrNode Start,
case TargetOpcode::G_STORE: // Data operand is later modified to S20 type
This switch seems to be an extension to isNativeS20Consumer. I guess the difference is given by 'can be adapted to'. Could we delegate the trivial cases to that predicate function?
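The delegation suggested above could look roughly like the following sketch. It is a standalone illustration, not the code in the PR: the `Opcode` enum, its members, and `canBeAdaptedToS20` are hypothetical stand-ins for the real `TargetOpcode` values and intrinsic IDs, and which opcodes count as native consumers is an assumption made only for the example. Only the name `isNativeS20Consumer` and the remark that `G_STORE` has its data operand rewritten later come from the review.

```cpp
#include <cassert>

// Hypothetical stand-ins for the opcodes handled by the narrowing pass.
enum class Opcode { PtrAdd, FifoLoadPop, Store, Mul };

// Opcodes assumed (for this sketch) to consume an s20 operand directly.
bool isNativeS20Consumer(Opcode Op) {
  switch (Op) {
  case Opcode::PtrAdd:
  case Opcode::FifoLoadPop:
    return true;
  default:
    return false;
  }
}

// Hypothetical "can be adapted to S20" check: the trivial cases are delegated
// to the native predicate, so only the extra, adaptable opcodes are listed.
bool canBeAdaptedToS20(Opcode Op) {
  if (isNativeS20Consumer(Op))
    return true;
  switch (Op) {
  case Opcode::Store: // data operand is later modified to the S20 type
    return true;
  default:
    return false;
  }
}
```

With this shape, adding a new native consumer only touches `isNativeS20Consumer`, and the second switch documents exactly the opcodes that need rewriting.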
Added FIFO load and store intrinsics as S20 consumers in S20Narrowing
Added and updated the tests for the same