[AIE2P] Enable S20 Narrowing for FIFO Loads and Stores #333

Open
wants to merge 1 commit into base: aie-public

abhinay-anubola
Collaborator

Added FIFO load and store intrinsics as S20 consumers in S20Narrowing.
Added and updated the corresponding tests.

abhinay-anubola force-pushed the sanubola.s20narrowing.fifo.ld.st branch from 1e3c4ac to 54f625d on February 4, 2025 15:48
@@ -9,7 +9,7 @@
//===----------------------------------------------------------------------===//
include "llvm/Target/GlobalISel/Combine.td"

-// AIE-specific offset folding for G_GLOBAL_VALUE.
+// AIE-specific offset folding for G_GLOBALllvm/lib/Target/AIE/AIECombine.td_VALUE.

Collaborator

editor glitch?

case Intrinsic::aie2p_fifo_ld_pop_576_1d_bfp16:
case Intrinsic::aie2p_fifo_ld_pop_576_2d_bfp16:
case Intrinsic::aie2p_fifo_ld_pop_576_3d_bfp16:
case Intrinsic::aie2p_fifo_st_flush_2d:

Collaborator

We now have more intrinsics available; maybe double-check that they are all present.

Collaborator

I'm wondering whether all scalar operands of these instructions are accurately qualified as 'S20'. It at least seems likely that they don't consume more than 20 bits.

Collaborator

See:

%4 = tail call { ptr, <32 x i32>, i32, <64 x i8>, <8 x i8> } @llvm.aie2p.fifo.ld.pop.576.1d.bfp16(ptr %0, <32 x i32> %1, i32 %2, i20 %3)

Collaborator

> I'm wondering whether all scalar operands of these instructions are accurately qualified as 'S20'

The availability register is indeed an s32, and we should leave it as is.
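
To illustrate why a genuinely 32-bit operand should not be narrowed, here is a minimal standalone sketch. It relies only on the fact that truncating to a 20-bit scalar keeps the low 20 bits. The helper names `truncToS20` and `fitsInS20` are hypothetical and not part of the pass, and values are treated as unsigned for simplicity.

```cpp
#include <cstdint>

// Hypothetical helper (not from the PR): model a truncation to a 20-bit
// scalar by keeping only the low 20 bits of a 32-bit value.
constexpr uint32_t truncToS20(uint32_t V) { return V & 0xFFFFFu; }

// Hypothetical helper: true if no information is lost by the truncation,
// under the simplifying assumption that the value is unsigned.
constexpr bool fitsInS20(uint32_t V) { return truncToS20(V) == V; }
```

Any value that uses bits 20 and above, as a full s32 register may, is changed by the truncation, so such an operand has to keep its s32 type.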

default:
case TargetOpcode::G_INTRINSIC_W_SIDE_EFFECTS: {
const unsigned IntrinsicID = cast<GIntrinsic>(Use).getIntrinsicID();
if (!isNativeS20ConsumerIntrinsic(IntrinsicID)) {

Collaborator

Nice you factored that out!
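
For readers without the diff at hand, the shape of such a factored-out predicate could look roughly like the sketch below. Only the function name `isNativeS20ConsumerIntrinsic` and the four intrinsic names appear in this thread. The `IntrinsicID` enum is a hypothetical stand-in for the real LLVM intrinsic IDs, `some_other_intrinsic` is invented for illustration, and the real list of cases is longer.

```cpp
// Hypothetical stand-in for the LLVM intrinsic ID space; only the first four
// names are taken from the review thread.
enum class IntrinsicID {
  fifo_ld_pop_576_1d_bfp16,
  fifo_ld_pop_576_2d_bfp16,
  fifo_ld_pop_576_3d_bfp16,
  fifo_st_flush_2d,
  some_other_intrinsic, // invented: an intrinsic that does not consume s20
};

// Sketch of a predicate that keeps the intrinsic list in one place, so that
// every caller agrees on which intrinsics accept s20 operands natively.
constexpr bool isNativeS20ConsumerIntrinsic(IntrinsicID ID) {
  switch (ID) {
  case IntrinsicID::fifo_ld_pop_576_1d_bfp16:
  case IntrinsicID::fifo_ld_pop_576_2d_bfp16:
  case IntrinsicID::fifo_ld_pop_576_3d_bfp16:
  case IntrinsicID::fifo_st_flush_2d:
    return true;
  default:
    return false;
  }
}
```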

tracksRegLiveness: true
body: |
  bb.1:
    liveins: $p0, $p1, $p2

Collaborator

nit: $p2 is not used.

%8:_(s32) = G_PHI %2(s32), %bb.0, %9(s32), %bb.1
%10:_(s20) = G_TRUNC %6(s32)
%11:_(s20) = G_TRUNC %8(s32)
%12:_(p0), %13:_(<32 x s32>), %14:_(s32), %15:_(s20), %16:_(s20), %17:_(<64 x s8>), %18:_(<8 x s8>) = G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.aie2p.fifo.ld.pop.544.3d.bfp16), %0(p0), %5(<32 x s32>), %2(s32), %1(s20), %1(s20), %11(s20), %1(s20), %1(s20), %10(s20), %1(s20), %3(<64 x s8>), %4(<8 x s8>)

Collaborator

Can we have a test where %0(p0), %5(<32 x s32>), and %2(s32) are also taken from phi nodes?

@@ -671,16 +694,14 @@ bool canNarrowUserTreeToS20(MachineRegisterInfo &MRI, InstrNode Start,
case TargetOpcode::G_STORE: // Data operand is later modified to S20 type

Collaborator

This switch seems to be an extension to isNativeS20Consumer. I guess the difference is given by 'can be adapted to'. Could we delegate the trivial cases to that predicate function?
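
A rough sketch of the suggested delegation, under stated assumptions: the `Opcode` enum and both function bodies are hypothetical placeholders rather than the pass's actual code. The only detail taken from the diff is that a store's data operand is adapted to S20 later.

```cpp
// Hypothetical opcode set, invented for illustration only.
enum class Opcode {
  Store,          // adaptable: its data operand is rewritten to s20 later
  NativeConsumer, // placeholder for any opcode that accepts s20 as-is
  Other,          // placeholder for everything else
};

// Hypothetical version of the existing predicate for the trivial cases.
constexpr bool isNativeS20Consumer(Opcode Op) {
  return Op == Opcode::NativeConsumer;
}

// The switch keeps only the "can be adapted to" cases and delegates the rest.
constexpr bool canUseBeNarrowedToS20(Opcode Op) {
  switch (Op) {
  case Opcode::Store:
    return true;
  default:
    return isNativeS20Consumer(Op);
  }
}
```

With this split, a newly supported native consumer only needs to be added to the predicate, and the switch stays limited to the uses that need rewriting.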
