Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overhaul G_EXTRACT_VECTOR_ELT legalization strategy #328

Merged
merged 11 commits into from
Feb 4, 2025

Conversation

konstantinschwarz
Copy link
Collaborator

We now legalize all G_AIE_[ZS]EXT_EXTRACT_VECTOR_ELT/G_EXTRACT_VECTOR_ELT to have 512-bit source vectors.
This allows us to delete a lot of redundant instruction selection code and makes the code much more re-usable across AIE architectures.

Best to review commit by commit, each one should be self-contained.

case AIE2::G_AIE_ZEXT_EXTRACT_VECTOR_ELT:
case AIE2::G_AIE_SEXT_EXTRACT_VECTOR_ELT: {
const LLT SrcVecTy = MRI.getType(MI.getOperand(1).getReg());
if (SrcVecTy.getSizeInBits() != 512) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have a BasicVectorBitSize in a base class?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe just a static attribute within AIE2PreLegalizerCombinerImpl

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have the magic number 512 in multiple files...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We recently introduced getExtractSubvecNativeSrcSize in AIEBaseInstrInfo.h.
I guess I could rename that function to getBasicVectorBitSize() and use it in existing places and the change here.

assert(MI.getOpcode() == TargetOpcode::G_UNMERGE_VALUES);

const int StartIdx = Regs.size();
const int NumResults = MI.getNumOperands() - 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there an MI.defs() that we could iterate over?

const int StartIdx = Regs.size();
const int NumResults = MI.getNumOperands() - 1;
Regs.resize(Regs.size() + NumResults);
for (int I = 0; I != NumResults; ++I)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not < NumResults?

Copy link
Collaborator

@martien-de-jong martien-de-jong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I have far too many comments for such an heroic effort. In general it looks good to me. I would mainly like to avoid uninitialized locals, and have const on declarations that have longer scope than just a few lines.

@@ -995,6 +995,10 @@ bool llvm::matchExtractVecEltAndExt(
const LLT S8 = LLT::scalar(8);
const LLT S16 = LLT::scalar(16);
LLT SrcVecTy = MRI.getType(MI.getOperand(1).getReg());
// Extracts from vectors <= 64-bits are lowered to bit-arithmetic in
// legalization
if (SrcVecTy.getSizeInBits() <= 64)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we extend the test to cover size=64?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the tests for 64-bit vectors to llvm/test/CodeGen/AIE/GlobalISel/prelegalizercombiner-extract-vector-elt.mir in the first commit, they now show the new behavior in this commit.

Note however that we do not support 64-bit src operands in the legalizer yet. I don't want to add even more changes to this PR, handling all missing vector sizes should come in a separate PR.


const Register NewDstReg = MRI.createGenericVirtualRegister(NewVecTy);
MIRBuilder.buildInstr(MI.getOpcode(), {NewDstReg},
{SrcReg0, NewSrcReg1, NewSrcReg2}, MI.getFlags());

const unsigned NumPadElts = (512 / DstVecSize) - 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment describing the magic number 512

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, give it a sensible name and get it from TII.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just use the already existing MaxBitSize, missed it before

@konstantinschwarz konstantinschwarz force-pushed the kschwarz.rework.extractelem.handling branch from 81a7627 to f0fa0d6 Compare February 3, 2025 22:21
Copy link
Collaborator Author

@konstantinschwarz konstantinschwarz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the reviews!
I think I addressed most/all of the comments. In particular, got rid of the hard-coded legal vector size

llvm/lib/Target/AIE/aie2p/AIE2PInstructionSelector.cpp Outdated Show resolved Hide resolved
%1:vregbank(<4 x s64>) = G_IMPLICIT_DEF
%2:vregbank(<8 x s64>) = G_CONCAT_VECTORS %0(<4 x s64>), %1(<4 x s64>)
PseudoRET implicit $lr, implicit %2
...
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we have very little G_CONCAT_VECTOR legalization test coverage. There's also a nice opportunity to move instruction selection code to the legalizer and make it common across architectures.
I'll add the missing legalizer test to this commit, but the re-work of G_CONCAT_VECTOR legalization strategy should come in a separate PR

@@ -995,6 +995,10 @@ bool llvm::matchExtractVecEltAndExt(
const LLT S8 = LLT::scalar(8);
const LLT S16 = LLT::scalar(16);
LLT SrcVecTy = MRI.getType(MI.getOperand(1).getReg());
// Extracts from vectors <= 64-bits are lowered to bit-arithmetic in
// legalization
if (SrcVecTy.getSizeInBits() <= 64)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the tests for 64-bit vectors to llvm/test/CodeGen/AIE/GlobalISel/prelegalizercombiner-extract-vector-elt.mir in the first commit, they now show the new behavior in this commit.

Note however that we do not support 64-bit src operands in the legalizer yet. I don't want to add even more changes to this PR, handling all missing vector sizes should come in a separate PR.

auto Extr = B.buildInstr(ExtractOpc, {LLT::scalar(32)}, {SrcVecReg, Cst});
const LLT DstElemTy = MRI.getType(SrcVecReg).getElementType();
auto Extr =
B.buildExtractVectorElementConstant(DstElemTy, SrcVecReg, *UniqOpIdx);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

llvm/lib/Target/AIE/AIELegalizerHelper.cpp Show resolved Hide resolved
llvm/lib/Target/AIE/AIELegalizerHelper.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/AIE/AIE2LegalizerInfo.cpp Show resolved Hide resolved
if (MRI.getType(MI.getOperand(1).getReg()).getSizeInBits() != 512) {
bool IsLegalized =
MI.getParent()->getParent()->getProperties().hasProperty(
MachineFunctionProperties::Property::Legalized);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, added

martien-de-jong
martien-de-jong previously approved these changes Feb 4, 2025
Copy link
Collaborator

@martien-de-jong martien-de-jong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing blocking

…64-bit

Keeping these as generic G_EXTRACT_VECTOR_ELTs allows us to use the existing bitcast legalization,
lowering such extracts into bit-arithmetic.
…eToExtractBroadcast

This is in preparation of only building legal sized (512-bit) G_AIE_[SZ]EXT_EXTRACT_VECTOR_ELT.
We start to duplicate similar code across different legalization functions.
G_AIE_[ZS]EXT_EXTRACT_VECTOR_ELT are typically introduced in the pre-legalizer combiner.
Since target specific opcodes are considered legal by the Legalizer pass, we need to
pre-legalize these instructions in the combiner.
We already custom legalized every G_EXTRACT_VECTOR_ELT into G_AIE_SEXT_VECTOR_ELT.
This change ensures we are only producing legal G_AIE_SEXT_VECTOR_ELT by re-using
the previously introduced legalization action.
This adds MachineVerifier support for G_AIE_[SZ]EXT_EXTRACT_VECTOR_ELT to enforce only 512-bit source vectors are used after legalization.
…ableGen patterns

For AIE2, this is NFC. For AIE2P, we now benefit from patterns supporting constant indices.
@konstantinschwarz konstantinschwarz force-pushed the kschwarz.rework.extractelem.handling branch from 36a756e to 67e97fc Compare February 4, 2025 16:01
@konstantinschwarz konstantinschwarz enabled auto-merge (rebase) February 4, 2025 16:33
@konstantinschwarz konstantinschwarz merged commit cf9f33b into aie-public Feb 4, 2025
8 checks passed
@konstantinschwarz konstantinschwarz deleted the kschwarz.rework.extractelem.handling branch February 4, 2025 17:16
Copy link
Collaborator

@niwinanto niwinanto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice work. Just one comment, not really part of this, but found while reviewing this.

@@ -1863,7 +1863,8 @@ static std::optional<int> getUniqueIndex(ArrayRef<int> Mask) {
/// %1:_(<4 x s64>) = COPY $wl1
/// %2:_(<8 x s64>) = G_SHUFFLE_VECTOR %X(<4 x s64>), %1(<4 x s64>),
/// shufflemask(3, 3, 3, 3, 3, 3, 3, 3)
/// To : %2:_(<8 x s64>) = G_AIE_BROADCAST_VECTOR %X(<4 x s64>)
/// To : %3:_(s64) = G_EXTRACT_VECTOR_ELT %X, 3
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@konstantinschwarz It seems we are not checking the extract element belongs to first source element. Canonicalization combine from the upstream might solve this, but till then this combine produces wrong results. For example, if shuffle mask in the doc string is (4, 4, ...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants