-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Overhaul G_EXTRACT_VECTOR_ELT legalization strategy #328
Overhaul G_EXTRACT_VECTOR_ELT legalization strategy #328
Conversation
cb955c7
to
81a7627
Compare
case AIE2::G_AIE_ZEXT_EXTRACT_VECTOR_ELT: | ||
case AIE2::G_AIE_SEXT_EXTRACT_VECTOR_ELT: { | ||
const LLT SrcVecTy = MRI.getType(MI.getOperand(1).getReg()); | ||
if (SrcVecTy.getSizeInBits() != 512) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to have a BasicVectorBitSize in a base class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe just a static attribute within AIE2PreLegalizerCombinerImpl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have the magic number 512 in multiple files...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We recently introduced getExtractSubvecNativeSrcSize
in AIEBaseInstrInfo.h
.
I guess I could rename that function to getBasicVectorBitSize()
and use it in existing places and the change here.
assert(MI.getOpcode() == TargetOpcode::G_UNMERGE_VALUES); | ||
|
||
const int StartIdx = Regs.size(); | ||
const int NumResults = MI.getNumOperands() - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't there an MI.defs() that we could iterate over?
const int StartIdx = Regs.size(); | ||
const int NumResults = MI.getNumOperands() - 1; | ||
Regs.resize(Regs.size() + NumResults); | ||
for (int I = 0; I != NumResults; ++I) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not < NumResults?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I have far too many comments for such an heroic effort. In general it looks good to me. I would mainly like to avoid uninitialized locals, and have const on declarations that have longer scope than just a few lines.
llvm/test/CodeGen/AIE/aie2p/GlobalIsel/prelegalizercombiner-extract-vector-elt.mir
Outdated
Show resolved
Hide resolved
llvm/test/CodeGen/AIE/aie2p/GlobalIsel/inst-select-concat-vectors.mir
Outdated
Show resolved
Hide resolved
@@ -995,6 +995,10 @@ bool llvm::matchExtractVecEltAndExt( | |||
const LLT S8 = LLT::scalar(8); | |||
const LLT S16 = LLT::scalar(16); | |||
LLT SrcVecTy = MRI.getType(MI.getOperand(1).getReg()); | |||
// Extracts from vectors <= 64-bits are lowered to bit-arithmetic in | |||
// legalization | |||
if (SrcVecTy.getSizeInBits() <= 64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we extend the test to cover size=64?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the tests for 64-bit vectors to llvm/test/CodeGen/AIE/GlobalISel/prelegalizercombiner-extract-vector-elt.mir
in the first commit, they now show the new behavior in this commit.
Note however that we do not support 64-bit src operands in the legalizer yet. I don't want to add even more changes to this PR, handling all missing vector sizes should come in a separate PR.
|
||
const Register NewDstReg = MRI.createGenericVirtualRegister(NewVecTy); | ||
MIRBuilder.buildInstr(MI.getOpcode(), {NewDstReg}, | ||
{SrcReg0, NewSrcReg1, NewSrcReg2}, MI.getFlags()); | ||
|
||
const unsigned NumPadElts = (512 / DstVecSize) - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment describing the magic number 512
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah, give it a sensible name and get it from TII.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll just use the already existing MaxBitSize
, missed it before
81a7627
to
f0fa0d6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the reviews!
I think I addressed most/all of the comments. In particular, got rid of the hard-coded legal vector size
llvm/test/CodeGen/AIE/aie2p/GlobalIsel/inst-select-concat-vectors.mir
Outdated
Show resolved
Hide resolved
%1:vregbank(<4 x s64>) = G_IMPLICIT_DEF | ||
%2:vregbank(<8 x s64>) = G_CONCAT_VECTORS %0(<4 x s64>), %1(<4 x s64>) | ||
PseudoRET implicit $lr, implicit %2 | ||
... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we have very little G_CONCAT_VECTOR legalization test coverage. There's also a nice opportunity to move instruction selection code to the legalizer and make it common across architectures.
I'll add the missing legalizer test to this commit, but the re-work of G_CONCAT_VECTOR legalization strategy should come in a separate PR
@@ -995,6 +995,10 @@ bool llvm::matchExtractVecEltAndExt( | |||
const LLT S8 = LLT::scalar(8); | |||
const LLT S16 = LLT::scalar(16); | |||
LLT SrcVecTy = MRI.getType(MI.getOperand(1).getReg()); | |||
// Extracts from vectors <= 64-bits are lowered to bit-arithmetic in | |||
// legalization | |||
if (SrcVecTy.getSizeInBits() <= 64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the tests for 64-bit vectors to llvm/test/CodeGen/AIE/GlobalISel/prelegalizercombiner-extract-vector-elt.mir
in the first commit, they now show the new behavior in this commit.
Note however that we do not support 64-bit src operands in the legalizer yet. I don't want to add even more changes to this PR, handling all missing vector sizes should come in a separate PR.
auto Extr = B.buildInstr(ExtractOpc, {LLT::scalar(32)}, {SrcVecReg, Cst}); | ||
const LLT DstElemTy = MRI.getType(SrcVecReg).getElementType(); | ||
auto Extr = | ||
B.buildExtractVectorElementConstant(DstElemTy, SrcVecReg, *UniqOpIdx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
if (MRI.getType(MI.getOperand(1).getReg()).getSizeInBits() != 512) { | ||
bool IsLegalized = | ||
MI.getParent()->getParent()->getProperties().hasProperty( | ||
MachineFunctionProperties::Property::Legalized); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing blocking
f0fa0d6
to
36a756e
Compare
…64-bit Keeping these as generic G_EXTRACT_VECTOR_ELTs allows us to use the existing bitcast legalization, lowering such extracts into bit-arithmetic.
…eToExtractBroadcast This is in preparation of only building legal sized (512-bit) G_AIE_[SZ]EXT_EXTRACT_VECTOR_ELT.
We start to duplicate similar code across different legalization functions.
G_AIE_[ZS]EXT_EXTRACT_VECTOR_ELT are typically introduced in the pre-legalizer combiner. Since target specific opcodes are considered legal by the Legalizer pass, we need to pre-legalize these instructions in the combiner.
We already custom legalized every G_EXTRACT_VECTOR_ELT into G_AIE_SEXT_VECTOR_ELT. This change ensures we are only producing legal G_AIE_SEXT_VECTOR_ELT by re-using the previously introduced legalization action.
… in G_FADD/FSUB legalization
This adds MachineVerifier support for G_AIE_[SZ]EXT_EXTRACT_VECTOR_ELT to enforce only 512-bit source vectors are used after legalization.
…ableGen patterns For AIE2, this is NFC. For AIE2P, we now benefit from patterns supporting constant indices.
36a756e
to
67e97fc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice work. Just one comment, not really part of this, but found while reviewing this.
@@ -1863,7 +1863,8 @@ static std::optional<int> getUniqueIndex(ArrayRef<int> Mask) { | |||
/// %1:_(<4 x s64>) = COPY $wl1 | |||
/// %2:_(<8 x s64>) = G_SHUFFLE_VECTOR %X(<4 x s64>), %1(<4 x s64>), | |||
/// shufflemask(3, 3, 3, 3, 3, 3, 3, 3) | |||
/// To : %2:_(<8 x s64>) = G_AIE_BROADCAST_VECTOR %X(<4 x s64>) | |||
/// To : %3:_(s64) = G_EXTRACT_VECTOR_ELT %X, 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@konstantinschwarz It seems we are not checking the extract element belongs to first source element. Canonicalization combine from the upstream might solve this, but till then this combine produces wrong results. For example, if shuffle mask in the doc string is (4, 4, ...)
We now legalize all
G_AIE_[ZS]EXT_EXTRACT_VECTOR_ELT/G_EXTRACT_VECTOR_ELT
to have 512-bit source vectors.This allows us to delete a lot of redundant instruction selection code and makes the code much more re-usable across AIE architectures.
Best to review commit by commit, each one should be self-contained.