-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AIEX] Combine G_SHUFFLE_VECTOR to UNMERGE #339
Conversation
7425bfd
to
f401e9f
Compare
|
||
const unsigned NumDstElems = DstTy.getNumElements(); | ||
const unsigned NumSrc1Elems = Src1Ty.getNumElements(); | ||
const unsigned NumSubVectors = NumSrc1Elems / NumDstElems; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: You could move this assignment after the 2 following early returns
if (!DstTy.isVector() || !Src1Ty.isVector()) | ||
return false; | ||
|
||
// Boolean vectors are unlikely to select into subregister copy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need this. Removed.
return std::nullopt; | ||
}; | ||
|
||
std::optional<unsigned> SubIdx = GetSubIdx(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: SubVecExtractIdx and GetSubVecExtractIdx
std::optional<unsigned> SubIdx = GetSubIdx(); | ||
|
||
// Not an extract pattern | ||
if (!SubIdx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a sub vector extract pattern, I think it could be an extract pattern but one that can't be selected to a sub reg copy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I see now that this doesn't only match to sub-register copies but also sub vector elements through vextract instructions, but still it could be a different extract pattern that doesn't map to either sub-reg copy or element extract.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It simply means that it is not an extract pattern for this specific match function
return false; | ||
|
||
// Currently, we cannot concat vectors of the size less than vector register | ||
// size. | ||
// Currently, we cannot concat vectors for the case when the size of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: we cannot extract vectors...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and "..less than the basic vector register (of the target)"
; CHECK: liveins: $x0, $x1 | ||
; CHECK-NEXT: {{ $}} | ||
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(<16 x s32>) = COPY $x0 | ||
; CHECK-NEXT: [[AIE_UNPAD_VECTOR:%[0-9]+]]:_(<8 x s32>) = G_AIE_UNPAD_VECTOR [[COPY]](<16 x s32>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Maybe add a comment saying that this is a special case where g_unmerge is itself combined to g_aie_unpad_vector
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also maybe add a test with an unmerge of 4 outputs and a bigger size so that you don't run into the G_AIE_UNPAD_VECTOR
case
; CHECK-NEXT: or r1, r1, r3 | ||
; CHECK-NEXT: or r0, r0, r2; vmsc dm0, dm1, x4, y3,r1 | ||
; CHECK-NEXT: or r0, r0, r3 | ||
; CHECK-NEXT: ret lr; vaddmsc dm0, dm0, dm2, x0, y1,r0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice code size reduction!
; CHECK-NEXT: PseudoRET implicit $lr, implicit [[AIE_UNPAD_VECTOR]](<8 x s32>) | ||
%1:_(<16 x s32>) = COPY $x0 | ||
%2:_(<16 x s32>) = COPY $x1 | ||
%0:_(<8 x s32>) = G_SHUFFLE_VECTOR %1(<16 x s32>), %2(<16 x s32>), shufflemask(0, 1, 2, 3, 4, 5, 6, 7) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we test an output of 256 bits. More generally, it would be nice to also test one of 128, 512 and 1024 bits.
|
||
// This should be handled by a separate combine that copies Src1Reg to | ||
// DstReg. | ||
if (Src1TySize == DstTy.getSizeInBits()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: It would be nice to have a test for this case even if we don't combine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! Only a few small suggestions and more tests would be nice.
/// %2:_(<16 x s32>) = COPY $x1 | ||
/// %0:_(<8 x s32>) = G_SHUFFLE_VECTOR %1(<16 x s32>), %2(<16 x s32>), | ||
/// shufflemask(8, 9, 10, 11, 12, 13, 14, 15) | ||
/// PseudoRET implicit $lr, implicit %0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to operate on the result, we probably want to keep it in a 512 bit register, and then this can be implemented with a shift.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but I think it should be done not in the combine, the combine should result in the same type as the original G_SHUFFLE_VECTOR
dbb4889
to
9e40fc1
Compare
9e40fc1
to
a4ff498
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
If we cannot combine
G_SHUFFLE_VECTOR
toG_AIE_EXTRACT_SUBVECTOR
, then we try to combine it intoUNMERGE