Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIEX] Combine G_SHUFFLE_VECTOR to UNMERGE #339

Merged
merged 1 commit into from
Feb 10, 2025

Conversation

katerynamuts
Copy link
Collaborator

If we cannot combine G_SHUFFLE_VECTOR to G_AIE_EXTRACT_SUBVECTOR, then we try to combine it into UNMERGE


const unsigned NumDstElems = DstTy.getNumElements();
const unsigned NumSrc1Elems = Src1Ty.getNumElements();
const unsigned NumSubVectors = NumSrc1Elems / NumDstElems;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: You could move this assignment after the 2 following early returns

if (!DstTy.isVector() || !Src1Ty.isVector())
return false;

// Boolean vectors are unlikely to select into subregister copy
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this. Removed.

return std::nullopt;
};

std::optional<unsigned> SubIdx = GetSubIdx();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: SubVecExtractIdx and GetSubVecExtractIdx

std::optional<unsigned> SubIdx = GetSubIdx();

// Not an extract pattern
if (!SubIdx)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a sub vector extract pattern, I think it could be an extract pattern but one that can't be selected to a sub reg copy.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see now that this doesn't only match to sub-register copies but also sub vector elements through vextract instructions, but still it could be a different extract pattern that doesn't map to either sub-reg copy or element extract.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It simply means that it is not an extract pattern for this specific match function

return false;

// Currently, we cannot concat vectors of the size less than vector register
// size.
// Currently, we cannot concat vectors for the case when the size of the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: we cannot extract vectors...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and "..less than the basic vector register (of the target)"

; CHECK: liveins: $x0, $x1
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(<16 x s32>) = COPY $x0
; CHECK-NEXT: [[AIE_UNPAD_VECTOR:%[0-9]+]]:_(<8 x s32>) = G_AIE_UNPAD_VECTOR [[COPY]](<16 x s32>)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Maybe add a comment saying that this is a special case where g_unmerge is itself combined to g_aie_unpad_vector

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also maybe add a test with an unmerge of 4 outputs and a bigger size so that you don't run into the G_AIE_UNPAD_VECTOR case

; CHECK-NEXT: or r1, r1, r3
; CHECK-NEXT: or r0, r0, r2; vmsc dm0, dm1, x4, y3,r1
; CHECK-NEXT: or r0, r0, r3
; CHECK-NEXT: ret lr; vaddmsc dm0, dm0, dm2, x0, y1,r0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice code size reduction!

; CHECK-NEXT: PseudoRET implicit $lr, implicit [[AIE_UNPAD_VECTOR]](<8 x s32>)
%1:_(<16 x s32>) = COPY $x0
%2:_(<16 x s32>) = COPY $x1
%0:_(<8 x s32>) = G_SHUFFLE_VECTOR %1(<16 x s32>), %2(<16 x s32>), shufflemask(0, 1, 2, 3, 4, 5, 6, 7)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we test an output of 256 bits. More generally, it would be nice to also test one of 128, 512 and 1024 bits.


// This should be handled by a separate combine that copies Src1Reg to
// DstReg.
if (Src1TySize == DstTy.getSizeInBits())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It would be nice to have a test for this case even if we don't combine.

Copy link
Collaborator

@khallouh khallouh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Only a few small suggestions and more tests would be nice.

/// %2:_(<16 x s32>) = COPY $x1
/// %0:_(<8 x s32>) = G_SHUFFLE_VECTOR %1(<16 x s32>), %2(<16 x s32>),
/// shufflemask(8, 9, 10, 11, 12, 13, 14, 15)
/// PseudoRET implicit $lr, implicit %0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to operate on the result, we probably want to keep it in a 512 bit register, and then this can be implemented with a shift.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I think it should be done not in the combine, the combine should result in the same type as the original G_SHUFFLE_VECTOR

@katerynamuts katerynamuts force-pushed the kmuts.shuffle-to-unmerge branch 2 times, most recently from dbb4889 to 9e40fc1 Compare February 10, 2025 08:33
@katerynamuts katerynamuts force-pushed the kmuts.shuffle-to-unmerge branch from 9e40fc1 to a4ff498 Compare February 10, 2025 08:36
Copy link
Collaborator

@khallouh khallouh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@katerynamuts katerynamuts merged commit a6c19a6 into aie-public Feb 10, 2025
8 checks passed
@katerynamuts katerynamuts deleted the kmuts.shuffle-to-unmerge branch February 10, 2025 12:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants