Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Reorganize AIELegalizerInfo #159

Merged
merged 4 commits into from
Aug 21, 2024

Conversation

konstantinschwarz
Copy link
Collaborator

This factors out the custom legalization actions to be re-usable by AIE specific LegalizerInfo objects.
Moreover, this splits this AIELegalizerInfo object into AIE1LegalizerInfo and AIE2LegalizerInfo objects to get rid of numerous ST.isAIE2() checks, which make the code hard to read and difficult to port to a new architecture version.

There will be some duplication in the declarative legalization action specification for each architecture.

virtual unsigned getGenericPadVectorOpcode() const { return -1; }
/// Return the opcode to be used for extracting a smaller vector by ignoring
/// the high bits
virtual unsigned getGenericUnpadVectorOpcode() const { return -1; }
Copy link
Collaborator

@gbossu gbossu Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I feel we are really overloading InstrInfo with all those getXXXXopcode() function. At least for generic instrucitons, I'm wondering if we should not create e.g.

class AIEGenericInstructionsInterface {
  virtual unsigned getInsertVectorEltOpcode() const {...}
  virtual unsigned getExtractVectorEltOpcode() const {...}
  ...
}

I guess sometimes we will only really need to pass AIEGenericInstructionsInterface, and not the whole AIEBaseInstrInfo. This will also make unit testing easier, because we could just define a test interface for what is required to be tested.

What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really see an issue with adding more of these getter functions to InstrInfo, its a single place where we collect all architecture specific opcode information.

Now for AIE specific generic opcodes we could also consider them to be always available on every AIE architecture, and requiring them to be defined as the first opcodes after GENERIC_OP_END.
Then it doesn't matter which AIE namespace we use to access those opcodes (AIE:: or AIE2:: at this point).
We would have to be extremely careful not to introduce an opcode in between GENERIC_OP_END and the shared opcodes only for a single AIE version. Not sure that's any better

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for making it easier, I know that I made that mistake a couple of times and it hard to catch. An alternative design would be to define getOpcode that takes an opcode in the TargetOpcode space and turns it into AIE1, AIE2 or whatever.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that these "target specific generic machine instructions" do not exist in the TargetOpcode namespace, so we cannot translate them into the target namespace.

TargetOpcode:: only contains the instructions up to GENERIC_OP_END, which also marks the beginning of the target specific (generic) machine instructions

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but the idea could have a generic namespace that is an amalgamation of the different architectures. That has different drawbacks, like keeping it up to date and it becoming very large, but it is an alternative. Both are good to me and I happy with his as well

@andcarminati
Copy link
Collaborator

I think Reorganization can be used instead of Refactor because we are doing something like the opposite here. Also, I like the idea of splitting the code because it makes our life easier when we write aie2 legalization rules (we don't need to think about the disjoint behavior of the two versions).

niwinanto
niwinanto previously approved these changes Aug 20, 2024
Copy link
Collaborator

@niwinanto niwinanto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@konstantinschwarz konstantinschwarz changed the title RFC: Refactor AIELegalizerInfo RFC: Reorganize AIELegalizerInfo Aug 21, 2024
Copy link
Collaborator

@ValentijnvdBeek ValentijnvdBeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Konstantin,

Thanks for doing this work, it was something that was on my todo list that I didn't get time to do. Since I did spent a decent chunk of time spelunking in the legalizer, I thought it might be useful for me to chime in.

It is a good reorganization, I am happy to see the code being split up and the if statements being removed. There is some ground to be gained in terms of duplication, since this move will cause a decent amount of copying back and forth. As shown in #160, the legalizer isn't necessarily feature complete for each opcode yet, which means that both legalizers would need to be updated to deal with changes like this. So I am unsure if the DRY-complexity tradeoff here is optimal.

As to fix it, there are two ideas that I have toyed with and also put here as review comments:

  1. Move the type definitions into a separate header that either imports the right subheader or does a few if guards. Basic types will likely be shared and this has the extra benefit of just being able to import aieX_types.h through the GISel stack. Now these definitions are given ad-hoc when they are needed.

  2. Reuse the same trick you use for the opcodes and use that for the legalizer statements themselves. This is the template pattern, where each opcode is a function call that the subclass can override when it is needed. This gets rid of both the isAIE calls, but allows you define "default" rules.

But, besides that, great work!

TL;DR This is a good refactor that solves a decent of the problems, but I think that a bit more can be won by separating the LLT variables and applying the template pattern

llvm/lib/Target/AIE/AIELegalizerHelper.h Show resolved Hide resolved
llvm/lib/Target/AIE/AIELegalizerInfo.cpp Outdated Show resolved Hide resolved
virtual unsigned getGenericPadVectorOpcode() const { return -1; }
/// Return the opcode to be used for extracting a smaller vector by ignoring
/// the high bits
virtual unsigned getGenericUnpadVectorOpcode() const { return -1; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for making it easier, I know that I made that mistake a couple of times and it hard to catch. An alternative design would be to define getOpcode that takes an opcode in the TargetOpcode space and turns it into AIE1, AIE2 or whatever.

Comment on lines +31 to +67
const LLT S8 = LLT::scalar(8);
const LLT S16 = LLT::scalar(16);
const LLT S20 = LLT::scalar(20);
const LLT S32 = LLT::scalar(32);
const LLT S64 = LLT::scalar(64);
const LLT P0 = LLT::pointer(0, 20);

// 128-bit vectors
const LLT V16S8 = LLT::fixed_vector(16, 8);
const LLT V8S16 = LLT::fixed_vector(8, 16);
const LLT V4S32 = LLT::fixed_vector(4, 32);

// 256-bit vectors
const LLT V8S32 = LLT::fixed_vector(8, 32);
const LLT V16S16 = LLT::fixed_vector(16, 16);
const LLT V32S8 = LLT::fixed_vector(32, 8);

// 256-bit accumulators
const LLT ACC256 = LLT::fixed_vector(4, 64);

// 512-bit vectors
const LLT V16S32 = LLT::fixed_vector(16, 32);
const LLT V32S16 = LLT::fixed_vector(32, 16);
const LLT V64S8 = LLT::fixed_vector(64, 8);

// 512-bit accumulators
const LLT ACC512 = LLT::fixed_vector(8, 64);

// 1024-bit vectors
const LLT V32S32 = LLT::fixed_vector(32, 32);
const LLT V64S16 = LLT::fixed_vector(64, 16);
const LLT V128S8 = LLT::fixed_vector(128, 8);

// 1024-bit accumulators
const LLT ACC1024 = LLT::fixed_vector(16, 64);

const LLT S128 = LLT::scalar(128);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: maybe put these in a separate header with if guards per generation? It would nice for these to be global in the file and these definitions will probably overlap quite a bit and you'd want to include it more parts of the instruction selection stack.


IMPLICIT.widenScalarToNextPow2(0).clampScalar(0, S32, S32);

getActionDefinitionsBuilder(G_CONSTANT)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These is a decent overlap of these definitions, which is a shame since it adds duplication. Maybe we can reuse the trick you used for the opcodes and define a base class with createXXXActionDefinitions calls which is overloaded by the architecture when needed. Then you only get the divergence when you need it.

Disclaimer: didn't test this yet

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did consider this, e.g. through the common CRTP pattern.
I felt the added complexity isn't really worth it, much of the legalization is straight forward. Being able to share the custom legalization code seemed like a good compromise to me

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I can see that. It is very much a DRY-code complexity trade off. My worry is mainly in having to deal with the duplication if the number of variants becomes very large. Then changes to the legalizer might require changes to 10 different files which need to be kept into sync. But that is a worry for the long term and something that can be dealt with then.

Copy link
Collaborator

@andcarminati andcarminati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Collaborator

@gbossu gbossu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for cleaning things up and preparing the future 🚀

@konstantinschwarz konstantinschwarz merged commit 061096c into aie-public Aug 21, 2024
8 checks passed
@konstantinschwarz konstantinschwarz deleted the kschwarz.refactor.legalizer branch August 21, 2024 11:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants