-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pass to assign BD IDs to NPU DMA ops and refactor AMDAIEAssignBufferDescriptorIDsPass
#551
Conversation
/// Return a map from channels to valid BD ids for the requested tile type. | ||
/// TODO(jornt): find these ranges in the device model. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's here https://github.com/Xilinx/aie-rt/blob/c28d57972957d8032839f4da5d2e3b9dd01805ea/driver/src/global/xaie2ipugbl_reginit.c#L389-L391 but they don't distinguish between even/odd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, they have the real numbers harcoded inside a verifier here: https://github.com/Xilinx/aie-rt/blob/1ad203de0b7f282b1c0659fd2ae3f218652c7274/driver/src/dma/xaie_dma_aieml.c#L1216
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can send them a PR to factor that out (but also that's a good util to use somewhere around here too)
compiler/plugins/target/AMD-AIE/iree-amd-aie/Utils/ChannelBdIdGenerator.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/test/lower_to_aie.mlir
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIELowerToAIE.cpp
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIEAssignNpuDmaBdIds.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIEAssignNpuDmaBdIds.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly fine - a few nittish type things to debate - but can you put a description somewhere (eg this PR itself) for the rules for assigning bdids re npuDmaOp.getSourceMemorySpaceAsUInt()
and npuDmaOp.getTargetMemorySpaceAsUInt()
.
npuDmaOp.getTargetMixedSizes(), npuDmaOp.getTargetMixedStrides(), | ||
npuDmaOp.getSourceMixedOffsets(), npuDmaOp.getSourceMixedSizes(), | ||
npuDmaOp.getSourceMixedStrides(), nullptr, bdIdOp); | ||
} else if (npuDmaOp.getTargetMemorySpaceAsUInt() == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these conditionals aren't mutually exclusive (I don't think? maybe the first two are but the last one definitely isn't, at least not according to first principles) so I would prefer if each conditional were separate and each just did a return WalkResult::advance();
at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these conditionals aren't mutually exclusive
In principal you could create a NPU DMA operation with both source and target on L3/DDR, but that doesn't make much sense in the context of AIE. However, maybe this is not a good place to bake in any of those 'assumptions' and I can make it two separate if
statements, but with a single return WalkResult::advance();
at the end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally like early returns in ifs because then you don't have to hunt for the exit. And also it prevents multiple ifs firing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussed offline
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIEAssignNpuDmaBdIds.cpp
Outdated
Show resolved
Hide resolved
// towards other DMAs and future hardware generations, channel | ||
// assignment should happen before BD assignemnt. This requires more | ||
// refactoring. | ||
std::optional<uint32_t> bdId = generator.getAndAssignBdId(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i guess the next step is to put channel
on NpuDmaCpyNdOp
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think channel
belongs on the CircularDmaCpyNd
operations which are referenced by the NpuDmaCpyNd
ops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm really? I guess I haven't been paying attention to these ops. That's different from mlir-aie (obv) so that's why I was expecting a channel on this op.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The connections (routing, channel/port selection) is set up based on the CircularDmaCpyNd
ops after lowering to AIE dialect. As it's set up now, the NPU instructions shouldn't change the channels/DMA ports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weird to me (again only because coming from the mlir-aie perspective) but we can discuss it later/elsewhere.
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIEAssignNpuDmaBdIds.cpp
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIEAssignNpuDmaBdIds.cpp
Show resolved
Hide resolved
e6a55a6
to
7e88dea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// towards other DMAs and future hardware generations, channel | ||
// assignment should happen before BD assignemnt. This requires more | ||
// refactoring. | ||
std::optional<uint32_t> bdId = generator.getAndAssignBdId(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weird to me (again only because coming from the mlir-aie perspective) but we can discuss it later/elsewhere.
npuDmaOp.getTargetMixedSizes(), npuDmaOp.getTargetMixedStrides(), | ||
npuDmaOp.getSourceMixedOffsets(), npuDmaOp.getSourceMixedSizes(), | ||
npuDmaOp.getSourceMixedStrides(), nullptr, bdIdOp); | ||
} else if (npuDmaOp.getTargetMemorySpaceAsUInt() == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussed offline
Adds a pass to generate BD IDs for NPU DMA operations that avoids race conditions with BDs being reconfigured during execution by some DMA.
This only assigns BD IDs on the L3 side of NPU DMA operations as the backend can currently only handle Shim DMA instructions and MemTile/core DMA configurations are assumed to be part of the xclbin. Therefore, you will see the checks like
npuDmaOp.getSourceMemorySpaceAsUInt() == 0
andnpuDmaOp.getTargetMemorySpaceAsUInt() == 0
being used (memory space0
is L3).BD IDs are represented with an
amdaie.bd_id
operation, which takes a tile and an integer value as operands as every tile has its own set of BDs. This helps with guaranteeing/verifying correct reuse of the same id across DMA operations. This usage was derived from Xilinx/mlir-aie#1610