
Bump cmakeModules #624

Merged
merged 1 commit into from
Sep 5, 2023

Conversation

fifield
Collaborator

@fifield fifield commented Sep 5, 2023

The cmakeModules submodule was reverted to an earlier version in #573. I'm guessing this was inadvertent. This PR restores the submodule to the latest commit.

@fifield fifield requested a review from keryell September 5, 2023 19:55
@fifield fifield merged commit dae0939 into Xilinx:main Sep 5, 2023
5 checks passed
@fifield fifield deleted the cmake_submodule branch September 5, 2023 20:24
@keryell
Member

keryell commented Sep 5, 2023

Thanks for fixing my mess!

fifield added a commit to fifield/mlir-aie that referenced this pull request Nov 8, 2023
* install target.h for the memory allocator as well (Xilinx#606)

This is redundant, but reflects the true dependencies better.

* Fix path used for tests

Peano should come before the regular vitis path, due to name collisions.

* Add basic AIE2 tests.

* [AIE] Add decoding of DMA status

The test library now does this for both AIE1 and AIE2 DMAs.
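The status-decoding change above can be illustrated with a minimal sketch. The bit layout used here (status in bits 0–1, a stall flag in bit 2, channel number in bits 4–7) is purely hypothetical and does not reflect the real AIE register map; it only shows the general shape of unpacking a DMA status word into named fields.

```python
def decode_dma_status(word: int) -> dict:
    """Split a packed 32-bit DMA status word into named bitfields.

    The field layout here is illustrative only, not the actual AIE1/AIE2
    register encoding.
    """
    return {
        "status": word & 0x3,              # bits 0-1: state code
        "stalled": bool((word >> 2) & 1),  # bit 2: stall flag
        "channel": (word >> 4) & 0xF,      # bits 4-7: channel number
    }
```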

* [AIE] Add packet stream tests for ShimDMAs

This test looks at a common scenario where we have 3 tensors input
to a tile from 3 independent DMAs, but only 2 receiving tile DMAs.
Using packet routing, this scenario can be accommodated by
time-sharing one of the destination DMAs.

Obsoletes Xilinx#85

* Fix error message if no device.

We need to return with an error message, or later code will
segfault.
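The fix described above is a classic guard clause: fail early with a message instead of letting later code dereference a missing handle. A minimal sketch, with `find_device` as a stand-in for the real device lookup:

```python
def open_device(find_device):
    """Return (device, error). Fails early with a message when no
    device is found, rather than crashing in downstream code."""
    dev = find_device()
    if dev is None:
        return None, "error: no device found"
    return dev, None
```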

* Put exp lookup table into run_time_lib/AIE2 (Xilinx#604)

* Update chess_intrinsic_wrapper.cpp (Xilinx#610)

Remove event intrinsic declarations from AIEv1 wrapper

* Add TOSA tensor broadcast and mixed precision tests (Xilinx#609)

* Add the following TOSA integration tests to test/Integration/Dialect/TOSA/
* List of PASS tests:
i16xi16_add_elem (lane=32)
i16xi16_mul_elem (lane=32)
i16xi16_sel (lane=32)
i16xi16_sub_elem (lane=32)
i8xi8_add_elem (lane=64)
i8xi8_mul_elem (lane=32)
i8xi8_sel (lane=64)
i8xi8_sub_elem (lane=64)
bf16xbf16_sub_elem_2d_broadcast_1d (lane=16)
bf16xbf16_sub_elem_2d_broadcast_1d_reshape (lane=16)
* List of XFAIL tests:
i8xi16_sub_elem (lane=32)
bf16xbf16_sub_elem_2d_broadcast_2d (lane=16)
bf16xbf16_sub_elem_2d_broadcast_1d_unit_dim (lane=16)

* Fix include order (Xilinx#613)

* [aievec] Add hoisting patterns for arith.extsi

Hoisting cast operations as close as possible to the source of data can
make later patterns more robust to typical variations in the source
code.

We might need to revisit this one if, in the future, this process
causes unintended consequences.

* Implement inverse of a float by lookup tables (Xilinx#612)

* Fix some test failures (Xilinx#614)

* Move aiecc.py implementation to python library (Xilinx#387)

* Use correct macros for C API (Xilinx#615)

* capi

* reformat

* Re-export MLIR target set in CMake (Xilinx#617)

* mlirconfig

* reformat

* Disable `-Wno-unknown-warning-option` on windows (Xilinx#620)

* unknownwarning

* reformat

* win32 (Xilinx#618)

* Use new policy CMP0091 for MSVC (Xilinx#619)

* msvc

* reformat

* Revert "Use new policy CMP0091 for MSVC (Xilinx#619)" (Xilinx#622)

This reverts commit 1520898.

* Use upstream CMake macros to find python (Xilinx#616)

* cmake

* reformat

* Bump cmakeModules (Xilinx#624)

* ObjFifo unroll dependency fixes (Xilinx#621)

* Fixes for objFifo unrolling algorithm.

* EOF

* clang-format (Xilinx#625)

* Use target model functions to get number of DMA channels.

* Clang format

* Fix function call

* Add shim tiles that are not in NOC columns to the getNumDestShimMuxConnections() functions.

* Add isShimNOCorPLTile() to the target model.

* Add missing target model.

* Add improvements to the doc

* Add isShimNOCorPLTile() virtual function.

* Clang format

---------

Co-authored-by: abisca <[email protected]>
Co-authored-by: Joseph Melber <[email protected]>

* Add missing function implementation in IPU target model.

* Fix aiecc configure path

* Revert path change.

* Update paths in xclbin generation.

* Typo

* Fixed aie unit tests.

* Fix aievec test.

* xfail failing test.

Co-authored-by: Stephen Neuendorffer <[email protected]>
Co-authored-by: Hanchen Ye <[email protected]>
Co-authored-by: Lina Yu <[email protected]>
Co-authored-by: Jeff Fifield <[email protected]>
Co-authored-by: James Lin <[email protected]>
Co-authored-by: Javier Setoain <[email protected]>
Co-authored-by: Maksim Levental <[email protected]>
Co-authored-by: Andra Bisca <[email protected]>
Co-authored-by: Joseph Melber <[email protected]>
fifield added a commit to fifield/mlir-aie that referenced this pull request Nov 8, 2023
* need ONLY option to make cmake find numpy (Xilinx#630)

* Split ccache database according to the parallel jobs (Xilinx#600)

This fixes a race condition in the ccache database writing that happens
at the end of each job running in parallel.
By using a unique key per job, each database is correctly written and
can be reused by the next CI run.
Also use the real LLVM commit hash for the ccache database key in CI,
instead of the previous hack that assumed the textual commit was present
inside utils/clone-llvm.sh.
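The keying scheme described above can be sketched as follows. The key format and function name are illustrative assumptions, not the actual CI configuration; the point is that deriving the key from the real LLVM commit hash plus a per-job index keeps parallel jobs from writing the same cache entry.

```python
def ccache_key(llvm_commit: str, job_index: int) -> str:
    """Build a ccache key unique to one parallel CI job.

    Combines the (real) LLVM commit hash with the job index so that
    concurrent jobs never race on the same cache database.
    """
    return f"ccache-{llvm_commit[:12]}-job{job_index}"
```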

* Fix TOSA broadcast and mixed precision tests (Xilinx#631)

Fix the following TOSA tests:
- bf16xbf16_sub_elem_2d_broadcast_2d
- i8xi16_sub_elem
Add the following new TOSA tests:
- i16xi16_sub_elem_2d_broadcast_scalar (pass)
- i16xi16_sub_elem_2d_broadcast_1d_unit_dim (pass)
- bf16xbf16_sub_elem_2d_broadcast_scalar (xfail)

* Fix ordering of putStream intrinsic.

The argument order for the intrinsic didn't match.
Argument 0: channel #
Argument 1: value

* Fix decoding of tile status for stream stalls.

These were just nonsensical.

* Add end-to-end tests for CPU stream access.

* Fix intrinsic wrapper for aie2 acquire/release

* Explicitly compile intrinsic-dependent code with the chess frontend.

* [tests] Remove address.

This address is ignored, resulting in a warning.

* catch up to TOM MLIR (Xilinx#590)

* catch up to llvm TOM

* Update VectorToAIEVecConversions.cpp

* Get `VectorType` instead of `Type`

* format

* xfail opaque pointer related tests and update test

* update finalize-memref-to-llvm

---------

Co-authored-by: Javier Setoain <[email protected]>

* Add softmax test cases (Xilinx#635)

* Revised the xchess compilation commands for lut test cases. (Xilinx#636)

* Add more combined precision tosa tests (Xilinx#637)

Add the following passing element-wise tosa tests:
- i32xi32_add_elem (lane=32)
- i32xi32_mul_elem (lane=16)
- i32xi32_sel (lane=16)
- i32xi32_sub_elem (lane=32)

Add the following passing combined precision element-wise tosa tests:
- i8xi16_add_elem (lane=32)
- i8xi16_sub_elem (lane=32)
- i8xi32_add_elem (lane=32)
- i8xi32_sub_elem (lane=32)
- i16xi32_add_elem_v16 (lane=16)
- i16xi32_sub_elem_v16 (lane=16)

Add the following XFAIL combined precision element-wise tosa tests:
- i16xi32_add_elem_v32 (lane=32)
- i16xi32_sub_elem_v32 (lane=32)

* [aievec] Generalize vector passes

Right now, vectorization passes hook to FuncOp, which prevents
conversion to AIEVec within other top level operations, like AIE.device
ops.

This patch makes all passes generic and allows for conversion within
AIE.device.

* Implement tanh(x) based on linear approximation lookup tables (Xilinx#639)

* Refactor conversion of aievec.mul_elem to support combined precision (Xilinx#643)

* Refactor AIE-ML acc datatype emission
* Refactor arith.muli/mulf to aievec.mul_elem conversion pattern to make it extensible and clean
  - Reorganize the existing case-by-case patterns and decouple the pattern that requires two inputs to be the same type
  - Make it a cleaner pattern considering lhs/rhs/out datatype
  - Verified that all the dut.cc are identical before/after the refactor
* Add convertValueToTargetTypeAieML() which can be helpful for handling the vector lane mismatch issue later on.
* Add CPP emission for aievec.unpack op
* Add VectorToAIEVec lit tests to cover the lowering patterns
* Add new combined precision tosa tests for element-wise multiply:
  - i8xi16_mul_elem_v32 (out=i32, lane=32) (cycle count=144, PM=272), PASS
  - i8xi16_mul_elem_v16 (out=i32, lane=16) (cycle count=792, PM=368), XFAIL
    - No intent to work on this at the moment, but keep a record there
  - i16xi32_mul_elem (out=i32, lane=16) (cycle count=408, PM=384), PASS
  - i8xi32_mul_elem (out=i32, lane=16) (cycle count=728, PM=368), PASS

* Compute memref sizes by multiplying all shape sizes. (Xilinx#641)

Co-authored-by: abisca <[email protected]>
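The memref size computation above amounts to a product over the shape. A minimal sketch under assumed names (the real pass works on MLIR memref types, not Python lists):

```python
from functools import reduce
from operator import mul

def memref_size_bytes(shape, elem_bytes):
    """Total byte size of a memref: product of all shape dimensions
    times the element width. An empty shape (rank-0 memref) yields a
    single element."""
    return reduce(mul, shape, 1) * elem_bytes
```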

* [aievec][nfc] Clean-up aievec to llvm conversion

This code needed updating its use of a couple of constructs, and
namespaces.

* Add tosa-to-tensor pass to fix regression (Xilinx#645)

* Add tosa-to-tensor pass to fix regression of tosa broadcast tests

* Convert math.sqrt to a function call getSqrtBf16() for v16bfloat16 and v32bfloat16 types (Xilinx#646)

* Add comments for sqrt.h (Xilinx#648)

* Adding more tosa tests for combined precision inputs and broadcast (Xilinx#650)

* Add floatxfloat_sub_elem tosa test
* Add floatxfloat_add_elem tosa test
* Add floatxfloat_sel tosa test
* Add bf16xfloat_sub_elem tosa test
* Add bf16xfloat_add_elem tosa test
* Add i16xi16_sub_elem broadcast tests
* Add i8xi8_sub_elem broadcast tests
* Reorganize bf16xbf16 broadcast tosa tests
* Add floatxfloat_sub_elem broadcast tests
* Fix tosa lowering pipeline for bf16xbf16 sub_elem broadcast tests

* [aievec] Add missing conversion warnings for mac_elem and broadcast

This patch is a first step towards enabling AIEVec to LLVM Dialect
conversion for AIEml intrinsics.

* Add support of broadcast with vector width = 256 or 1024 and fix TOSA tests (Xilinx#653)

* Add support of broadcast_elem/broadcast_to_vxx for vector width == 256 (e.g. v16bf16) or 1024 (e.g. v32int32).
* Since we lower the vector.broadcast op to multiple aievec ops, we have to fix the FoldMulAddChainToConv pass to recognize the new aievec.broadcast patterns.
* Add the following list of PASS tests for implicit broadcast:
i32xi32_sub_elem_16x1024_broadcast_1
i32xi32_sub_elem_2d_broadcast_1d_unit_dim_v16 (out=i32, lane=16)
i32xi32_sub_elem_2d_broadcast_1d_unit_dim_v32 (out=i32, lane=32)
i32xi32_sub_elem_2d_broadcast_scalar_v16 (out=i32, lane=16)
i32xi32_sub_elem_2d_broadcast_scalar_v32 (out=i32, lane=32)
i32xi32_sub_elem_16x1024_broadcast_1024
i32xi32_sub_elem_2d_broadcast_1d_reshape_v16 (out=i32, lane=16)
i32xi32_sub_elem_2d_broadcast_1d_reshape_v32 (out=i32, lane=32)
i32xi32_sub_elem_2d_broadcast_1d_v16 (out=i32, lane=16)
i32xi32_sub_elem_2d_broadcast_1d_v32 (out=i32, lane=32)
i32xi32_sub_elem_2d_broadcast_2d_v16 (out=i32, lane=16)
i32xi32_sub_elem_2d_broadcast_2d_v32 (out=i32, lane=32)
* Add dut.cc reference for bf16xbf16_sub_elem_16x1024_broadcast_1 tests. The resulting dut.cc is legal, but it's blocked by a "broadcast_elem() of v32bfloat16" bug. Hence, the tests are still marked XFAIL.
* Add conversion test coverage for aievec.broadcast and aievec.broadcast_scalar in test_broadcast.mlir.
* Fix the i8xi16_mul_elem_v32 mlir script.

* Convert tosa.erf and math.erf to a function call getErfBf16() for v16bfloat16 and v32bfloat16 types (Xilinx#652)

* Enable use of mlir pass manager in aiecc (Xilinx#628)

* Enable use of mlir pass manager in aiecc

* clang-format

* limit scope of mlir context, rebase

* fixup

* Revert "catch up to TOM MLIR (Xilinx#590)" (Xilinx#656)

This reverts commit 47ff7d3.

* Make pathfinder aware of the arch-specific routing constraints (Xilinx#657)

* Convert math.rsqrt to a function call getRsqrtBf16() for v16bfloat16 and v32bfloat16 types and reorganize files in aie_runtime_lib (Xilinx#655)

* Add more add/sub/mul mixed precision tests (Xilinx#659)

* Refactor the tosa-to-vector pipelines script in each test to a central place at test/Integration/lit.local.cfg for better maintainability. Also, make sure each .mlir test is running in a unique workdir for placing multiple .mlir test in a single directory.
* For add/sub/mul mixed precision tests, we add tests with swapped inputs
* Per the TOSA spec at https://www.mlplatform.org/tosa/tosa_spec.html#_mul, we add test coverage for i16xi16_mul_elem_i32, and i8xi8_mul_elem_i32. Our refactored mul_elem lowering pattern works on these two cases directly, and the acctype for the i8/i16 mac intrinsics we used is i32.
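The choice of an i32 accumulator for the i8/i16 multiply tests above can be illustrated with a small model: products of narrow integers overflow the input type, so the MAC accumulates into a wider type. This is a pure-Python sketch of the widening behaviour, not the actual intrinsic.

```python
def mac_i8_to_i32(a, b, acc=0):
    """Multiply-accumulate two i8 sequences into a wide accumulator.

    Each product of two i8 values can reach 16384, so even a short sum
    exceeds the i16 range; the accumulator therefore has to be i32.
    """
    for x, y in zip(a, b):
        assert -128 <= x <= 127 and -128 <= y <= 127, "inputs must be i8"
        acc += x * y  # product held in the wide accumulator
    return acc
```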

* Enable AIEX dialect bindings (Xilinx#658)

* Enable AIEX dialect bindings

* Replace 'Aie' prefix with 'AIE' in python cmake

* Fixes after merge

* Apply xca_udm_dbg workaround to new tests

* Change runner to xsj

* XFAIL some tests after merge

Co-authored-by: Stephen Neuendorffer <[email protected]>
Co-authored-by: Hanchen Ye <[email protected]>
Co-authored-by: Lina Yu <[email protected]>
Co-authored-by: James Lin <[email protected]>
Co-authored-by: Javier Setoain <[email protected]>
Co-authored-by: Maksim Levental <[email protected]>
Co-authored-by: Andra Bisca <[email protected]>
Co-authored-by: abisca <[email protected]>
Co-authored-by: Joseph Melber <[email protected]>
Co-authored-by: Kristof Denolf <[email protected]>
Co-authored-by: Ronan Keryell <[email protected]>
Co-authored-by: Javier Setoain <[email protected]>
Co-authored-by: erwei-xilinx <[email protected]>