Change vec_scal_add examples to vec_scal_mul and cleaned up README references #1400

Merged 2 commits on Apr 24, 2024
@@ -100,4 +100,4 @@ clean: clean_trace

.PHONY: clean_trace
clean_trace:
rm -rf tmpTrace parse*.json
rm -rf tmpTrace parse*.json trace.txt
6 changes: 6 additions & 0 deletions programming_guide/quick_reference.md
@@ -49,6 +49,12 @@
| `print(ctx.module)` | Converts our ctx-wrapped structural code to MLIR and prints it to stdout |
| `ctx.module.operation.verify()` | Runs additional structural verification on the Python-bound source code and returns the result to stdout |

## Common AIE API functions for Kernel Programming
| Function Signature | Definition | Parameters | Return Type | Example |
|---------------------|------------|------------|-------------|---------|
| `aie::vector<T, vec_factor> my_vector` | Declare vector type | `T`: data type <br> `vec_factor`: vector width | n/a | `aie::vector<int16_t, 32> my_vector;` |
| `aie::load_v<vec_factor>(pA1);` | Vector load | `vec_factor`: vector width | `aie::vector` | `aie::vector<int16_t, 32> my_vector = aie::load_v<32>(pA1);` |

## Helpful AI Engine Architecture References and Tables
* [AIE2 - Table of supported data types and vector sizes (AIE API)](https://www.xilinx.com/htmldocs/xilinx2023_2/aiengine_api/aie_api/doc/group__group__basic__types.html)

2 changes: 1 addition & 1 deletion programming_guide/section-1/Makefile
@@ -6,7 +6,7 @@
#
##===----------------------------------------------------------------------===##

include ../../tutorials/makefile-common
include ../../programming_examples/makefile-common

build/aie.mlir: aie2.py
mkdir -p ${@D}
6 changes: 5 additions & 1 deletion programming_guide/section-3/Makefile
@@ -12,11 +12,15 @@ all: build/final.xclbin build/insts.txt

targetname = vectorScalar

build/aie.mlir: aie2.py
mkdir -p ${@D}
python3 $< > $@

build/scale.o: vector_scalar_mul.cc
mkdir -p ${@D}
cd ${@D} && xchesscc_wrapper ${CHESSCCWRAP2_FLAGS} -c $(<:%=../%) -o ${@F}

build/final.xclbin: aie.mlir build/kernel1.o build/kernel2.o build/kernel3.o
build/final.xclbin: build/aie.mlir build/scale.o
mkdir -p ${@D}
cd ${@D} && aiecc.py --aie-generate-cdo --no-compile-host --xclbin-name=${@F} \
--aie-generate-npu --npu-insts-name=insts.txt $(<:%=../%)
1 change: 0 additions & 1 deletion programming_guide/section-3/README.md
@@ -149,7 +149,6 @@ To compile the design and C++ testbench:

```sh
make
make build/vectorScalar.exe
```

To run the design:
14 changes: 3 additions & 11 deletions programming_guide/section-3/test.cpp
@@ -34,13 +34,10 @@ int main(int argc, const char *argv[]) {

test_utils::parse_options(argc, argv, desc, vm);
int verbosity = vm["verbosity"].as<int>();
int trace_size = vm["trace_sz"].as<int>();

constexpr bool VERIFY = true;
constexpr bool ENABLE_TRACING = false;
// constexpr int TRACE_SIZE = 8192;
constexpr int IN_SIZE = 4096;
constexpr int OUT_SIZE = ENABLE_TRACING ? IN_SIZE + trace_size / 4 : IN_SIZE;
constexpr int OUT_SIZE = IN_SIZE;

// Load instruction sequence
std::vector<uint32_t> instr_v =
@@ -64,7 +61,7 @@ int main(int argc, const char *argv[]) {
XRT_BO_FLAGS_HOST_ONLY, kernel.group_id(2));
auto bo_inFactor = xrt::bo(device, 1 * sizeof(DATATYPE),
XRT_BO_FLAGS_HOST_ONLY, kernel.group_id(3));
auto bo_outC = xrt::bo(device, OUT_SIZE * sizeof(DATATYPE) + trace_size,
auto bo_outC = xrt::bo(device, OUT_SIZE * sizeof(DATATYPE),
XRT_BO_FLAGS_HOST_ONLY, kernel.group_id(4));

if (verbosity >= 1)
@@ -85,7 +82,7 @@ int main(int argc, const char *argv[]) {

// Zero out buffer bo_outC
DATATYPE *bufOut = bo_outC.map<DATATYPE *>();
memset(bufOut, 0, OUT_SIZE * sizeof(DATATYPE) + trace_size);
memset(bufOut, 0, OUT_SIZE * sizeof(DATATYPE));

// sync host to device memories
bo_instr.sync(XCL_BO_SYNC_BO_TO_DEVICE);
@@ -120,11 +117,6 @@ int main(int argc, const char *argv[]) {
}
}

if (trace_size > 0) {
test_utils::write_out_trace(((char *)bufOut) + (IN_SIZE * sizeof(DATATYPE)),
trace_size, vm["trace_file"].as<std::string>());
}

// Print Pass/Fail result of our test
if (!errors) {
std::cout << std::endl << "PASS!" << std::endl << std::endl;
8 changes: 1 addition & 7 deletions programming_guide/section-3/test.py
@@ -15,7 +15,6 @@
from aie.extras.dialects.ext import memref, arith

import aie.utils.test as test_utils
import aie.utils.trace as trace_utils


def main(opts):
@@ -41,7 +40,7 @@ def main(opts):
INOUT1_SIZE = INOUT1_VOLUME * INOUT1_DATATYPE().itemsize
INOUT2_SIZE = INOUT2_VOLUME * INOUT2_DATATYPE().itemsize

OUT_SIZE = INOUT2_SIZE + int(opts.trace_size)
OUT_SIZE = INOUT2_SIZE

# ------------------------------------------------------
# Get device, load the xclbin & kernel and register them
@@ -99,11 +98,6 @@ def main(opts):
e = np.equal(output_buffer, ref)
errors = errors + np.size(e) - np.count_nonzero(e)

# Write trace values if trace_size > 0
if opts.trace_size > 0:
trace_buffer = entire_buffer[INOUT2_VOLUME:]
trace_utils.write_out_trace(trace_buffer, str(opts.trace_file))

# ------------------------------------------------------
# Print verification and timing results
# ------------------------------------------------------
70 changes: 0 additions & 70 deletions programming_guide/section-4/CMakeLists.txt

This file was deleted.

74 changes: 0 additions & 74 deletions programming_guide/section-4/aie2.py

This file was deleted.

10 changes: 7 additions & 3 deletions programming_guide/section-4/section-4a/Makefile
@@ -16,10 +16,14 @@ build/aie.mlir: aie2.py
mkdir -p ${@D}
python3 $< > $@

build/final.xclbin: build/aie.mlir
build/scale.o: vector_scalar_mul.cc
mkdir -p ${@D}
cd ${@D} && aiecc.py --aie-generate-cdo --aie-generate-npu --no-compile-host \
--xclbin-name=${@F} --npu-insts-name=insts.txt ${<F}
cd ${@D} && xchesscc_wrapper ${CHESSCCWRAP2_FLAGS} -c $(<:%=../%) -o ${@F}

build/final.xclbin: build/aie.mlir build/scale.o
mkdir -p ${@D}
cd ${@D} && aiecc.py --aie-generate-cdo --no-compile-host --xclbin-name=${@F} \
--aie-generate-npu --npu-insts-name=insts.txt $(<:%=../%)

${targetname}.exe: test.cpp
rm -rf _build
5 changes: 1 addition & 4 deletions programming_guide/section-4/section-4a/README.md
@@ -24,7 +24,7 @@ Adding the application timer is as simple as noting a start and stop time surrou

```c++
auto start = std::chrono::high_resolution_clock::now();
auto run = kernel(bo_instr, instr_v.size(), bo_inout0, bo_inout1, bo_inout2);
auto run = kernel(bo_instr, instr_v.size(), bo_inA, bo_inFactor, bo_outC);
run.wait();
auto stop = std::chrono::high_resolution_clock::now();

@@ -78,9 +78,6 @@ We can then compute and print the actual average, minimum and maximum run times.

1. Let's set our iterations to 10 and run again with `make run` which recompiles our host code for `test.cpp`. What reported Avg NPU time do you see this time? <img src="../../../mlir_tutorials/images/answer1.jpg" title="Answer can be anywhere from 430-480us but is likely different than before" height=25>

1. Let's change our design and increase the loop size of our kernel by a factor of 10. This involves changing the outer loop from 8 to 80. What reported times do you see now? <img src="../../../mlir_tutorials/images/answer1.jpg" title="? us" height=25>


-----
[[Up]](../../section-4) [[Next]](../section-4b)
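
The section-4a README hunk wraps the kernel call between two `std::chrono` timestamps and then mentions computing average, minimum and maximum run times. As a rough sketch of how that pattern might look in isolation (the names `TimingStats`, `summarize` and `time_runs` are hypothetical and do not come from this repository, and the timed work is a generic callable rather than the real XRT kernel launch):

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical summary of wall-clock times, in microseconds.
struct TimingStats {
  double avg_us;
  double min_us;
  double max_us;
};

// Reduce a non-empty list of per-iteration times to average, minimum, maximum.
TimingStats summarize(const std::vector<double> &times_us) {
  assert(!times_us.empty());
  double total = 0.0;
  for (double t : times_us)
    total += t;
  double lo = *std::min_element(times_us.begin(), times_us.end());
  double hi = *std::max_element(times_us.begin(), times_us.end());
  return {total / static_cast<double>(times_us.size()), lo, hi};
}

// Call `work` n_iterations times and record one start/stop pair per call.
// Assumption: in the testbench, `work` would launch the kernel and then
// block on run.wait(), so each sample covers a full kernel invocation.
template <typename Work>
std::vector<double> time_runs(Work &&work, int n_iterations) {
  std::vector<double> times_us;
  times_us.reserve(n_iterations);
  for (int i = 0; i < n_iterations; ++i) {
    auto start = std::chrono::high_resolution_clock::now();
    work();
    auto stop = std::chrono::high_resolution_clock::now();
    times_us.push_back(
        std::chrono::duration<double, std::micro>(stop - start).count());
  }
  return times_us;
}
```

With a lambda that performs the kernel call and wait as `work`, `summarize(time_runs(work, 10))` would produce the three figures for ten iterations. Keeping the per-iteration samples in a vector, rather than only a running total, is what makes the minimum and maximum available afterwards.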
