From 23e21377b72bf444a43ac0aa8139f0ed6548be55 Mon Sep 17 00:00:00 2001 From: Kristof Denolf Date: Wed, 24 Apr 2024 15:41:53 -0600 Subject: [PATCH] add note on low level intrinsics --- programming_guide/section-4/section-4c/README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/programming_guide/section-4/section-4c/README.md b/programming_guide/section-4/section-4c/README.md index 05b96ebf1d..15dbfd8fac 100644 --- a/programming_guide/section-4/section-4c/README.md +++ b/programming_guide/section-4/section-4c/README.md @@ -113,6 +113,8 @@ void scale_vectorized(T *a, T *c, int32_t factor, const int32_t N) { In this first example, the vectorization strategy was relatively straight forward. Instead of iterating over a vector of values and doing a single scalar multiply, we load a vector of input values, iterate over a smaller loop to perform a vector*scalar operation using the AIE API functions, and then store the vector of results back to local memory. +> **NOTE** - AIE API is a portable programming interface that is implemented as a C++ header-only library providing types and operations that get translated into generation specific efficient low-level intrinsics. AIE kernels can also be programmed directly in these low-level C++ intrinsics: [AIE1 Intrinsics User Guide - v2023.2](https://www.xilinx.com/htmldocs/xilinx2023_2/aiengine_intrinsics/intrinsics/index.html) and [AIE2 Intrinsics User Guide - v2023.2](https://www.xilinx.com/htmldocs/xilinx2023_2/aiengine_ml_intrinsics/intrinsics/index.html) + ## Vectorization Exercises 1. Let's take a look at the trace for our vector scalar design. First, let's edit our [vector_scalar_mul design](../../../programming_examples/basic/vector_scalar_mul/) so that the [aie2.py](../../../programming_examples/basic/vector_scalar_mul/aie2.py) source file has `vectorized=False`. In the [aie2.py](../../../programming_examples/basic/vector_scalar_mul/aie2.py) sourcee code, we simply select the scalar version of the kernel function. Then run `make trace`. After the trace compilation is complete, open `trace_vs.json` in https://ui.perfetto.dev and measure the delta between `event 0` and `event 1`. Note that in the Perfetto waveform, 1 us is equal to 1 clock cycle. How many cycles did you measure?