AI Engine Development
The methodology for developing optimized accelerated applications comprises two major phases: architecting the application and developing the kernels. In the first phase, you make key decisions about the application architecture by determining which software functions should be accelerated onto ACAP kernels, how much parallelism can be achieved, and how to deliver it in code. In the second phase, you implement the kernels by structuring the source code and applying the necessary build options to create the kernel architecture needed to achieve the optimized performance target. The following examples illustrate the use of this methodology in real-world applications.
Design Tutorials
Tutorial | Description |
---|---|
This tutorial uses the LeNet algorithm to implement a system-level design that performs image classification using the AI Engine and programmable logic (PL), including block RAM (BRAM). The design demonstrates functional partitioning between the AI Engine and PL. It also highlights memory partitioning and hierarchy among DDR memory, PL (BRAM), and AI Engine memory. |
|
Provides a methodology that helps you make appropriate implementation choices depending on the filter characteristics, with examples of how to implement Super Sampling Rate (SSR) FIR filters on a Versal ACAP AI Engine processor array. |
|
This tutorial demonstrates the creation and emulation of an AIE design including the Adaptive DataFlow (ADF) graph, RTL kernels, and a custom VCK190 platform. |
Feature Tutorials
Tutorial | Description |
---|---|
This tutorial introduces a complete end-to-end flow for a bare-metal host application using AI Engines and PL kernels. |
|
Introduces the use of global memory I/O (GMIO) for sharing data between the AI Engines and external DDR memory. |
|
Learn how to dynamically update AI Engine runtime parameters. |
|
This tutorial illustrates how to use data packet switching with AI Engine designs to optimize efficiency. |
|
This tutorial demonstrates creating a system design that runs on the AI Engine, PS, and PL, and validating the design across these heterogeneous domains by running hardware emulation. |
|
This tutorial demonstrates clocking concepts for the Vitis compiler. It shows how to define clocking for ADF graph PL kernels and PLIO kernels using the clocking automation functionality. |
|
These examples demonstrate floating-point vector computations in the AI Engine. |
|
This tutorial demonstrates how to debug a multi-processor application on the Versal ACAP AI Engines, using a beamformer example design. The tutorial illustrates functional debug and performance-level debug techniques. |
|
This tutorial shows how to design AI Engine applications using Model Composer. This set of Simulink blocksets is used to demonstrate integrating RTL/HLS blocks for the programmable logic, as well as AI Engine blocks for the AI Engine array. |
|
This tutorial demonstrates how you can use the Vivado logic simulator (XSIM) waveform GUI and the Vitis analyzer to debug and analyze your design for a Versal ACAP. |
|
This tutorial shows how to use AXI Traffic Generators to provide input and capture output from an AI Engine kernel in hardware emulation. |