2020.2 Vitis™ Application Acceleration Development Flow Tutorials
See 2020.1 Vitis Application Acceleration Development Flow Tutorials
Getting Started with RTL Kernels¶
RTL designs that fit certain software and hardware interface requirements can be packaged into a Xilinx Object (.xo) file. This file can be linked into a binary container to create an xclbin file that the host code application uses to program the kernel into the FPGA.
This tutorial provides the following reference files:
A simple vector accumulation example that performs a B[i] = A[i]+B[i] operation.
A host application, which interacts with the kernel using OpenCL APIs:
The host creates read/write buffers to transfer the data between the host and the FPGA.
The host enqueues the RTL kernel (executed on the FPGA), which reads the buffers from DDR, performs B[i] = A[i]+B[i], and then writes the result back to DDR.
The host reads back the data to compare the results.
Using these reference files, the tutorial guides you from the first step of creating a Vitis™ IDE project to the final step of building and running your project.
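The final host step, comparing the data read back from the FPGA against an expected result, can be sketched in software. The following is a minimal illustration, not the tutorial's host code: the function names golden_accumulate and count_mismatches are hypothetical, and the sketch assumes both vectors hold int values and have the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Software reference ("golden") model of B[i] = A[i] + B[i].
// Returns the contents B is expected to hold after the kernel has run.
std::vector<int> golden_accumulate(const std::vector<int>& a,
                                   const std::vector<int>& b) {
    assert(a.size() == b.size());  // assumption: A and B have equal length
    std::vector<int> expected(b);
    for (std::size_t i = 0; i < a.size(); ++i) {
        expected[i] = a[i] + b[i];
    }
    return expected;
}

// Counts the elements where the data read back from the device differs
// from the reference result. A return value of 0 means the run passed.
std::size_t count_mismatches(const std::vector<int>& from_device,
                             const std::vector<int>& expected) {
    assert(from_device.size() == expected.size());
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < expected.size(); ++i) {
        if (from_device[i] != expected[i]) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

A host application would compute the reference result before launching the kernel, because B is overwritten in place, and then pass the buffer read back from the device to count_mismatches.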
Before You Begin¶
The labs in this tutorial use:
BASH Linux shell commands
2020.2 Vitis core development kit release and the xilinx_u200_xdma_201830_2 platform. If necessary, the tutorial can be easily extended to other versions and platforms.
Before running any of the examples, make sure you have the Vitis core development kit installed as described in Installation.
If you run applications on Xilinx® Alveo™ Data Center accelerator cards, ensure the card and software drivers have been correctly installed by following the instructions on the Alveo Portfolio page.
Accessing the Tutorial Reference Files¶
To access the reference files, type the following into a terminal:
git clone https://gitenterprise.xilinx.com/swm/Vitis-In-Depth-Tutorial.git
Navigate to the Hardware_Accelerators/Feature_Tutorials/01-rtl_kernel_workflow/ directory, and then access the reference files.
Requirements for Using an RTL Design as an RTL Kernel¶
To use an RTL kernel within the Vitis IDE, it must meet both the Vitis core development kit execution model and the hardware interface requirements as described in RTL Kernels in the Application Acceleration Development flow of the Vitis Unified Software Platform Documentation (UG1416).
Kernel Execution Model¶
RTL kernels use the same software interface and execution model as C/C++ kernels. They are seen by the host application as functions with a void return value, scalar arguments, and pointer arguments. For instance:
void vadd_A_B(int *a, int *b, int scalar)
This implies that an RTL kernel has an execution model like a software function:
It must start when called.
It is responsible for processing data to provide the necessary results.
It must send a notification when processing is complete.
As described in Supported Kernel Execution Models, the Vitis core development kit execution models of ap_ctrl_chain specifically rely on the following mechanics and assumptions:
Scalar arguments are passed to the kernel through an AXI4-Lite slave interface.
Pointer arguments are transferred through global memory (DDR, HBM, or PLRAM).
Base addresses of pointer arguments are passed to the kernel through its AXI4-Lite slave interface.
Kernels access pointer arguments in global memory through one or more AXI4 master interfaces.
Kernels are started by the host application through their AXI4-Lite interface.
Kernels must notify the host application when they complete the operation through their AXI4-Lite interface or a special interrupt signal.
The ap_ctrl_none execution model does not rely on these features, and lets you establish streaming kernels that are self-starting and continuously running.
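The mechanics listed above can be pictured with a small software model. This is a hedged sketch only: KernelModel and host_run are hypothetical names, global memory is modeled as a vector indexed by word rather than by byte address, and the scalar argument is assumed to carry the element count, which the text does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software stand-in for a kernel with a start/done handshake.
// None of these names come from the Vitis tools.
struct KernelModel {
    // Values the host would write through the AXI4-Lite slave interface.
    std::uint32_t scalar = 0;  // scalar argument (assumed: element count)
    std::uint64_t base_a = 0;  // base of A in global memory (word index here)
    std::uint64_t base_b = 0;  // base of B in global memory (word index here)
    bool start = false;
    bool done = false;

    // One activation of the kernel. Nothing happens until start is asserted.
    void run(std::vector<std::int32_t>& global_memory) {
        if (!start) {
            return;
        }
        for (std::uint32_t i = 0; i < scalar; ++i) {
            global_memory[base_b + i] += global_memory[base_a + i];
        }
        start = false;  // start is cleared once the work is finished
        done = true;    // completion notification to the host
    }
};

// Host-side sequence: place data in global memory, pass the arguments,
// start the kernel, wait for done, then read B back.
std::vector<std::int32_t> host_run(const std::vector<std::int32_t>& a,
                                   const std::vector<std::int32_t>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    std::vector<std::int32_t> global_memory(2 * n);  // A at word 0, B at word n
    for (std::size_t i = 0; i < n; ++i) {
        global_memory[i] = a[i];
        global_memory[n + i] = b[i];
    }

    KernelModel kernel;
    kernel.scalar = static_cast<std::uint32_t>(n);
    kernel.base_a = 0;
    kernel.base_b = n;
    kernel.start = true;

    while (!kernel.done) {  // the host polls until the kernel reports done
        kernel.run(global_memory);
    }
    return std::vector<std::int32_t>(
        global_memory.begin() + static_cast<std::ptrdiff_t>(n),
        global_memory.end());
}
```

On real hardware the polling loop would be replaced by the runtime waiting on the kernel's done status or interrupt. The model only shows the order of events.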
Hardware Interface Requirements¶
To comply with the ap_ctrl_hs execution model, the RTL design in this tutorial satisfies the following specific hardware interface requirements:
One and only one AXI4-Lite slave interface used to access programmable registers (control registers, scalar arguments, and pointer base addresses).
0x00 - Control Register: Controls and provides kernel status.
Bit 0: start signal: Asserted by the host application when the kernel can start processing data. Must be cleared when the done signal is asserted.
Bit 1: done signal: Asserted by the kernel when it has completed operation. Cleared on read.
Bit 2: idle signal: Asserted by the kernel when it is not processing any data. The transition from Low to High should occur synchronously with the assertion of the done signal.
0x04 - Global Interrupt Enable Register: Used to enable interrupt to the host.
0x08 - IP Interrupt Enable Register: Used to control which IP-generated signal is used to generate an interrupt.
0x0C - IP Interrupt Status Register: Provides interrupt status.
0x10 and above - Kernel Argument Register(s): Register for scalar parameters and base addresses for pointers.
One or more of the following interfaces:
AXI4 master interface to communicate with global memory.
All AXI4 master interfaces must have 64-bit addresses.
The kernel developer is responsible for partitioning global memory spaces. Each partition in the global memory becomes a kernel argument. The base address (memory offset) for each partition must be set by a control register programmable through the AXI4-Lite slave interface.
AXI4 masters must not use Wrap or Fixed burst types, and they must not use narrow (sub-size) bursts. This means that AxSIZE should match the width of the AXI data bus.
Any user logic or RTL code that does not conform to the requirements above must be wrapped or bridged.
AXI4-Stream interface to communicate with other kernels.
If the original RTL design uses a different execution model or hardware interface, you must add logic to ensure that the design behaves in the expected manner and complies with interface requirements.
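Two of the requirements above lend themselves to a short software illustration: the behavior of the start, done, and idle bits in the control register, and the rule that AXI4 masters use only full-width incrementing bursts. The sketch below is an assumption-laden model, not RTL or a Vitis API. ControlRegister, burst_is_allowed, and the constant names are hypothetical, and the register is assumed to report idle after reset. The burst type encodings and the meaning of AxSIZE (2^AxSIZE bytes per beat) follow the AXI protocol.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions in the control register at offset 0x00, as listed above.
constexpr std::uint32_t kStart = 1u << 0;
constexpr std::uint32_t kDone = 1u << 1;
constexpr std::uint32_t kIdle = 1u << 2;

// Hypothetical model of the control register behavior.
class ControlRegister {
public:
    // Host write: asserting start also takes the kernel out of idle.
    void host_write(std::uint32_t value) {
        if (value & kStart) {
            bits_ |= kStart;
            bits_ &= ~kIdle;
        }
    }
    // Kernel side: called when processing completes.
    void kernel_finish() {
        bits_ &= ~kStart;         // start is cleared when done is asserted
        bits_ |= kDone | kIdle;   // idle goes High together with done
    }
    // Host read: returns the current value, then clears done (clear on read).
    std::uint32_t host_read() {
        const std::uint32_t value = bits_;
        bits_ &= ~kDone;
        return value;
    }

private:
    std::uint32_t bits_ = kIdle;  // assumption: idle after reset
};

// AXI burst type encodings (AxBURST).
enum class BurstType : std::uint8_t { Fixed = 0, Incr = 1, Wrap = 2 };

// True if a burst obeys the rules above: no Fixed or Wrap bursts, and
// 2^AxSIZE bytes per beat must equal the full width of the data bus.
bool burst_is_allowed(BurstType type, unsigned axsize, unsigned data_width_bits) {
    assert(axsize <= 7);  // AxSIZE is a 3-bit field
    const unsigned bytes_per_beat = 1u << axsize;
    return type == BurstType::Incr && bytes_per_beat * 8u == data_width_bits;
}
```

For example, on a 512-bit data bus only incrementing bursts with AxSIZE equal to 6 (64 bytes per beat) pass the check.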
Vector-Accumulate RTL IP¶
For this tutorial, the Vector-Accumulate RTL IP performing B[i]=A[i]+B[i] meets all the requirements described above and has the following characteristics:
Two AXI4 memory mapped interfaces:
One interface is used to read A
One interface is used to read and write B
The AXI4 masters used in this design do not use wrap, fixed, or narrow burst types.
An AXI4-Lite slave control interface:
Control register at offset 0x00
Kernel argument register at offset 0x10 allowing the host to pass a scalar value to the kernel
Kernel argument register at offset 0x18 allowing the host to pass the base address of A in global memory to the kernel
Kernel argument register at offset 0x24 allowing the host to pass the base address of B in global memory to the kernel
These specifications serve as the basis for building your own RTL Kernel from an existing RTL module, or serve as inputs to the RTL Kernel Wizard.
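To make the argument register layout concrete, the sketch below builds the set of register writes a host would issue for one kernel call. It is an illustration under stated assumptions rather than the tutorial's implementation: the names RegisterWrites, write_address, and build_argument_writes are hypothetical, registers are assumed to be 32 bits wide, and each 64-bit base address is assumed to occupy two consecutive registers with the low word at the lower offset.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Argument register offsets of the Vector-Accumulate IP listed above.
constexpr std::uint32_t kScalarOffset = 0x10;
constexpr std::uint32_t kBaseAOffset = 0x18;
constexpr std::uint32_t kBaseBOffset = 0x24;

// Offset -> 32-bit value: the writes a host would issue over AXI4-Lite.
using RegisterWrites = std::map<std::uint32_t, std::uint32_t>;

// Splits a 64-bit address into two 32-bit register writes.
// Assumption: low word at `offset`, high word at `offset + 4`.
void write_address(RegisterWrites& regs, std::uint32_t offset,
                   std::uint64_t address) {
    assert(offset % 4 == 0);  // registers are 32-bit aligned
    regs[offset] = static_cast<std::uint32_t>(address & 0xFFFFFFFFu);
    regs[offset + 4] = static_cast<std::uint32_t>(address >> 32);
}

// Collects the argument writes for one call: the scalar value, then the
// base addresses of A and B in global memory.
RegisterWrites build_argument_writes(std::uint32_t scalar,
                                     std::uint64_t address_a,
                                     std::uint64_t address_b) {
    RegisterWrites regs;
    regs[kScalarOffset] = scalar;
    write_address(regs, kBaseAOffset, address_a);
    write_address(regs, kBaseBOffset, address_b);
    return regs;
}
```

In a real system the runtime performs these writes on behalf of the host application when kernel arguments are set, so application code normally never touches the offsets directly.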
This tutorial demonstrates how to package RTL IPs as Vitis kernels (.xo), and use them in the Vitis core development kit. The tutorial offers two different approaches to accomplish this goal:
Package IP/Package XO: Start by packaging an existing RTL module as Vivado IP, and package that IP as a Vitis kernel (.xo).
RTL Kernel Wizard: Use the RTL Kernel wizard to create the elements of an RTL kernel, and fit an existing RTL module into that framework.
Copyright© 2020 Xilinx