2020.1 Vitis™ Application Acceleration Development Flow Tutorials

See 2019.2 Vitis Application Acceleration Development Flow Tutorials

Getting Started with RTL Kernels

Tutorial Overview

RTL designs that fit certain software and hardware interface requirements can be packaged into a Xilinx Object (.xo) file. This file can be linked into a binary container to create an xclbin file that the host code application uses to program the kernel into the FPGA.

This tutorial provides the following reference files:

  • A simple vector accumulation example that performs a B[i] = A[i]+B[i] operation.

  • A host application, which interacts with the kernel using OpenCL APIs:

    • The host creates ready/write buffers to transfer the data between the host and the FPGA.

    • The host enqueues the RTL kernel (executed on the FPGA), which reads the buffer of the DDR, performs B[i] = A[i]+B[i], and then writes the result back to the DDR.

    • The host reads back the data to compare the results.

Using these reference files, the tutorial guides you from the first step of creating a Vitis™ IDE project to the final step of building and running your project.

Before You Begin

The labs in this tutorial use:

  • BASH Linux shell commands

  • 2020.1 Vitis core development kit release and the xilinx_u200_xdma_201830_2 platform. If necessary, it can be easily extended to other versions and platforms.


  • Before running any of the examples, make sure you have the Vitis core development kit installed as described in Installation.

  • If you run applications on Xilinx® Alveo™ Data Center accelerator cards, ensure the card and software drivers have been correctly installed by following the instructions on the Alveo Portfolio page.

Accessing the Tutorial Reference Files

  1. To access the reference files, type the following into a terminal: git clone https://github.com/Xilinx/Vitis-Tutorials.

  2. Navigate to getting-started-rtl-kernels directory, and then access the reference-files directory.

Requirements for Using an RTL Design as an RTL Kernel

To use an RTL kernel within the Vitis IDE, it must meet both the Vitis core development kit execution model and the hardware interface requirements as described in RTL Kernels in the in the Application Acceleration Development flow of the Vitis Unified Software Platform Documentation (UG1416).

Kernel Execution Model

RTL kernels uses the same software interface and execution model as C/C++ kernels. They are seen by the host application as functions with a void return value, scalar arguments, and pointer arguments. For instance:

void vadd_A_B(int *a, int *b, int scalar)

This implies that an RTL kernel has an execution model like a software function:

  • It must start when called.

  • It is responsible for processing data to provide the necessary results.

  • It must send a notification when processing is complete.

The Vitis core development kit execution model specifically relies on the following mechanics and assumptions:

  • Scalar arguments are passed to the kernel through an AXI4-Lite slave interface.

  • Pointer arguments are transferred through global memory (DDR, HBM, or PLRAM).

  • Base addresses of pointer arguments are passed to the kernel through its AXI4-Lite slave interface.

  • Kernels access pointer arguments in global memory through one or more AXI4 master interfaces.

  • Kernels are started by the host application through its AXI4-Lite interface.

  • Kernels must notify the host application when they complete the operation through its AXI4-Lite interface or a special interrupt signal.

Hardware Interface Requirements

To comply with this execution model, the Vitis core development kit requires that a kernel satisfies the following specific hardware interface requirements:

  • One and only one AXI4-Lite slave interface used to access programmable registers (control registers, scalar arguments, and pointer base addresses).

    • Offset 0x00 - Control Register: Controls and provides kernel status

      • Bit 0: start signal: Asserted by the host application when kernel can start processing data. Must be cleared when the done signal is asserted.

      • Bit 1: done signal: Asserted by the kernel when it has completed operation. Cleared on read.

      • Bit 2: idle signal: Asserted by this signal when it is not processing any data. The transition from Low to High should occur synchronously with the assertion of the done signal.

    • Offset 0x04- Global Interrupt Enable Register: Used to enable interrupt to the host.

    • Offset 0x08- IP Interrupt Enable Register: Used to control which IP generated signal is used to generate an interrupt.

    • Offset 0x0C- IP Interrupt Status Register: Provides interrupt status

    • Offset 0x10 and above - Kernel Argument Register(s): Register for scalar parameters and base addresses for pointers.

  • One or more of the following interfaces:

    • AXI4 master interface to communicate with global memory.

      • All AXI4 master interfaces must have 64-bit addresses.

      • The kernel developer is responsible for partitioning global memory spaces. Each partition in the global memory becomes a kernel argument. The base address (memory offset) for each partition must be set by a control register programmable through the AXI4-Lite slave interface.

      • AXI4 masters must not use Wrap or Fixed burst types, and they must not use narrow (sub-size) bursts. This means that AxSIZE should match the width of the AXI data bus.

      • Any user logic or RTL code that does not conform to the requirements above must be wrapped or bridged.

    • AXI4-Stream interface to communicate with other kernels.

If the original RTL design uses a different execution model or hardware interface, you must add logic to ensure that the design behaves in the expected manner and complies with interface requirements.

Vector-Accumulate RTL IP

For this tutorial, the Vector-Accumulate RTL IP performing B[i]=A[i]+B[i] meets all the requirements described above and has the following characteristics:

  • Two AXI4 memory mapped interfaces:

    • One interface is used to read A

    • One interface is used to read and write B

    • The AXI4 masters used in this design do not use wrap, fixed, or narrow burst types.

  • An AXI4-Lite slave control interface:

    • Control register at offset 0x00

    • Kernel argument register at offset 0x10 allowing the host to pass a scalar value to the kernel

    • Kernel argument register at offset 0x18 allowing the host to pass the base address of A in global memory to the kernel

    • Kernel argument register at offset 0x24 allowing the host to pass the base address of B in global memory to the kernel

These specifications serve as the basis for building your own RTL Kernel from an existing RTL module, or serve as inputs to the RTL Kernel Wizard.

Next Steps

This tutorial demonstrates how to package RTL IPs as Vitis kernels (.xo), and use them in the Vitis core development kit. The tutorial offers two different approaches to accomplish this goal:

TIP: Start by packaging an existing RTL module as Vivado IP, and package that IP as a Vitis kernel (.xo). This is the Package IP/Package XO flow. Then use the RTL Kernel wizard to create the elements of an RTL kernel, and fit the existing RTL module into that framework. This is the RTL Kernel Wizard flow. The Package IP/Package XO flow is simpler and more direct, but the RTL Kernel Wizard flow offers more options.

After creating the RTL Kernel, you will use the Vitis IDE to test the kernel in a Vitis application project, in hardware emulation.

Return to Main Page

Copyright© 2020 Xilinx