.. _xrt_native_apis.rst:

XRT Native APIs
===============

From the 2020.2 release, XRT provides a new XRT API set in C, C++, and Python flavors. To use the native XRT APIs, the host application must link with the **xrt_coreutil** library. Compiling host code with the XRT native C++ API requires the C++ standard ``-std=c++14`` or newer. On GCC versions older than 4.9.0, please use ``-std=c++1y`` instead, because ``-std=c++14`` was only introduced in GCC 4.9.0.

Example g++ command

.. code-block:: shell

   g++ -g -std=c++14 -I$XILINX_XRT/include -L$XILINX_XRT/lib -o host.exe host.cpp -lxrt_coreutil -pthread

The XRT native API supports both C and C++ flavors of APIs. For general host code development, the C++-based APIs are recommended, hence this document only describes the C++-based API interfaces. The full Doxygen-generated C and C++ API documentation can be found in `<./xrt_native.main.rst>`_.

The C++ class objects used for the APIs are

+----------------------+-------------------+--------------------------------------------+
|                      | C++ Class         | Header files                               |
+======================+===================+============================================+
| Device               | ``xrt::device``   | ``#include <xrt/xrt_device.h>``            |
+----------------------+-------------------+--------------------------------------------+
| XCLBIN               | ``xrt::xclbin``   | ``#include <experimental/xrt_xclbin.h>``   |
+----------------------+-------------------+--------------------------------------------+
| Buffer               | ``xrt::bo``       | ``#include <xrt/xrt_bo.h>``                |
+----------------------+-------------------+--------------------------------------------+
| Kernel               | ``xrt::kernel``   | ``#include <xrt/xrt_kernel.h>``            |
+----------------------+-------------------+--------------------------------------------+
| Run                  | ``xrt::run``      | ``#include <xrt/xrt_kernel.h>``            |
+----------------------+-------------------+--------------------------------------------+
| User-managed Kernel  | ``xrt::ip``       | ``#include <experimental/xrt_ip.h>``       |
+----------------------+-------------------+--------------------------------------------+
| Graph                | ``xrt::graph``    | ``#include <experimental/xrt_graph.h>``    |
|                      |                   |                                            |
|                      |                   | ``#include <experimental/xrt_aie.h>``      |
+----------------------+-------------------+--------------------------------------------+

The majority of the core data structures are defined in the header files in the ``$XILINX_XRT/include/xrt/`` directory. A few newer features, such as the ``xrt::ip`` and ``xrt::aie`` related header files, are inside the ``$XILINX_XRT/include/experimental`` directory. The API interfaces in the experimental folder are subject to breaking changes.

The common host code flow using the above data structures is as below

- Open the Xilinx **Device** and load the **XCLBIN**
- Create **Buffer** objects to transfer data to the kernel inputs and outputs
- Use the Buffer class member functions for data transfer between host and device (before and after the kernel execution)
- Use the **Kernel** and **Run** objects to offload and manage the compute-intensive tasks running on the FPGA

Below we will walk through the common API usage to accomplish the above tasks.

Device and XCLBIN
-----------------

The Device and XCLBIN classes provide the fundamental infrastructure-related interfaces. The primary objectives of the device and XCLBIN related APIs are

- Open a device
- Load the compiled kernel binary (or XCLBIN) onto the device

The simplest code to load an XCLBIN is as below

.. code:: c++
   :number-lines: 10

   unsigned int dev_index = 0;
   auto device = xrt::device(dev_index);
   auto xclbin_uuid = device.load_xclbin("kernel.xclbin");

The above code block shows

- The ``xrt::device`` class constructor is used to open the device (enumerated as 0)
- The member function ``xrt::device::load_xclbin`` is used to load the XCLBIN from the file name
- The member function ``xrt::device::load_xclbin`` returns the XCLBIN UUID, which is required to open a kernel (refer to the Kernel section)

The class constructor ``xrt::device::device(const std::string& bdf)`` also supports opening a device object from a PCIe BDF passed as a string.
.. code:: c++
   :number-lines: 10

   auto device = xrt::device("0000:03:00.1");

The member function ``xrt::device::get_info()`` is useful for obtaining necessary information about a device. Some of the information, such as the name and BDF, can be used to select a specific device to load an XCLBIN

.. code:: c++
   :number-lines: 10

   std::cout << "device name: " << device.get_info<xrt::info::device::name>() << "\n";
   std::cout << "device bdf: " << device.get_info<xrt::info::device::bdf>() << "\n";

Buffers
-------

Buffers are primarily used to transfer data between the host and the device. The buffer-related APIs are discussed in the following three subsections

1. Buffer allocation and deallocation
2. Data transfer using buffers
3. Miscellaneous other buffer APIs

1. Buffer allocation and deallocation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The class constructor ``xrt::bo`` is mainly used to allocate a 4K-aligned buffer object. By default, a regular buffer is created (optionally, the user can create other types of buffers by providing a flag).

.. code:: c++
   :number-lines: 15

   auto bank_grp_arg0 = kernel.group_id(0); // Memory bank index for kernel argument 0
   auto bank_grp_arg1 = kernel.group_id(1); // Memory bank index for kernel argument 1

   auto input_buffer = xrt::bo(device, buffer_size_in_bytes, bank_grp_arg0);
   auto output_buffer = xrt::bo(device, buffer_size_in_bytes, bank_grp_arg1);

In the above code, the ``xrt::bo`` buffer objects are created using the class constructor. Please note the following

- As no special flags are used, regular buffers are created. A regular buffer is the most common type of buffer; it has a host backing pointer allocated by user space in heap memory and a device buffer allocated in the specified memory bank.
- The second argument specifies the buffer size.
- The third argument specifies the enumerated memory bank index (the buffer location) where the buffer should be allocated.
There are two ways to specify the memory bank index

- Through kernel arguments: In the above example, the ``xrt::kernel::group_id()`` member function is used to pass the memory bank index. This member function accepts a kernel argument index and automatically detects the corresponding memory bank index by inspecting the XCLBIN.
- Passing the memory bank index directly: ``xrt::kernel::group_id()`` also accepts a direct memory bank index (as observed in the ``xbutil examine --report memory`` output).

Creating special Buffers
************************

The ``xrt::bo()`` constructors accept several other buffer flags, which are described using an ``enum class`` argument with the following enumerator values

- ``xrt::bo::flags::normal``: Regular buffer (default)
- ``xrt::bo::flags::device_only``: Device-only buffer (meant to be used only by the kernel; there is no host backing pointer)
- ``xrt::bo::flags::host_only``: Host-only buffer (the buffer resides in the host memory and is directly transferred to/from the kernel)
- ``xrt::bo::flags::p2p``: P2P buffer, a special type of device-only buffer capable of peer-to-peer transfer
- ``xrt::bo::flags::cacheable``: Cacheable buffer, which can be used when the host CPU frequently accesses the buffer (applicable to edge platforms)

The below example shows creating a P2P buffer on a device memory bank connected to argument 3 of the kernel.

.. code:: c++
   :number-lines: 15

   auto p2p_buffer = xrt::bo(device, buffer_size_in_byte, xrt::bo::flags::p2p, kernel.group_id(3));

Creating Buffers from the user pointer
**************************************

The ``xrt::bo()`` constructor can also be called using a pointer provided by the user. The user pointer must be aligned to a 4K boundary.
.. code:: c++
   :number-lines: 15

   // Host memory pointer aligned to 4K boundary
   int *host_ptr;
   posix_memalign((void **)&host_ptr, 4096, MAX_LENGTH * sizeof(int));

   // Sample example filling the allocated host memory
   for (int i = 0; i < MAX_LENGTH; i++)
       host_ptr[i] = i; // Sample data

   auto mybuf = xrt::bo(device, host_ptr, MAX_LENGTH * sizeof(int), kernel.group_id(0));

The below example shows data transfer to and from a graph through its GMIO ports using ``xrt::aie::bo`` buffer objects

.. code:: c++
   :number-lines: 15

   auto in_bo = xrt::aie::bo(device, SIZE * sizeof(float), xrt::bo::flags::normal, 0);
   auto inp_bo_map = in_bo.map<float*>();
   auto out_bo = xrt::aie::bo(device, SIZE * sizeof(float), xrt::bo::flags::normal, 0);
   auto out_bo_map = out_bo.map<float*>();

   // Prepare input data
   std::copy(my_float_array, my_float_array + SIZE, inp_bo_map);

   in_bo.sync("in_sink", XCL_BO_SYNC_BO_GMIO_TO_AIE, SIZE * sizeof(float), 0);
   out_bo.sync("out_sink", XCL_BO_SYNC_BO_AIE_TO_GMIO, SIZE * sizeof(float), 0);

The above code shows

- The input and output buffers (``in_bo`` and ``out_bo``) to the graph are created and mapped to the user space
- The member function ``xrt::aie::bo::sync`` is used for data transfer using the following arguments

  - The name of the GMIO port associated with the DMA transfer
  - The direction of the buffer transfer

    - GMIO to graph: ``XCL_BO_SYNC_BO_GMIO_TO_AIE``
    - Graph to GMIO: ``XCL_BO_SYNC_BO_AIE_TO_GMIO``

  - The size and the offset of the buffer

XRT Error API
-------------

In general, XRT APIs can encounter two types of errors:

- Synchronous errors: errors thrown by the API itself. The host code can catch these exceptions and take the necessary steps.
- Asynchronous errors: errors from the underlying driver, system, hardware, etc.

XRT provides an ``xrt::error`` class and its member functions to retrieve asynchronous errors in the userspace host code. This helps to debug when something goes wrong.

- Member function ``xrt::error::get_error_code()``: gets the last error code (and its timestamp) of a given error class
- Member function ``xrt::error::get_timestamp()``: gets the timestamp of the last error
- Member function ``xrt::error::to_string()``: gets the description string of a given error code

**NOTE**: The asynchronous error retrieval APIs are at an early stage of development and only support AIE-related asynchronous errors. Full support for all other asynchronous errors is planned for a future release.

Example code
.. code:: c++
   :number-lines: 41

   graph.run(runIteration);

   try {
       graph.wait(timeout);
   }
   catch (const std::system_error& ex) {
       if (ex.code().value() == ETIME) {
           xrt::error error(device, XRT_ERROR_CLASS_AIE);

           auto errCode = error.get_error_code();
           auto timestamp = error.get_timestamp();
           auto err_str = error.to_string();

           /* Code to deal with this specific error */
           std::cout << err_str << std::endl;
       }
       else {
           /* Something else */
       }
   }

The above code shows

- After a timeout occurs in ``xrt::graph::wait()``, the member functions of the ``xrt::error`` class are called to retrieve the asynchronous error code and timestamp
- The member function ``xrt::error::to_string()`` is called to obtain the error string