C API Programming Guide

Overview

The Xilinx Video SDK provides a C-based application programming interface (API) which facilitates the integration of Xilinx transcoding capabilities in proprietary frameworks. This API is provided in the form of plugins leveraging the Xilinx Media Accelerator (XMA) library and the Xilinx Resource Manager (XRM) library.

The XMA Library

The XMA library (libxmaapi) is meant to simplify the development of applications managing and controlling video accelerators such as decoders, scalers, filters, and encoders. The library consists of two interfaces, the lower-edge API and the upper-edge API:

  • The lower-edge API is an interface intended for plugin developers responsible for implementing hardware control of specific Xilinx acceleration kernels. These plugins are specialized user space drivers that are aware of the low-level interface of the hardware accelerators.

  • The upper-edge API is a higher-level, generalized interface intended for application developers responsible for integrating control of Xilinx accelerators into software frameworks such as FFmpeg, GStreamer, or proprietary frameworks.

The Xilinx Video SDK includes plugins optimized for the Xilinx video accelerators such as the ones found on Alveo U30 cards. A software developer integrating the hardware-accelerated features of Xilinx devices in a proprietary framework only needs to be familiar with the XMA upper-edge API and the properties of each plugin.

The XMA library is included as part of the Xilinx Runtime (XRT) library. General documentation on XMA can be found in the XRT documentation. The XMA Upper Edge API Library section of the XRT documentation provides a complete reference of the XMA upper-edge API.

The XRM Library

The XRM library is used to manage the hardware accelerators available in the system. XRM keeps track of total system capacity for each of the compute units such as the decoder, scaler, and encoder. The XRM library makes it possible to reserve, allocate, and release resources, and to calculate resource load and maximum capacity.

Details about XRM can be found in general XRM documentation.

The Xilinx Video SDK Plugins

The Xilinx Video SDK provides 4 different plugins, each corresponding to a specific hardware accelerated feature of the card:

  • The decoder plugin

  • The encoder plugin

  • The lookahead plugin

  • The scaler plugin

Any combination of plugins can be used when integrating with a proprietary framework.

Sample source code and applications using the Xilinx Video SDK plugins and the XMA APIs to do video encoding, decoding, scaling and transcoding can be found in the XMA Tutorials included in this repository.


General Application Development Guide

Integration layers for applications using the Xilinx Video SDK are organized around the following steps:

  1. Initialization

  2. Resource Reservation

  3. Session Creation

  4. Runtime Processing

  5. Cleanup

Initialization

Applications using the plugins must first create an XRM context using the xrmCreateContext() function in order to establish a connection with the XRM daemon. The application must then initialize the XMA library using the xma_initialize() function.

Examples of how to perform these two steps in order to initialize an application can be found in the common/src/xlnx_xrm_utils.c file which is shared by all XMA sample applications.

Further information about these APIs can be found in the online XRT and XRM documentation.

Resource Reservation

After the initialization step, the application must determine on which device to run and reserve the necessary hardware resources (CUs) on that device. This is done using the XRM APIs, as described in detail in the XRM API Reference Guide below.

Session Creation

Once the resources have been allocated, the application must create dedicated plugin sessions for each of the hardware accelerators that need to be used (decoder, scaler, encoder, lookahead).

To create a session, the application must first initialize all the required properties and parameters of the particular plugin. It must then call the corresponding session creation function. A complete reference for all the plugins is provided below.

Runtime Processing

The plugins provide functions to send data from the host and receive data from the device. The data is in the form of video frames (XmaFrame) or encoded video data (XmaDataBuffer), depending on the nature of the plugin. It is also possible to do zero-copy operations where frames are passed from one hardware accelerator to the next without being copied back to the host. The send and receive functions are specific to each plugin and the return code should be used to determine the next suitable action. A complete reference for all the plugins is provided below.

Cleanup

When the application finishes, it should destroy each plugin session using the corresponding destroy function. Doing so frees the resources on the Xilinx devices for other jobs and ensures that everything is released and cleaned up properly.

The application should also use the xrmDestroyContext() function to destroy the XRM session, stop the connection to the daemon and ensure all resources are properly released.


Compiling and Linking with the Xilinx Video SDK Plugins

The plugins can be dynamically linked to the application. The packages required to build applications are XRT, XRM and XVBM. These packages are provided as part of the Xilinx Video SDK.

To provide the necessary declarations in your application, include the following headers in your source code:

#include <xrm.h>
#include <xmaplugin.h>
#include <xma.h>
#include <xvbm.h>

To compile and link your application with the plugins, add the following lines to your Makefile:

CFLAGS  += $(shell pkg-config --cflags libxma2api libxma2plugin xvbm libxrm)
LDFLAGS += $(shell pkg-config --libs   libxma2api libxma2plugin xvbm libxrm)

These should add the following switches to your gcc commands:

-I/opt/xilinx/xrt/include/xma2 -I/opt/xilinx/xrt/include -I/opt/xilinx/xvbm/include -I/opt/xilinx/xrm/include

-L/opt/xilinx/xrt/lib -L/opt/xilinx/xvbm/lib -L/opt/xilinx/xrm/lib -lxma2api -lxma2plugin -lxvbm -lstdc++ -lxrm -lboost_system -lboost_filesystem -lboost_thread -lboost_serialization -luuid -ldl -lxrt_core

Common XMA Data Structures

struct XmaParameter

Type-Length-Value data structure used for passing custom arguments to a plugin. The declaration of XmaParameter can be found in the xmaparam.h file.

struct XmaFrameProperties

Data structure describing the frame dimensions for XmaFrame. The declaration of XmaFrameProperties can be found in the xmabuffers.h file.

struct XmaFrame

Data structure describing a raw video frame and its buffers. XmaFrame structures can be received from the decoder or sent to the encoder. They are also used as input and outputs for the scaler and the look-ahead. The declaration of XmaFrame can be found in the xmabuffers.h file.

The Xilinx Video SDK plugins support two types of frames:

  • XMA_HOST_BUFFER_TYPE frames are explicitly allocated and managed by the host application. They are always copied from the host to the device and back after an operation.

  • XMA_DEVICE_BUFFER_TYPE frames are automatically allocated by the plugins and are implemented using the XVBM library. In a multistage video pipeline, they allow for zero-copy operations where frames are passed from one hardware accelerator to the next without being copied back to the host. The frame data in the underlying XVBM buffers can be accessed by the host application using the XVBM APIs.

The decoder plugin only supports XMA_DEVICE_BUFFER_TYPE frames and the scaler, encoder and lookahead plugins support both types of frames.

NOTE: For optimal performance, Xilinx recommends that host buffers used for raw video frames be allocated on 4K boundaries. This can be done using posix_memalign() instead of malloc() when allocating the buffers. An example of this can be found in the XMA encoder application where raw frame buffers are allocated before being populated and sent to the encoder: encoder/lib/src/xlnx_enc_xma_props.c#L824.
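The alignment recommendation above can be sketched with plain POSIX calls. This is a minimal illustration, not the code used by the sample application: the helper names nv12_frame_size() and alloc_aligned_frame() are hypothetical, and the size calculation assumes a tightly packed 8-bit NV12 frame with even dimensions and no stride padding.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* exposes posix_memalign() */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define FRAME_ALIGNMENT 4096u     /* the 4K boundary recommended above */

/* Size in bytes of a tightly packed 8-bit NV12 frame (assumption: no
 * stride padding): a full-resolution luma plane followed by a
 * half-height plane of interleaved chroma samples. */
size_t nv12_frame_size(size_t width, size_t height)
{
    return width * height + width * (height / 2);
}

/* Allocate a host buffer aligned on a 4K boundary.
 * Returns NULL on failure; the caller releases the buffer with free(). */
void *alloc_aligned_frame(size_t size)
{
    void *buf = NULL;

    assert(size > 0);
    if (posix_memalign(&buf, FRAME_ALIGNMENT, size) != 0)
        return NULL;
    return buf;
}
```

A buffer obtained this way can be used wherever a malloc() buffer would be used for an XMA_HOST_BUFFER_TYPE frame, and is released with free() in the same way.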

struct XmaDataBuffer

Data structure describing a buffer containing encoded video data. XmaDataBuffer structures can be sent to the decoder or received from the encoder. The declaration of XmaDataBuffer can be found in the xmabuffers.h file.


Decoder Plugin Reference

Decoder Interface

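The external interface to the decoder plugin consists of the following XMA upper-edge functions:

  • xma_dec_session_create()

  • xma_dec_session_send_data()

  • xma_dec_session_get_properties()

  • xma_dec_session_recv_frame()

  • xma_dec_session_destroy()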

The declaration of these functions can be found in the xmadecoder.h file. General reference information about these functions can be found in the Decoder section of the XMA upper-edge API Library documentation. Information specific to use with the Xilinx video codec units is provided below.


XmaDecoderSession *xma_dec_session_create(XmaDecoderProperties *dec_props)

This function creates a decoder session and must be called prior to decoding data. The hardware resources required to run the session must be previously reserved using the XRM APIs and should not be released until after the session is destroyed. The number of sessions allowed depends on several factors that include: resolution, frame rate, bit depth, and the capabilities of the hardware accelerator.


int32_t xma_dec_session_send_data(XmaDecoderSession *session, XmaDataBuffer *data, int32_t *data_used)

This function sends input frame data to the hardware decoder by way of the plugin. The application needs to parse the input encoded stream and send one frame of data at a time in a XmaDataBuffer data structure.

The data_used value indicates the amount of input data consumed by the decoder.

If the function returns XMA_SUCCESS, then the decoder was able to consume the entirety of the available data and data_used will be set accordingly. In this case, the application can proceed with fetching decoded data using the xma_dec_session_recv_frame() API.

If the function returns XMA_TRY_AGAIN, then the decoder did not consume any of the input data and data_used will be reported as 0. In this case, the application can proceed with fetching previously decoded data with the xma_dec_session_recv_frame() function but must send the same input again using xma_dec_session_send_data() until the function returns XMA_SUCCESS.

Once the application has sent all the input frames to the decoder, it must notify the decoder by sending a null frame buffer with is_eos set to 1 in the XmaDataBuffer structure. The application should then continue sending null frame buffers with is_eos set to 0 in order to flush out all the output YUV frames.


int32_t xma_dec_session_get_properties(XmaDecoderSession *dec_session, XmaFrameProperties *fprops)

This function returns the decoder properties such as width, height, output format, and bits per pixel.


int32_t xma_dec_session_recv_frame(XmaDecoderSession *session, XmaFrame *frame)

This function tries to fetch a decoded YUV frame from the hardware accelerator.

If the function returns XMA_SUCCESS, a valid YUV frame pointer is available in the buffer pointer of the XmaFrame argument. The decoder plugin only supports XmaFrame structures of the XMA_DEVICE_BUFFER_TYPE type which are implemented using XVBM buffers. The function does not copy the output frame to the host buffer, but simply provides a pointer to the output frame containing an XVBM buffer. The application must use the XVBM APIs to read, forward or release the buffer as explained in the XVBM library section.

If the function returns XMA_TRY_AGAIN, then the decoder still needs some input data to produce a complete YUV output frame.

If the function returns XMA_EOS, then the decoder has flushed out all the frames.

For an example of how to read and release a YUV output frame using the XVBM xvbm_buffer_get_host_ptr(), xvbm_buffer_read() and xvbm_buffer_pool_entry_free() APIs, refer to the decoder/app/src/xlnx_decoder_app.c file of the sample XMA decoder app.

For an example of how to receive a YUV output frame and forward it to the scaler and to the encoder plugins using the XVBM xvbm_buffer_refcnt_inc() API, refer to the transcoder/lib/src/xlnx_transcoder.c file of the sample XMA transcoder application.
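The send, receive and flush rules described in this section can be sketched as a small control loop. The code below does not call the XMA library: the dec_status codes, the toy_decoder type and the toy_send() and toy_recv() functions are hypothetical stand-ins for the XMA return codes, xma_dec_session_send_data() and xma_dec_session_recv_frame(), so that the loop can be compiled and followed in isolation. The toy decoder holds at most two frames and releases one only when it is full or has seen end of stream, which is an assumption chosen to imitate decoding latency.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the XMA return codes; real values come from the XMA headers. */
typedef enum { DEC_SUCCESS, DEC_TRY_AGAIN, DEC_EOS, DEC_ERROR } dec_status;

/* Toy decoder state (hypothetical, replaces a real decoder session). */
typedef struct {
    int pending;   /* frames accepted but not yet fetched */
    int eos_seen;  /* set once a buffer with is_eos = 1 was sent */
} toy_decoder;

/* Stand-in for xma_dec_session_send_data(). */
dec_status toy_send(toy_decoder *d, int has_data, int is_eos, int *data_used)
{
    assert(d != NULL && data_used != NULL);
    *data_used = 0;
    if (is_eos)
        d->eos_seen = 1;
    if (!has_data)
        return DEC_SUCCESS;      /* null buffer: nothing to consume */
    if (d->pending >= 2)
        return DEC_TRY_AGAIN;    /* full: the same input must be resent */
    d->pending++;
    *data_used = 1;
    return DEC_SUCCESS;
}

/* Stand-in for xma_dec_session_recv_frame(). */
dec_status toy_recv(toy_decoder *d)
{
    assert(d != NULL);
    if (d->pending >= 2 || (d->eos_seen && d->pending > 0)) {
        d->pending--;
        return DEC_SUCCESS;
    }
    return d->eos_seen ? DEC_EOS : DEC_TRY_AGAIN;
}

/* Send n_frames inputs, then flush. Returns the number of frames
 * received, or -1 on error. */
int decode_all(toy_decoder *d, int n_frames)
{
    int received = 0, used = 0;

    for (int i = 0; i < n_frames; i++) {
        dec_status rc;
        do {
            /* Resend the same input until it is consumed, fetching any
             * decoded output after every attempt. */
            rc = toy_send(d, 1, 0, &used);
            if (rc == DEC_ERROR)
                return -1;
            if (toy_recv(d) == DEC_SUCCESS)
                received++;
        } while (rc == DEC_TRY_AGAIN);
    }

    toy_send(d, 0, 1, &used);        /* null buffer with is_eos = 1 */
    for (;;) {
        dec_status rc = toy_recv(d);
        if (rc == DEC_EOS)
            break;                   /* all frames have been flushed */
        if (rc == DEC_SUCCESS)
            received++;
        toy_send(d, 0, 0, &used);    /* keep flushing, is_eos = 0 */
    }
    return received;
}
```

In a real integration, each successful receive would be followed by reading, forwarding or releasing the XVBM buffer, as shown in the sample applications referenced above.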


int32_t xma_dec_session_destroy(XmaDecoderSession *session)

This function destroys a decoder session that was previously created with the xma_dec_session_create() function.


Decoder Properties

The Xilinx video decoder is configured using a combination of standard XMA decoder properties and custom decoder parameters, both of which are specified using a XmaDecoderProperties data structure.

To facilitate application development, Xilinx recommends working with a simplified data structure from which the required XmaDecoderProperties can be populated using a specialized function. A reusable example of this can be found in the transcoder/lib/include/xlnx_transcoder_xma_props.h and transcoder/lib/src/xlnx_transcoder_xma_props.c files of the XMA transcoder example application.


struct XmaDecoderProperties

This data structure is used to configure the Xilinx video decoder. The declaration of XmaDecoderProperties can be found in the xmadecoder.h file.


Standard XMA Decoder Properties

When using the decoder plugin, the following members of the XmaDecoderProperties data structure must be set by the application:

hwdecoder_type

Must be set to XMA_MULTI_DECODER_TYPE

hwvendor_string[MAX_VENDOR_NAME]

Vendor string used to identify specific decoder requested. Must be set to “MPSoC”

params

Array of custom initialization parameters. See the next section for the list of custom parameters supported by the decoder plugin.

param_cnt

Count of custom parameters.

width

Width in pixels of incoming video stream/data. Valid values are even integers between 128 and 3840 for H264 and HEVC. Portrait mode is supported.

height

Height in pixels of incoming video stream/data. Valid values are even integers between 128 and 2160 for H264 and HEVC.

bits_per_pixel

Bits per pixel for primary plane of output video. Must be set to 8 bits per pixel.

framerate

Framerate data structure specifying frame rate per second. Valid values can range from 1 to integer max.

plugin_lib

The plugin library name to which the application wants to communicate. The value of this property is obtained as part of XRM resource allocation.

dev_index

The device index number on which the decoder resource has been allocated. The value of this property is obtained as part of XRM resource allocation.

cu_index

The decoder compute unit (CU) index that has been allocated. The value of this property is obtained as part of XRM resource allocation.

channel_id

The channel number of the decoder that has been allocated. The value of this property is obtained as part of XRM resource allocation.

ddr_bank_index

Required property. Must be set to -1 to let the hardware determine which DDR bank should be used for this channel.

Other members of XmaDecoderProperties are not applicable to the decoder plugin and should not be used.
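Checking the stream dimensions before creating a session can catch unsupported inputs early. The following is a minimal sketch of such a check, using only the ranges stated in the width, height and bits_per_pixel descriptions above. The function names are hypothetical and the check is a literal reading of those ranges, not a replacement for the validation done by the plugin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Limits quoted from the property descriptions above. */
enum { DEC_MIN_DIM = 128, DEC_MAX_WIDTH = 3840, DEC_MAX_HEIGHT = 2160 };

/* True when v is an even integer in the inclusive range [lo, hi]. */
bool in_even_range(int32_t v, int32_t lo, int32_t hi)
{
    assert(lo <= hi);
    return v >= lo && v <= hi && (v % 2) == 0;
}

/* True when the stream parameters fall within the documented limits. */
bool decoder_input_supported(int32_t width, int32_t height,
                             int32_t bits_per_pixel)
{
    return in_even_range(width, DEC_MIN_DIM, DEC_MAX_WIDTH)
        && in_even_range(height, DEC_MIN_DIM, DEC_MAX_HEIGHT)
        && bits_per_pixel == 8;
}
```

For instance, a 1920x1080 8-bit stream passes this check, while an odd width or a 10-bit stream does not.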

Custom Decoder Parameters

In addition to the standard properties, the following XmaParameter custom parameters are supported by the decoder plugin:

“bitdepth”

Bits per pixel for primary plane of output video. Valid value is 8. Should be set to the same value as the bits_per_pixel property. 10-bit support will be added in the future.

“codec_type”

Codec type. For H264, set “codec_type” to 0. For HEVC, set “codec_type” to 1.

“low_latency”

Setting this flag to 1 reduces decoding latency when splitbuff_mode is also enabled. IMPORTANT: This option should not be used with streams containing B frames. Valid values are 0 (disabled, default) and 1 (enabled)

“splitbuff_mode”

The split buffer mode hands off buffers to the next pipeline stage earlier. Setting both splitbuff_mode and low_latency to 1 reduces decoding latency. IMPORTANT: Enable this mode only if you can always send a complete Access Unit in one shot to the decoder. Valid values are 0 (disabled, default) and 1 (enabled).

“entropy_buffers_count”

Number of internal buffers to be used. Valid values are 2 to 10 and default is 2 (recommended).

“zero_copy”

When enabled, the decoder plugin returns a reference to the output frame buffer instead of copying the data back to host memory. This is useful in transcoding use cases where the decoder output is consumed by an encoder or scaler running on the same hardware. The decoder currently supports only zero copy, so this parameter must always be set to 1.

“profile”

Profile of the input stream. Supported profiles are Baseline, Main and High for H264, and Main for HEVC.

“level”

Level of the input stream. Supported levels range from 1.0 to 5.1.

“chroma_mode”

Chroma mode with which the input has been encoded. Supported mode is 420.

“scan_type”

Scan type denotes field order. The decoder currently supports only progressive content, so this parameter should be set to 1.

“latency_logging”

Set to 1 to enable logging of latency information to syslog.
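The idea of passing these settings as an array of named, length-tagged values can be sketched as follows. The sketch_param type is a hypothetical stand-in, not the actual XmaParameter declaration, which is found in xmaparam.h and has its own field layout. Storing every value as a uint32_t is also an assumption made for illustration. Only the parameter names and the values come from the list above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for XmaParameter (hypothetical layout): a name plus a
 * length-tagged value. See xmaparam.h for the real declaration. */
typedef struct {
    const char *name;
    size_t      length;
    const void *value;
} sketch_param;

/* Values have static storage so they outlive the parameter array. */
static const uint32_t bitdepth              = 8; /* only 8-bit is supported */
static const uint32_t codec_type            = 1; /* 0 = H264, 1 = HEVC */
static const uint32_t low_latency           = 0; /* disabled (default) */
static const uint32_t splitbuff_mode        = 0; /* disabled (default) */
static const uint32_t entropy_buffers_count = 2; /* recommended default */
static const uint32_t zero_copy             = 1; /* must always be 1 */
static const uint32_t scan_type             = 1; /* progressive */

/* Builds one entry whose name is the variable name itself. */
#define PARAM(v) { #v, sizeof(v), &v }

const sketch_param decoder_params[] = {
    PARAM(bitdepth),
    PARAM(codec_type),
    PARAM(low_latency),
    PARAM(splitbuff_mode),
    PARAM(entropy_buffers_count),
    PARAM(zero_copy),
    PARAM(scan_type),
};
const size_t decoder_param_cnt =
    sizeof(decoder_params) / sizeof(decoder_params[0]);

/* Look up a parameter by name; returns NULL when it is absent. */
const sketch_param *find_param(const sketch_param *params, size_t count,
                               const char *name)
{
    assert(params != NULL && name != NULL);
    for (size_t i = 0; i < count; i++)
        if (strcmp(params[i].name, name) == 0)
            return &params[i];
    return NULL;
}
```

In a real application, the array and its length would be assigned to the params and param_cnt members of XmaDecoderProperties, using the actual XmaParameter type.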


Scaler Plugin Reference

Scaler Interface

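The external interface to the scaler plugin consists of the following XMA application-level functions:

  • xma_scaler_session_create()

  • xma_scaler_session_send_frame()

  • xma_scaler_session_recv_frame_list()

  • xma_scaler_session_destroy()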

The declaration of these functions can be found in the xmascaler.h file. General reference information about these functions can be found in the Scaler section of the XMA upper-edge API library documentation. Information specific to use with the Xilinx video codec units is provided below.


XmaScalerSession *xma_scaler_session_create(XmaScalerProperties *props)

This function creates a scaler session and must be called prior to sending input frames. The hardware resources required to run the session must be previously reserved using the XRM APIs and should not be released until after the session is destroyed. The number of sessions allowed depends on several factors that include: resolution, frame rate, bit depth, and the capabilities of the hardware accelerator.


int32_t xma_scaler_session_send_frame(XmaScalerSession *session, XmaFrame *frame)

This function sends a YUV frame to the underlying XMA plugin, and eventually to the hardware, to scale the input frame to one or multiple resolutions. The application has to read one YUV frame in semi-planar format at a time and update the buffer details in the XmaFrame argument.

The application can take further action depending upon the return value from this API.

If the function returns XMA_SUCCESS, then the application can proceed to fetch scaled output frames.

If the function returns XMA_SEND_MORE_DATA, then the application should proceed with sending the next YUV frame.

If the function returns XMA_FLUSH_AGAIN it means that the application should keep flushing the scaler.

If the function returns XMA_EOS then the scaler has been flushed and the pipeline can be exited.

Once the application has sent all the input frames to the scaler, it must notify the scaler by sending a null frame buffer in the XmaFrame structure and flush the scaler. To flush the scaler, send a null frame buffer and call the xma_scaler_session_recv_frame_list() function for as long as xma_scaler_session_send_frame() returns XMA_SUCCESS or XMA_FLUSH_AGAIN.
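The return-code rules above can be summarized as a small decision function. This is only a sketch of the rules as stated in this section: the scl_status and scl_action enumerations and the scaler_next_action() function are hypothetical names, and the real return codes are defined in the XMA headers. Treating XMA_SEND_MORE_DATA as unexpected once flushing has started is an assumption, since the text does not describe that case.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the XMA return codes named above. */
typedef enum {
    SCL_SUCCESS, SCL_SEND_MORE_DATA, SCL_FLUSH_AGAIN, SCL_EOS, SCL_ERROR
} scl_status;

/* What the application does after xma_scaler_session_send_frame(). */
typedef enum {
    ACT_RECV_OUTPUTS,    /* call xma_scaler_session_recv_frame_list() */
    ACT_SEND_NEXT_FRAME, /* feed the next input frame */
    ACT_RECV_AND_FLUSH,  /* fetch outputs, then send another null frame */
    ACT_DONE,            /* scaler flushed, pipeline can exit */
    ACT_ABORT            /* error or unexpected return code */
} scl_action;

/* Map a send_frame return code to the next step. 'flushing' is true
 * once the application has started sending null frame buffers. */
scl_action scaler_next_action(scl_status rc, bool flushing)
{
    switch (rc) {
    case SCL_SUCCESS:
        return flushing ? ACT_RECV_AND_FLUSH : ACT_RECV_OUTPUTS;
    case SCL_FLUSH_AGAIN:
        return ACT_RECV_AND_FLUSH;
    case SCL_SEND_MORE_DATA:
        /* Assumption: not expected after flushing has started. */
        return flushing ? ACT_ABORT : ACT_SEND_NEXT_FRAME;
    case SCL_EOS:
        return ACT_DONE;
    default:
        return ACT_ABORT;
    }
}
```

Keeping this mapping in one place lets the main processing loop switch on the action instead of repeating the return-code checks after every call.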


int32_t xma_scaler_session_recv_frame_list(XmaScalerSession *session, XmaFrame **frame_list)

This function is called after calling the xma_scaler_session_send_frame(). This function returns a list of output frames with every call until it reaches end of scaling. Return codes can only be XMA_SUCCESS and XMA_ERROR.

The scaler plugin supports both XMA_HOST_BUFFER_TYPE and XMA_DEVICE_BUFFER_TYPE output buffers. The application indicates the buffer type through the XmaFrameProperties of the XmaFrame specified in the frame list.

When using XMA_HOST_BUFFER_TYPE buffers, the application is responsible for allocating the host memory for each frame. An example of how to do this can be found in the scaler/lib/src/xlnx_scal_utils.c#L55 file of the sample XMA scaler app.

When using XMA_DEVICE_BUFFER_TYPE buffers, the scaler plugin takes care of allocating XVBM buffers. The application can then access the buffer, release it to the plugin or transfer it to another plugin using the XVBM APIs, as explained in the XVBM library section. An example of a scaler session using XMA_DEVICE_BUFFER_TYPE buffers can be found in the transcoder/lib/src/xlnx_scaler.c#L203 file of the sample XMA transcoder app.


int32_t xma_scaler_session_destroy(XmaScalerSession *session)

This function destroys a scaler session that was previously created with the xma_scaler_session_create() function.


Scaler Properties

The Xilinx scaler is configured using a combination of standard XMA scaler properties, standard XMA scaler input and output properties and custom scaler parameters, all of which are specified using XmaScalerFilterProperties and XmaScalerInOutProperties data structures.

To facilitate application development, Xilinx recommends working with a simplified data structure from which the required XmaScalerFilterProperties and XmaScalerInOutProperties can be populated using a specialized function. A reusable example of this can be found in the transcoder/lib/include/xlnx_transcoder_xma_props.h and transcoder/lib/src/xlnx_transcoder_xma_props.c files of the XMA transcoder example application.


struct XmaScalerFilterProperties

This data structure is used to configure the Xilinx scaler. The declaration of XmaScalerFilterProperties can be found in the xmascaler.h file.

struct XmaScalerInOutProperties

This data structure is used to configure the input and outputs of the video scaler. The XmaScalerFilterProperties data structure contains one XmaScalerInOutProperties for the scaler input and an array of 8 XmaScalerInOutProperties for the scaler outputs. The declaration of XmaScalerInOutProperties can be found in the xmascaler.h file.


Standard XMA Scaler Properties

When using the scaler plugin, the following members of the XmaScalerFilterProperties data structure must be set by the application:

hwscaler_type

Vendor value used to identify the scaler type. Must be set to XMA_POLYPHASE_SCALER_TYPE.

hwvendor_string[MAX_VENDOR_NAME]

Vendor string used to identify specific scaler requested. Must be set to “Xilinx”

num_outputs

Number of scaler outputs.

params

Array of custom initialization parameters. See the next section for the list of custom parameters supported by the scaler plugin.

param_cnt

Count of custom parameters.

plugin_lib

The plugin library name to which the application wants to communicate. The value of this property is obtained as part of XRM resource allocation.

dev_index

The device index number on which the scaler resource has been allocated. The value of this property is obtained as part of XRM resource allocation.

cu_index

The scaler compute unit (CU) index that has been allocated. The value of this property is obtained as part of XRM resource allocation.

channel_id

The channel number of the scaler that has been allocated. The value of this property is obtained as part of XRM resource allocation.

ddr_bank_index

Must be set to -1 to let the hardware determine which DDR bank should be used for this channel.

Other members of XmaScalerFilterProperties are not applicable to the scaler plugin and should not be used.

XMA Scaler Input and Output Properties

When configuring the scaler input and outputs, the following members of the XmaScalerInOutProperties data structure must be set by the application:

format

Input video format. Must be set to XMA_VCU_NV12_FMT_TYPE

width

Width in pixels of incoming video stream/data. Valid values are integers between 128 and 3840, in multiples of 4. Portrait mode is supported.

height

Height in pixels of incoming video stream/data. Valid values are integers between 128 and 2160, in multiples of 4.

stride

Must be set to the input width aligned to 256, that is, rounded up to the next multiple of 256.

framerate

Framerate data structure specifying frame rate per second of the input stream. To specify a lower output frame rate, refer to the Mix-Rate Support section below. This value is also used by the plugin to calculate the scaler load which determines how many hardware resources to allocate. Leaving the framerate undefined could lead to undefined behavior.

Other members of XmaScalerInOutProperties are not applicable to the scaler plugin and should not be used.
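The stride and dimension rules above come down to simple integer arithmetic, sketched below. The helper names align_up() and scaler_dim_ok() are hypothetical. The constants are taken from the width, height and stride descriptions in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCALER_STRIDE_ALIGN 256   /* stride alignment stated above */

/* Round value up to the next multiple of alignment.
 * alignment must be a power of two. */
int32_t align_up(int32_t value, int32_t alignment)
{
    assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* True when v is a multiple of 4 in the inclusive range [128, max],
 * with max being 3840 for widths and 2160 for heights. */
bool scaler_dim_ok(int32_t v, int32_t max)
{
    return v >= 128 && v <= max && (v % 4) == 0;
}
```

With these helpers, a 1920-pixel-wide input gets a stride of align_up(1920, SCALER_STRIDE_ALIGN), which is 2048, while a 1280-pixel-wide input keeps a stride of 1280 because it is already a multiple of 256.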

Custom Scaler Parameters

In addition to the standard properties, the following XmaParameter custom parameters are supported by the scaler plugin:

“enable_pipeline”

Enable/Disable pipeline in scaler. Enabling pipeline increases the scaler speed.

“logLevel”

Enables XMA logging in scaler module. Supported values are 0 to 3.

“MixRate”

This parameter is used to configure mix-rate sessions where some scaler outputs are configured at the input frame rate and some other outputs will be configured at half the rate. For single-rate scaling, this parameter must be set to null. For mix-rate scaling, the application will need to create two different scaler sessions. The MixRate parameter of the first session must be set to null, and the MixRate parameter of the second session must be set to the address of the first session. See section below for more details on how to set-up mix-rate support in the scaler.

“latency_logging”

Set to 1 to enable logging of latency information to syslog. Set to 0 to disable logging.

Mix-Rate Support with the Scaler Plugin

The application can configure the scaler to work at mixed rate, where some output channels will be produced at the full input frame rate and some output channels will be produced at half the input frame rate.

Mix-rate is achieved by creating two different scaler sessions: one for the full rate outputs and one for the full and half rate (all rate) outputs.

Enabling mixed rate outputs requires that the following conditions be met:

  1. The first output channel must be full rate

  2. The full rate channels should be specified first, followed by the half rate channels. That is, no full rate channel may be specified after a half rate channel during session creation. This simplifies output buffer handling.

Steps to implement full rate and half rate in application:

  1. Create two scaler sessions, one for full rate channels and the other for full rate and half rate (all rate) channels.

  2. Set the full rate session fps to half, since the full rate outputs will be received from both the sessions.

  3. When creating the second session, use the address of the first session as value of the “MixRate” custom parameter. Based on this, the scaler plugin allocates more output buffers.

  4. Call the scaler send and receive functions with the full rate and all rate sessions alternately.
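The two ordering conditions listed earlier in this section can be checked before any session is created. The sketch below is a minimal illustration in plain C and does not use the XMA API. The function names and the representation of the layout as an array of flags, one per output channel, are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when the output layout satisfies both conditions:
 * output 0 is full rate, and no full rate output follows a half
 * rate output. is_half_rate[i] is true for half rate channels. */
bool mixrate_layout_ok(const bool *is_half_rate, size_t num_outputs)
{
    assert(is_half_rate != NULL);
    if (num_outputs == 0 || is_half_rate[0])
        return false;

    bool seen_half = false;
    for (size_t i = 1; i < num_outputs; i++) {
        if (is_half_rate[i])
            seen_half = true;
        else if (seen_half)
            return false;   /* full rate channel after a half rate one */
    }
    return true;
}

/* Number of leading full rate outputs, which are the outputs handled
 * by the full rate session. */
size_t full_rate_count(const bool *is_half_rate, size_t num_outputs)
{
    size_t n = 0;

    assert(is_half_rate != NULL);
    while (n < num_outputs && !is_half_rate[n])
        n++;
    return n;
}
```

For a layout of two full rate outputs followed by two half rate outputs, mixrate_layout_ok() returns true and full_rate_count() returns 2.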

For an example of how to implement mix-rate scaling, refer to the transcoder/lib/src/xlnx_scaler.c#L244 file in the sample XMA transcoder application.


Encoder Plugin Reference

Encoder Interface

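The external interface to the encoder plugin consists of the following XMA application-level functions:

  • xma_enc_session_create()

  • xma_enc_session_send_frame()

  • xma_enc_session_recv_data()

  • xma_enc_session_destroy()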

The declaration of these functions can be found in the xmaencoder.h file. The API reference for these functions can be found in the Encoder section of the XMA upper-edge API Library documentation. Information specific to use with the Xilinx video codec units is provided below.


XmaEncoderSession *xma_enc_session_create(XmaEncoderProperties *enc_props)

This function creates an encoder session and must be called prior to encoding input YUV. The hardware resources required to run the session must be previously reserved using the XRM APIs and should not be released until after the session is destroyed. The number of sessions allowed depends on several factors that include: resolution, frame rate, bit depth, and the capabilities of the hardware accelerator.


int32_t xma_enc_session_send_frame(XmaEncoderSession *session, XmaFrame *frame)

This function sends a YUV frame to the hardware encoder by way of the plugin.

Each time the application calls this function, it must provide a valid pointer to a XmaFrame structure containing a YUV frame in semi-planar format (XmaBufferRef) and information about this frame (XmaFrameProperties).

If the function returns XMA_SUCCESS, then the application can proceed to fetch the encoded data using the xma_enc_session_recv_data() API.

If the function returns XMA_SEND_MORE_DATA, then the application must send the next YUV frame before calling xma_enc_session_recv_data().

Once the application has sent all the input frames to the encoder, it should notify the hardware by sending a null frame buffer with is_last_frame set to 1 in the XmaFrame structure. If the API returns XMA_FLUSH_AGAIN after a null frame is sent, then the application can call xma_enc_session_recv_data() but must send a null frame again. Once the null frame is sent successfully, the application does not need to send frames anymore and can simply call xma_enc_session_recv_data() to flush out all the remaining output frames.


int32_t xma_enc_session_recv_data(XmaEncoderSession *session, XmaDataBuffer *data, int32_t *data_size)

This function is called after calling the function xma_enc_session_send_frame(). The application is the owner of the XmaDataBuffer. It is responsible for allocating it and for releasing it when done.

If the function returns XMA_SUCCESS and if data_size is greater than 0, then a valid output frame is available. The returned data (XmaDataBuffer.data) is valid until the next call to the xma_enc_session_send_frame(), so the application must use or copy it before calling xma_enc_session_send_frame() again. The XMA encoder plugin is responsible for setting the fields of the XmaDataBuffer struct. That is, XmaDataBuffer.data is set by the XMA plugin and does not transfer the ownership of this buffer to the application. The application must not attempt to free XmaDataBuffer.data. The encoder plugin will recycle the data buffers in the next call to the xma_enc_session_send_frame() function.

If the function returns XMA_TRY_AGAIN, a data buffer is not ready to be returned and the length of the data buffer is set to 0.

If the function returns XMA_EOS, the encoder has flushed all the output frames.
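Because the returned data is only valid until the next call to xma_enc_session_send_frame(), an application that does not write it out immediately needs to copy it into memory it owns. The following sketch shows one way to do this with a growable byte buffer. The byte_stream type and the stream_append() function are hypothetical helpers written in plain C; they are not part of the XMA API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Application-owned, growable buffer of encoded bytes. */
typedef struct {
    uint8_t *bytes;
    size_t   size;      /* bytes currently stored */
    size_t   capacity;  /* bytes allocated */
} byte_stream;

/* Append one encoded packet to the stream, growing it as needed.
 * Returns 0 on success, -1 if memory could not be allocated. */
int stream_append(byte_stream *s, const void *data, size_t data_size)
{
    assert(s != NULL);
    if (data_size == 0)
        return 0;                       /* nothing to copy */
    assert(data != NULL);

    if (s->size + data_size > s->capacity) {
        size_t cap = s->capacity ? s->capacity : 4096;
        while (cap < s->size + data_size)
            cap *= 2;                   /* grow geometrically */
        uint8_t *grown = realloc(s->bytes, cap);
        if (grown == NULL)
            return -1;                  /* original buffer is untouched */
        s->bytes = grown;
        s->capacity = cap;
    }
    memcpy(s->bytes + s->size, data, data_size);
    s->size += data_size;
    return 0;
}
```

The application would call stream_append() with the returned data pointer and data_size after each successful receive, before sending the next frame, and release the stream with free() when done.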

NOTE: In version 2.0 of the Xilinx Video SDK, this function has been updated and made thread-safe. In earlier versions, the XmaDataBuffer was allocated by the plugin and the xma_enc_session_send_frame() and xma_enc_session_recv_data() functions had to be called in a serial manner by the application layer. Starting with version 2.0 of the Xilinx Video SDK, the application is responsible for allocating the XmaDataBuffer and the xma_enc_session_send_frame() and xma_enc_session_recv_data() functions can be called from different threads.


int32_t xma_enc_session_destroy(XmaEncoderSession *session)

This function destroys an encoder session that was previously created with the xma_enc_session_create() function.


Encoder Properties

The Xilinx video encoder is configured using a combination of standard XMA encoder properties and custom encoder parameters, both of which are specified using a XmaEncoderProperties data structure.

To facilitate application development, Xilinx recommends working with a simplified data structure from which the required XmaEncoderProperties can be populated using a specialized function. A reusable example of this can be found in the transcoder/lib/include/xlnx_transcoder_xma_props.h and transcoder/lib/src/xlnx_transcoder_xma_props.c files of the XMA transcoder example application.


struct XmaEncoderProperties

This data structure is used to configure the Xilinx video encoder. The declaration of XmaEncoderProperties can be found in the xmaencoder.h file.


Standard XMA Encoder Properties

When using the encoder plugin, the following members of the XmaEncoderProperties data structure must be set by the application:

hwencoder_type

Vendor value used to identify the encoder type. Must be set to XMA_MULTI_ENCODER_TYPE

hwvendor_string[MAX_VENDOR_NAME]

Vendor string used to identify hardware type. Must be set to “MPSoC”

format

Input video format. Must be set to XMA_VCU_NV12_FMT_TYPE

bits_per_pixel

Bits per pixel for primary plane of input video. Must be set to 8 bits per pixel.

width

Width in pixels of incoming video stream/data. Valid values are even integers between 128 and 3840. Portrait mode is supported.

height

Height in pixels of incoming video stream/data. Valid values are even integers between 128 and 2160.

framerate

Framerate data structure specifying frame rate per second

lookahead_depth

The depth of the lookahead module, which determines when it starts providing lookahead data. Supported values are 0 to 20.

rc_mode

Rate control mode for custom rate control. Supported values are 0 (custom rate control disabled) and 1 (enabled).

params

Array of custom initialization parameters. See the next section for the list of custom parameters supported by the encoder plugin.

param_cnt

Count of custom parameters.

plugin_lib

The plugin library name to which the application wants to communicate. The value of this property is obtained as part of XRM resource allocation.

dev_index

The device index number on which the encoder resource has been allocated. The value of this property is obtained as part of XRM resource allocation.

cu_index

The index of the encoder compute unit (CU) that has been allocated. The value of this property is obtained as part of XRM resource allocation.

channel_id

The channel number of the encoder that has been allocated. The value of this property is obtained as part of XRM resource allocation.

ddr_bank_index

Must be set to -1 to let the hardware determine which DDR bank should be used for this channel.

Other members of XmaEncoderProperties are not applicable to the encoder plugin and should not be used.
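The numeric limits listed above can be checked before the properties are handed to the encoder plugin. The following is a minimal sketch of such a check, not part of the Xilinx Video SDK: the EncSettings structure and the enc_settings_are_valid() function are hypothetical names introduced for illustration, and the check encodes only the ranges quoted in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified settings that an application would later copy
 * into XmaEncoderProperties. This is not an SDK type. */
typedef struct {
    int32_t width;            /* even, 128 to 3840 */
    int32_t height;           /* even, 128 to 2160 */
    int32_t bits_per_pixel;   /* must be 8 */
    int32_t lookahead_depth;  /* 0 to 20 */
    int32_t rc_mode;          /* 0 or 1 */
} EncSettings;

/* True when v is an even integer within [lo, hi]. */
static bool even_in_range(int32_t v, int32_t lo, int32_t hi)
{
    return v >= lo && v <= hi && (v % 2) == 0;
}

/* Returns true when every field respects the limits listed above. */
static bool enc_settings_are_valid(const EncSettings *s)
{
    assert(s != NULL);
    return even_in_range(s->width, 128, 3840) &&
           even_in_range(s->height, 128, 2160) &&
           s->bits_per_pixel == 8 &&
           s->lookahead_depth >= 0 && s->lookahead_depth <= 20 &&
           (s->rc_mode == 0 || s->rc_mode == 1);
}
```

Running a check of this kind before session creation lets the application report a configuration problem in its own terms instead of relying on the plugin to reject the properties.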

Custom Encoder Parameters

In addition to the standard properties, the following XmaParameter custom parameters are supported by the encoder plugin:

“enc_options”

For the encoder, most of the parameters are specified using a stringified INI file which is then passed to the “enc_options” XmaParameter. Refer to the xlnx_enc_get_xma_props() function in the transcoder/lib/src/xlnx_transcoder_xma_props.c file for the parameters which are sent as a string.

“latency_logging”

When enabled, it logs latency information to syslog.

“enable_hw_in_buf”

This parameter indicates whether the input buffer must be copied from the host or is already present on the device. Set it to 1 if the YUV frame is already in device memory.

“disable_pipeline”

This parameter is required in order to enable Ultra Low Latency (ULL) mode. Note that for AVC encoding avc_lowlat must be added to enc_options, above.

“stride_align”

This parameter is considered experimental. It specifies the stride alignment of encoder buffers, measured in bytes. By default, buffers are 32-byte aligned. For optimal DMA performance, the buffer stride alignment should match the input line size alignment. The type of this custom parameter is XMA_UINT32 and the value must be a multiple of 32. The lookahead expects 32-byte aligned buffers, therefore this parameter should not be used if the lookahead is active.


Look-Ahead Plugin Reference

Look-Ahead Interface

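The Look-Ahead plugin is based on the Filter XMA plugin type. The external interface to the lookahead plugin consists of the following XMA application-level functions:

  • xma_filter_session_create()

  • xma_filter_session_send_frame()

  • xma_filter_session_recv_frame()

  • xma_filter_session_destroy()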

The declaration of these functions can be found in the xmafilter.h file. General reference information about these functions can be found in the Filter section of the XMA upper-edge API library documentation. Information specific to use with the Xilinx video codec units is provided below.


XmaFilterSession *xma_filter_session_create(XmaFilterProperties *props)

This function creates a filter session and must be called before sending YUV frames to the lookahead filter. The hardware resources required to run the session must be previously reserved using the XRM APIs and should not be released until after the session is destroyed. The number of sessions allowed depends on several factors, including resolution, frame rate, bit depth, and the capabilities of the hardware accelerator.


int32_t xma_filter_session_send_frame(XmaFilterSession *session, XmaFrame *frame)

This function sends a YUV frame to the underlying XMA plugin (lower-edge interface) and eventually to the lookahead module in hardware. The application must read one frame of YUV data in semi-planar format at a time and update the details of the buffer in the XmaFrame argument. The application can take further action depending on the return value of this API.

If this function returns XMA_SUCCESS, then the application can proceed to fetch lookahead side data along with the output frame.

If the function returns XMA_SEND_MORE_DATA, then the application should proceed with sending next YUV frame.

If this function returns XMA_TRY_AGAIN, the input frame has not been consumed; the application must call the receive function and then resend the same input frame.

Once the application has sent all input frames to the lookahead module, it should continue sending null frames until all the frames have been flushed out of the lookahead.


int32_t xma_filter_session_recv_frame(XmaFilterSession *session, XmaFrame *frame)

This function is called after calling the function xma_filter_session_send_frame(). If an output frame is not ready to be returned, this function returns XMA_TRY_AGAIN. This function returns XMA_SUCCESS if the output frame is available.

The lookahead plugin provides the output frame and the application needs to release the frame after successfully sending it to the encoder and before calling the next xma_filter_session_send_frame().

Once the lookahead flushes all the frames, it returns XMA_EOS.
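The return-code handling described for the send and receive functions can be summarized as a loop. The sketch below illustrates only the control flow and does not use the XMA library: the SK_* codes, MockSession, mock_send_frame() and mock_recv_frame() are hypothetical stand-ins for the XMA_* return values and the xma_filter_session_send_frame() and xma_filter_session_recv_frame() functions, and the mock's buffering behavior is an assumption made so that the loop can run on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in return codes; real applications use the XMA_* values. */
enum { SK_SUCCESS, SK_SEND_MORE_DATA, SK_TRY_AGAIN, SK_EOS };

/* Hypothetical mock of a lookahead session. */
typedef struct {
    int depth;     /* frames held back before output starts */
    int queued;    /* frames currently inside the mock */
    int flushing;  /* non-zero once a NULL frame has been sent */
} MockSession;

/* Mock of the send function; a NULL frame requests a flush. */
static int mock_send_frame(MockSession *s, const int *frame)
{
    if (frame == NULL) {
        s->flushing = 1;
        return SK_SUCCESS;
    }
    s->queued++;
    return (s->queued > s->depth) ? SK_SUCCESS : SK_SEND_MORE_DATA;
}

/* Mock of the receive function. */
static int mock_recv_frame(MockSession *s)
{
    if (s->queued == 0)
        return s->flushing ? SK_EOS : SK_TRY_AGAIN;
    if (!s->flushing && s->queued <= s->depth)
        return SK_TRY_AGAIN;
    s->queued--;
    return SK_SUCCESS;
}

/* Sends num_frames frames, then NULL frames until SK_EOS.
 * Returns the number of output frames received. */
static int run_lookahead(int depth, int num_frames)
{
    MockSession s = { depth, 0, 0 };
    int sent = 0, received = 0;

    assert(depth >= 0 && num_frames >= 0);
    for (;;) {
        int payload = sent;  /* placeholder for real frame data */
        const int *in = (sent < num_frames) ? &payload : NULL;
        int rc = mock_send_frame(&s, in);

        if (in != NULL && rc != SK_TRY_AGAIN)
            sent++;          /* on SK_TRY_AGAIN the same frame is resent */
        if (rc == SK_SEND_MORE_DATA)
            continue;        /* nothing to fetch yet */

        rc = mock_recv_frame(&s);
        if (rc == SK_SUCCESS)
            received++;      /* forward to the encoder, then release */
        else if (rc == SK_EOS)
            break;           /* lookahead fully flushed */
    }
    return received;
}
```

In this model every input frame eventually comes back out, so run_lookahead(4, 10) returns 10. In a real application, the point marked "forward to the encoder, then release" is where the output frame would be sent to the encoder and released before the next send call.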


int32_t xma_filter_session_destroy(XmaFilterSession *session)

This function destroys the filter session that was previously created with the xma_filter_session_create() function.


Look-Ahead Properties

The Xilinx lookahead is configured using a combination of standard XMA filter properties, standard XMA filter input and output properties and custom lookahead parameters, all of which are specified using XmaFilterProperties and XmaFilterPortProperties data structures.

To facilitate application development, Xilinx recommends working with a simplified data structure from which the required XmaFilterProperties and XmaFilterPortProperties can be populated using a specialized function. A reusable example of this can be found in the transcoder/lib/include/xlnx_transcoder_xma_props.h and transcoder/lib/src/xlnx_transcoder_xma_props.c files of the XMA transcoder example application.

IMPORTANT: Xilinx recommends enabling custom rate-control when using the lookahead. This is done as follows:

  • When creating the lookahead session, set the custom rate_control_mode parameter to 1 in the XmaFilterProperties

  • When creating the encoder session, set the standard rc_mode property to 1 in the XmaEncoderProperties


struct XmaFilterProperties

This data structure is used to configure the Xilinx lookahead function. The declaration of XmaFilterProperties can be found in the xmafilter.h file.

struct XmaFilterPortProperties

This data structure is used to configure the input and output of the lookahead. The XmaFilterProperties data structure contains one XmaFilterPortProperties for the lookahead input and one XmaFilterPortProperties for the lookahead output. The declaration of XmaFilterPortProperties can be found in the xmafilter.h file.


Standard XMA Lookahead Filter Properties

When using the lookahead plugin, the following members of the XmaFilterProperties data structure must be set by the application:

hwfilter_type

Vendor value used to identify the filter type. Must be set to XMA_2D_FILTER_TYPE.

hwvendor_string[MAX_VENDOR_NAME]

Vendor string used to identify specific filter requested. Must be set to “Xilinx”

params

Array of custom initialization parameters. See the next section for the list of custom parameters supported by the lookahead plugin.

param_cnt

Count of custom parameters.

plugin_lib

The plugin library name to which the application wants to communicate. The value of this property is obtained as part of XRM resource allocation.

dev_index

The device index number on which the lookahead resource has been allocated. The value of this property is obtained as part of XRM resource allocation.

cu_index

The index of the lookahead compute unit (CU) that has been allocated. The value of this property is obtained as part of XRM resource allocation.

channel_id

The channel number of the lookahead that has been allocated. The value of this property is obtained as part of XRM resource allocation.

ddr_bank_index

Required property. Must be set to -1 to let the hardware determine which DDR bank should be used for this channel.

Other members of XmaFilterProperties are not applicable to the lookahead plugin and should not be used.

Standard XMA Lookahead Input Filter Properties

When configuring the lookahead input, the following members of the XmaFilterPortProperties data structure must be set by the application:

format

Input video format. Must be set to XMA_VCU_NV12_FMT_TYPE.

bits_per_pixel

Bits per pixel for primary plane of input video. Must be set to 8 bits per pixel.

width

Width in pixels of incoming video stream/data. Valid values are even integers between 128 and 1920. Portrait mode is supported.

height

Height in pixels of incoming video stream/data. Valid values are even integers between 128 and 1080.

stride

The stride value should be the width rounded up to a multiple of 256.

framerate

Framerate data structure specifying frame rate per second.

Other members of XmaFilterPortProperties are not applicable to the lookahead input and should not be used.

Standard XMA Lookahead Output Filter Properties

When configuring the lookahead output, the following members of the XmaFilterPortProperties data structure must be set by the application:

format

Input video format. Must be set to XMA_VCU_NV12_FMT_TYPE.

bits_per_pixel

Bits per pixel for primary plane of input video. Only 8 bits per pixel is supported.

width

Output width in pixels for output video frame. The value should be the input width rounded up to a multiple of 64 and then shifted right by 4.

height

Output height in pixels for output video frame. The value should be the input height rounded up to a multiple of 64 and then shifted right by 4.

framerate

Framerate data structure specifying frame rate per second.

Other members of XmaFilterPortProperties are not applicable to the lookahead output and should not be used.
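The input stride and the output dimensions are both derived from the input resolution. The sketch below works through that arithmetic, reading "aligned" as "rounded up to the next multiple". The LaPortDims structure and the function names are hypothetical and are not part of the SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a; a must be a power of two. */
static uint32_t align_up(uint32_t v, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/* Hypothetical holder for the derived lookahead port values. */
typedef struct {
    uint32_t in_stride;   /* input width aligned to 256 */
    uint32_t out_width;   /* input width aligned to 64, then >> 4 */
    uint32_t out_height;  /* input height aligned to 64, then >> 4 */
} LaPortDims;

/* Derives the stride and output size from the input resolution. */
static LaPortDims la_port_dims(uint32_t in_width, uint32_t in_height)
{
    LaPortDims d;
    d.in_stride  = align_up(in_width, 256);
    d.out_width  = align_up(in_width, 64) >> 4;
    d.out_height = align_up(in_height, 64) >> 4;
    return d;
}
```

For a 1920x1080 input, this gives an input stride of 2048 and an output of 120x68. A 1280x720 input gives a stride of 1280 and an output of 80x48.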

Custom Lookahead Parameters

In addition to the standard properties, the following XmaParameter custom parameters are supported by the lookahead plugin:

“ip”

Intra period for the video stream.

“lookahead_depth”

Lookahead depth for the module. Values range from 0 to 20.

“enable_hw_in_buf”

This parameter indicates whether the input buffer must be copied from the host or is already present on the device. Set it to 1 if the YUV frame is already in device memory.

“spatial_aq_mode”

Enable/Disable spatial aq mode.

“temporal_aq_mode”

Enable/Disable temporal aq mode.

“rate_control_mode”

Enable/Disable custom rate control mode.

“spatial_aq_gain”

The spatial AQ gain ranges from 0 to 100. The default is 50.

“num_b_frames”

Number of B frames in a sub-GOP. Values range from 0 to the maximum integer value.

“codec_type”

For H264 encoder, set codec type as 0 and for HEVC encoder, set it as 1.

“latency_logging”

Set to 1 to enable logging of latency information to syslog. Set to 0 to disable logging.

“dynamic_gop”

This parameter enables content adaptive B frame insertion.
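The custom parameters above are passed as a name/value array together with a count (the params and param_cnt members). The sketch below shows one way to fill such a table from a simplified settings structure, in the spirit of the approach recommended earlier in this section. It does not use the XMA headers: LaSettings, SketchParam and both functions are hypothetical, and SketchParam is only a stand-in for XmaParameter, which is declared in the XMA headers and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simplified settings; field names mirror the parameter names. */
typedef struct {
    int32_t ip, lookahead_depth, enable_hw_in_buf, spatial_aq_mode,
            temporal_aq_mode, rate_control_mode, spatial_aq_gain,
            num_b_frames, codec_type, latency_logging, dynamic_gop;
} LaSettings;

/* Stand-in for one custom parameter: a name and a 32-bit value. */
typedef struct {
    const char *name;
    int32_t value;
} SketchParam;

#define LA_PARAM_MAX 11

/* Fills out[] from s and returns the number of entries (the param count). */
static size_t la_fill_params(const LaSettings *s, SketchParam out[LA_PARAM_MAX])
{
    const SketchParam table[LA_PARAM_MAX] = {
        { "ip",                s->ip },
        { "lookahead_depth",   s->lookahead_depth },
        { "enable_hw_in_buf",  s->enable_hw_in_buf },
        { "spatial_aq_mode",   s->spatial_aq_mode },
        { "temporal_aq_mode",  s->temporal_aq_mode },
        { "rate_control_mode", s->rate_control_mode },
        { "spatial_aq_gain",   s->spatial_aq_gain },
        { "num_b_frames",      s->num_b_frames },
        { "codec_type",        s->codec_type },
        { "latency_logging",   s->latency_logging },
        { "dynamic_gop",       s->dynamic_gop },
    };
    memcpy(out, table, sizeof table);
    return LA_PARAM_MAX;
}

/* Looks up a parameter by name; returns NULL when it is absent. */
static const SketchParam *la_find_param(const SketchParam *p, size_t n,
                                        const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(p[i].name, name) == 0)
            return &p[i];
    return NULL;
}
```

Keeping the parameter names in a single table makes it easier to keep them in sync with the list above. For the real structure, refer to the xlnx_transcoder_xma_props.c file mentioned earlier.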


XVBM API Reference

The Xilinx Video Buffer Management (XVBM) library is used by the Xilinx Video SDK plugins to manage pools of video buffers. The XVBM API must be used to interact with the XVBM buffers associated with XmaFrame frames of the XMA_DEVICE_BUFFER_TYPE type.

The XMA_DEVICE_BUFFER_TYPE frames and their XVBM buffers can either be directly passed to other hardware accelerators without being copied back to the host (zero-copy operation in a multistage pipeline) or copied back to the host for further processing in software.

If the application needs to access the content of an XVBM buffer, it must do so using the XVBM xvbm_buffer_get_host_ptr() and xvbm_buffer_read() APIs.

If an XVBM buffer is transferred to more than one other XMA plugin session, the xvbm_buffer_refcnt_inc() API should be used to share the buffer instead of explicitly creating copies of that buffer. For an example of this, refer to the xlnx_tran_xvbm_buf_inc() function in transcoder/lib/src/xlnx_transcoder.c#L45.

If an XVBM buffer is not transferred to another plugin, then the application must release the buffer with the xvbm_buffer_pool_entry_free() API. This releases the buffer back to the plugin, allowing the plugin to reuse the buffer for a subsequent frame. Typically, if all the buffers managed by a plugin are in use (not freed), then the plugin won’t be able to accept new data.
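The sharing and release rules above follow a reference-counting pattern. The sketch below models that pattern only and is not the XVBM implementation: ModelBuffer and the model_* functions are hypothetical stand-ins. It assumes that each release drops one reference and that the buffer returns to the pool once the last reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a pooled device buffer; not an XVBM type. */
typedef struct {
    int refcnt;    /* number of outstanding users */
    bool in_pool;  /* true while the plugin may reuse the buffer */
} ModelBuffer;

/* The plugin hands a buffer to the application with one reference. */
static void model_acquire(ModelBuffer *b)
{
    assert(b->in_pool);
    b->refcnt = 1;
    b->in_pool = false;
}

/* Models xvbm_buffer_refcnt_inc(): one more consumer shares the buffer. */
static void model_refcnt_inc(ModelBuffer *b)
{
    assert(!b->in_pool);
    b->refcnt++;
}

/* Models xvbm_buffer_pool_entry_free(): drops one reference.
 * Returns true when the buffer went back to the pool. */
static bool model_entry_free(ModelBuffer *b)
{
    assert(!b->in_pool && b->refcnt > 0);
    if (--b->refcnt == 0)
        b->in_pool = true;
    return b->in_pool;
}

/* Shares an acquired buffer among `consumers` sessions without copying.
 * Returns the resulting reference count. */
static int model_share(ModelBuffer *b, int consumers)
{
    for (int i = 1; i < consumers; i++)
        model_refcnt_inc(b);
    return b->refcnt;
}
```

In this model, a buffer shared among three sessions becomes reusable only after the third release. This is consistent with the note above that a plugin whose buffers are all in use cannot accept new data.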

XVBM buffers are directly managed by the Xilinx Video SDK plugins. The user application can read and release XVBM buffers, but it should not create or destroy XVBM buffers.


Functions for reading and writing buffers

int32_t xvbm_buffer_write(XvbmBufferHandle b_handle, const void *src, size_t size, size_t offset)

Write a buffer to device memory. Returns 0 on success.


int32_t xvbm_buffer_read(XvbmBufferHandle b_handle, void *dst, size_t size, size_t offset)

Read a buffer from device memory. Returns 0 on success.


XRM API Reference

Note

Version 2.0 of the Xilinx Video SDK uses XRM 2021.2, which provides new resource management APIs. These new APIs are identified with the “V2” suffix. They provide more flexible resource allocation capabilities, and Xilinx recommends using these “V2” APIs for any new development and integration work. The original XRM APIs are still supported. Applications developed using the original APIs do not require any modification.

The Xilinx® FPGA Resource Manager (XRM) library is used to manage the hardware accelerators available in the system. XRM keeps track of total system capacity for each of the compute units such as the decoder, scaler, and encoder.

The XRM library includes a daemon, a command line tool and a C application programming interface (API). Using the library API, external applications can communicate with the XRM daemon and perform actions such as reserving, allocating and releasing resources; calculating resource load and max capacity.

More details on the XRM command line tool (xrmadm) and the XRM daemon (xrmd) can be found in the XRM Reference Guide section of the documentation.

The XRM C APIs are defined in xrm.h. The detailed description of these APIs can be found in the XRM documentation. The XRM APIs most commonly used to manage video acceleration resources are described in the sections below.

Applications integrating the Xilinx Video SDK plugins (such as the 4 example XMA Apps included in this repository) use the XRM APIs for two kinds of tasks:

  • Resource reservation (optional)

  • Resource allocation (required)

Resource Reservation with XRM

The XRM library can optionally be used to identify a device with enough available resources to run the desired job and reserve the corresponding resources. Doing so involves three steps:

  1. Calculate the channel load based on the job properties

  2. Using the xrmCheckCuPoolAvailableNumV2() function, query XRM for the number of resources available based on the channel load. XRM checks all the devices available on the server and returns how many such channels can be accommodated.

  3. Using the xrmCuPoolReserveV2() function, reserve the resources required for this channel load. XRM returns a reservation index.
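Steps 1 and 2 amount to capacity arithmetic. The sketch below models that arithmetic only and does not call XRM: the full-scale constant, the per-device free-load array and the function name are assumptions made for illustration. In a real application, the query is performed by xrmCheckCuPoolAvailableNumV2().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed full-scale value: this load means one fully used device. */
#define MODEL_FULL_LOAD 1000000

/* Counts how many channels of channel_load fit in the free capacity of
 * all devices. A channel is assumed not to span devices, so each
 * device contributes a whole number of channels. */
static int32_t model_available_channels(const int32_t *free_load,
                                        size_t num_devices,
                                        int32_t channel_load)
{
    int32_t total = 0;

    assert(channel_load > 0 && channel_load <= MODEL_FULL_LOAD);
    for (size_t i = 0; i < num_devices; i++)
        total += free_load[i] / channel_load;
    return total;
}
```

With one idle device and one device that has 40% of its capacity free, a channel load of 25% fits five times in this model: four channels on the first device and one on the second.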

For an example of how to perform resource reservation using XRM APIs, refer to the source code of the job slot reservation application. The job slot reservation tool reserves the maximum number of slots for a given job. This code can be adapted to reserve a single slot on a device with enough resources for the job of interest.

Once resources have been reserved, it is also possible to use the xrmReservationQueryV2() API to obtain the ID of the device on which the resource has been allocated and the name of the xclbin. The device ID and xclbin information can then be used to initialize the XMA session.

Resources reserved with xrmCuPoolReserveV2() must be relinquished with the xrmCuPoolRelinquishV2() function once the application no longer needs them.

Resource Allocation with XRM

In order to create an XMA plugin session (encoder/decoder/scaler/lookahead), the necessary compute unit (CU) resources must first be successfully allocated with XRM using the xrmCuAllocV2() function (or xrmCuListAllocV2() to allocate multiple CUs at once).

Resources allocated with xrmCuAllocV2() (or xrmCuListAllocV2()) must be released with the xrmCuReleaseV2() function (or xrmCuListReleaseV2()) once the application no longer needs them.

The resource allocation procedure is different depending on whether resources were previously reserved or not.

Allocation of Pre-Reserved Resources

If resources were previously reserved using the xrmCuPoolReserveV2() function, the application should perform CU allocation using the device ID and the reservation ID obtained during the resource reservation process. In this case, CU allocation will not fail because the necessary resources have already been reserved.

  1. Create an xrmCuPropertyV2 data structure

  2. Assign the reservation ID to the poolId field of the xrmCuPropertyV2 data structure

  3. If resources were reserved across multiple devices, assign the device ID of these specific resources to the deviceInfo field of the xrmCuPropertyV2 data structure

  4. Allocate the resources using the xrmCuAllocV2() function

Allocation of Non-Reserved Resources

If resources were not previously reserved using the xrmCuPoolReserveV2() function, the application should first calculate the load of the current job and then attempt CU allocation for that particular load in a user-specified device. CU allocation will fail if there are not enough resources to support the specific channel load on that device.

  1. Calculate the channel load based on the job properties

  2. Create an xrmCuPropertyV2 data structure

  3. Assign the resource load to the requestLoad field of the xrmCuPropertyV2 data structure

  4. Assign the user-specified device ID to the deviceInfo field of the xrmCuPropertyV2 data structure

  5. Allocate the resources using the xrmCuAllocV2() function
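The sketch below models why allocation of non-reserved resources can fail. It does not use xrm.h: ModelCuProperty is a hypothetical stand-in for the property structure filled in by the steps above, ModelDevice and the model_* functions are invented for illustration, and the full-scale load constant is an assumption. The model treats deviceInfo as a plain device index, which may differ from how XRM encodes this field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed full-scale value: this load means one fully used device. */
#define MODEL_FULL_LOAD 1000000

/* Stand-in for the CU property structure; only two fields are modelled. */
typedef struct {
    int32_t  requestLoad;  /* load of the job, from the load calculation */
    uint64_t deviceInfo;   /* modelled here as a plain device index */
} ModelCuProperty;

/* Stand-in for the per-device bookkeeping of a resource manager. */
typedef struct {
    int32_t used_load;
} ModelDevice;

/* Models allocation on a user-specified device. Fails when the device
 * does not exist or lacks capacity for the requested load. */
static bool model_cu_alloc(ModelDevice *devs, size_t num_devs,
                           const ModelCuProperty *prop)
{
    if (prop->deviceInfo >= num_devs || prop->requestLoad <= 0)
        return false;
    ModelDevice *d = &devs[prop->deviceInfo];
    if (d->used_load + prop->requestLoad > MODEL_FULL_LOAD)
        return false;
    d->used_load += prop->requestLoad;
    return true;
}

/* Models the matching release of a successful allocation. */
static void model_cu_release(ModelDevice *devs, const ModelCuProperty *prop)
{
    devs[prop->deviceInfo].used_load -= prop->requestLoad;
}
```

In this model, a second job requesting 60% of a device fails on a device that already runs one such job. It succeeds on another device, or on the same device once the first job has been released. An application should therefore check the result of the allocation call and either try another device or report the failure.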

For a detailed example of how to allocate non-reserved resources, refer to the xlnx_xrm_load_calc() and xlnx_xrm_cu_alloc() functions from the XMA sample applications.

The xlnx_xrm_load_calc() function calculates the resource load for the given job, and the xlnx_xrm_cu_alloc() function allocates the necessary resources in a specific device to support the given load.