C API Programming Guide¶
Overview¶
The Xilinx Video SDK provides a C-based application programming interface (API) which facilitates the integration of Xilinx transcoding capabilities in proprietary frameworks. This API is provided in the form of plugins leveraging the Xilinx Media Accelerator (XMA) library and the Xilinx Resource Manager (XRM) library.
The XMA Library
The XMA library (libxmaapi) is meant to simplify the development of applications managing and controlling video accelerators such as decoders, scalers, filters, and encoders. libxmaapi consists of two interfaces, the lower-edge API and the upper-edge API:
The lower-edge API is an interface intended for plugin developers responsible for implementing hardware control of specific Xilinx acceleration kernels. These plugins are specialized user space drivers that are aware of the low-level interface of the hardware accelerators.
The upper-edge API is a higher-level, generalized interface intended for application developers responsible for integrating control of Xilinx accelerators into software frameworks such as FFmpeg, GStreamer, or proprietary frameworks.
The Xilinx Video SDK includes plugins optimized for the Xilinx video accelerators such as the ones found on Alveo U30 cards. A software developer integrating the hardware-accelerated features of Xilinx devices in a proprietary framework only needs to be familiar with the XMA upper-edge API and the properties of each plugin.
The XMA library is included as part of the Xilinx Runtime (XRT) library. General documentation on XMA can be found in the XRT documentation. The XMA Upper Edge API Library section of the XRT documentation provides a complete reference of the XMA upper-edge API.
The XRM Library
The XRM library is used to manage the hardware accelerators available in the system. XRM keeps track of total system capacity for each of the compute units such as the decoder, scaler, and encoder. The XRM library makes it possible to reserve, allocate, and release resources, and to calculate resource load and maximum capacity.
Details about XRM can be found in general XRM documentation.
The Xilinx Video SDK Plugins
The Xilinx Video SDK provides 4 different plugins, each corresponding to a specific hardware accelerated feature of the card:
The decoder plugin
The encoder plugin
The lookahead plugin
The scaler plugin
Any combination of plugins can be used when integrating with a proprietary framework.
Sample source code and applications using the Xilinx Video SDK plugins and the XMA APIs to do video encoding, decoding, scaling and transcoding can be found in the XMA Tutorials included in this repository.
General Application Development Guide¶
Integration layers for applications using the Xilinx Video SDK are organized around the following steps:
Initialization
Resource Allocation
Session Creation
Runtime Processing
Cleanup
Initialization¶
Applications using the plugins must first create an XRM context using the xrmCreateContext() function in order to establish a connection with the XRM daemon. The application must then initialize the XMA library using the xma_initialize() function.
Examples of how to perform these two steps in order to initialize an application can be found in the common/src/xlnx_xrm_utils.c file which is shared by all XMA sample applications.
Further information about these APIs can be found in the online XRT and XRM documentation.
Resource Allocation¶
After the initialization step, the application must determine on which device to run and reserve the necessary hardware resources (CUs) on that device. This is done using the XRM APIs, as described in detail in the XRM API Reference Guide below.
Session Creation¶
Once the resources have been allocated, the application must create dedicated plugin sessions for each of the hardware accelerators that need to be used (decoder, scaler, encoder, lookahead).
To create a session, the application must first initialize all the required properties and parameters of the particular plugin. It must then call the corresponding session creation function. A complete reference for all the plugins is provided below.
Runtime Processing¶
The plugins provide functions to send data from the host and receive data from the device. The data is in the form of video frames (XmaFrame) or encoded video data (XmaDataBuffer), depending on the nature of the plugin. It is also possible to do zero-copy operations where frames are passed from one hardware accelerator to the next without being copied back to the host. The send and receive functions are specific to each plugin and the return code should be used to determine the next suitable action. A complete reference for all the plugins is provided below.
Cleanup¶
When the application finishes, it should destroy each plugin session using the corresponding destroy function. Doing so will free the resources on the Xilinx devices for other jobs and ensure that everything is released and cleaned-up properly.
The application should also use the xrmDestroyContext() function to destroy the XRM session, stop the connection to the daemon and ensure all resources are properly released.
Compiling and Linking with the Xilinx Video SDK Plugins¶
The plugins can be dynamically linked to the application. The required packages to build applications are XRT, XRM and XVBM. These packages are provided as part of the Xilinx Video SDK.
To provide the necessary declarations in your application, include the following headers in your source code:
#include <xrm.h>
#include <xmaplugin.h>
#include <xma.h>
#include <xvbm.h>
To compile and link your application with the plugins, add the following lines to your Makefile:
CFLAGS += $(shell pkg-config --cflags libxma2api libxma2plugin xvbm libxrm)
LDFLAGS += $(shell pkg-config --libs libxma2api libxma2plugin xvbm libxrm)
These should add the following switches to your gcc commands:
-I/opt/xilinx/xrt/include/xma2 -I/opt/xilinx/xrt/include -I/opt/xilinx/xvbm/include -I/opt/xilinx/xrm/include
-L/opt/xilinx/xrt/lib -L/opt/xilinx/xvbm/lib -L/opt/xilinx/xrm/lib -lxma2api -lxma2plugin -lxvbm -lstdc++ -lxrm -lboost_system -lboost_filesystem -lboost_thread -lboost_serialization -luuid -ldl -lxrt_core
Common XMA Data Structures¶
-
struct XmaParameter¶
Type-Length-Value data structure used for passing custom arguments to a plugin. The declaration of XmaParameter can be found in the xmaparam.h file.
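As an illustration of the Type-Length-Value idea, the sketch below packs a named 32-bit integer into a small descriptor. DemoParam, DemoType and demo_param_int32() are hypothetical stand-ins written for this example. They are not the declarations from xmaparam.h, which should be consulted for the actual field names and type tags.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags; the real ones are declared in xmaparam.h. */
typedef enum { DEMO_TYPE_INT32, DEMO_TYPE_STRING } DemoType;

/* Hypothetical TLV descriptor: a name plus type, length, and value. */
typedef struct {
    const char *name;   /* parameter name, e.g. "codec_type" */
    DemoType    type;   /* T: how to interpret the value */
    size_t      length; /* L: size of the value in bytes */
    void       *value;  /* V: pointer to caller-owned storage */
} DemoParam;

/* Describe a caller-owned int32_t; the storage must outlive the descriptor. */
DemoParam demo_param_int32(const char *name, int32_t *storage)
{
    DemoParam p = { name, DEMO_TYPE_INT32, sizeof *storage, storage };
    return p;
}
```

Because the descriptor only points at the value, the integer it refers to has to stay alive for as long as the descriptor is in use.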
-
struct XmaFrameProperties¶
Data structure describing the frame dimensions for XmaFrame. The declaration of XmaFrameProperties can be found in the xmabuffers.h file.
-
struct XmaFrame¶
Data structure describing a raw video frame and its buffers. XmaFrame structures can be received from the decoder or sent to the encoder. They are also used as inputs and outputs for the scaler and the lookahead. The declaration of XmaFrame can be found in the xmabuffers.h file.
The Xilinx Video SDK plugins support two types of frames:
XMA_HOST_BUFFER_TYPE frames are explicitly allocated and managed by the host application. They are always copied from the host to the device and back after an operation.
XMA_DEVICE_BUFFER_TYPE frames are automatically allocated by the plugins and are implemented using the XVBM library. In a multistage video pipeline, they allow for zero-copy operations where frames are passed from one hardware accelerator to the next without being copied back to the host. The frame data in the underlying XVBM buffers can be accessed by the host application using the XVBM APIs.
The decoder plugin only supports XMA_DEVICE_BUFFER_TYPE frames, whereas the scaler, encoder and lookahead plugins support both types of frames.
NOTE: For optimal performance, Xilinx recommends that host buffers used for raw video frames be allocated on 4K boundaries. This can be done using posix_memalign() instead of malloc() when allocating the buffers. An example of this can be found in the XMA encoder application where raw frame buffers are allocated before being populated and sent to the encoder: encoder/lib/src/xlnx_enc_xma_props.c#L824.
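A minimal sketch of such an allocation is shown below. The helper names and the size calculation are assumptions made for this example: the size formula assumes an 8-bit NV12 frame with no row padding, which may not match the stride requirements of a given plugin.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define FRAME_ALIGNMENT 4096 /* 4K boundary recommended for host frame buffers */

/* Size in bytes of an 8-bit NV12 frame: full-size Y plane plus half-size UV plane. */
size_t nv12_frame_size(size_t width, size_t height)
{
    return width * height * 3 / 2;
}

/* Allocate a 4K-aligned buffer; returns NULL on failure. Release with free(). */
void *alloc_aligned_frame(size_t size)
{
    void *buf = NULL;
    if (posix_memalign(&buf, FRAME_ALIGNMENT, size) != 0)
        return NULL;
    return buf;
}
```

Memory obtained from posix_memalign() is released with the regular free() function.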
-
struct XmaDataBuffer¶
Data structure describing a buffer containing encoded video data. XmaDataBuffer structures can be sent to the decoder or received from the encoder. The declaration of XmaDataBuffer can be found in the xmabuffers.h file.
Decoder Plugin Reference¶
Decoder Interface¶
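The external interface to the decoder plugin consists of the following XMA upper-edge functions:
xma_dec_session_create()
xma_dec_session_send_data()
xma_dec_session_get_properties()
xma_dec_session_recv_frame()
xma_dec_session_destroy()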
The declaration of these functions can be found in the xmadecoder.h file. General reference information about these functions can be found in the Decoder section of the XMA upper-edge API Library documentation. Information specific to use with the Xilinx video codec units is provided below.
-
XmaDecoderSession *xma_dec_session_create(XmaDecoderProperties *dec_props)¶
This function creates a decoder session and must be called prior to decoding data. The hardware resources required to run the session must be previously reserved using the XRM APIs and should not be released until after the session is destroyed. The number of sessions allowed depends on several factors that include: resolution, frame rate, bit depth, and the capabilities of the hardware accelerator.
-
int32_t xma_dec_session_send_data(XmaDecoderSession *session, XmaDataBuffer *data, int32_t *data_used)¶
This function sends input frame data to the hardware decoder by way of the plugin. The application needs to parse the input encoded stream and send one frame of data at a time in an XmaDataBuffer data structure.
The data_used value indicates the amount of input data consumed by the decoder.
If the function returns XMA_SUCCESS, then the decoder was able to consume the entirety of the available data and data_used will be set accordingly. In this case, the application can proceed with fetching decoded data using the xma_dec_session_recv_frame() API.
If the function returns XMA_TRY_AGAIN, then the decoder did not consume any of the input data and data_used will be reported as 0. In this case, the application can proceed with fetching previously decoded data with the xma_dec_session_recv_frame() function but must send the same input again using xma_dec_session_send_data() until the function returns XMA_SUCCESS.
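The resend-until-accepted behavior can be sketched as a small retry loop. The code below does not call the XMA library: the return codes, the callback types and the mock decoder are hypothetical stand-ins so that the control flow can be compiled on its own. The max_attempts bound is also an assumption added for this example. A real application would call xma_dec_session_send_data() and xma_dec_session_recv_frame() and compare against XMA_SUCCESS and XMA_TRY_AGAIN.

```c
#include <stdint.h>

/* Stand-in return codes with arbitrary values; not the XMA definitions. */
enum { DEMO_SUCCESS = 0, DEMO_TRY_AGAIN = 1, DEMO_ERROR = -1 };

/* Hypothetical callbacks wrapping the plugin's send and receive calls. */
typedef int32_t (*demo_send_fn)(void *ctx);
typedef void    (*demo_drain_fn)(void *ctx);

/*
 * Resend the same input until it is accepted. While the decoder reports
 * TRY_AGAIN, fetch previously decoded output so that it can make progress.
 * Returns DEMO_SUCCESS, or DEMO_ERROR on failure or after max_attempts.
 */
int32_t send_until_accepted(demo_send_fn send, demo_drain_fn drain,
                            void *ctx, int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        int32_t rc = send(ctx);
        if (rc == DEMO_SUCCESS)
            return DEMO_SUCCESS;
        if (rc != DEMO_TRY_AGAIN)
            return DEMO_ERROR;
        drain(ctx); /* fetch pending output, then retry the same input */
    }
    return DEMO_ERROR;
}

/* Mock decoder for demonstration: rejects the first `busy` send attempts. */
typedef struct { int busy; int sends; int drains; } MockDecoder;

int32_t mock_send(void *ctx)
{
    MockDecoder *d = ctx;
    d->sends++;
    if (d->busy > 0) {
        d->busy--;
        return DEMO_TRY_AGAIN;
    }
    return DEMO_SUCCESS;
}

void mock_drain(void *ctx)
{
    MockDecoder *d = ctx;
    d->drains++;
}
```

With a mock that is busy for two attempts, the helper sends three times and drains twice before reporting success.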
Once the application has sent all the input frames to the decoder, it must notify the decoder by sending a null frame buffer with is_eos set to 1 in the XmaDataBuffer structure. The application should then continue sending null frame buffers with is_eos set to 0 in order to flush out all the output YUV frames.
-
int32_t xma_dec_session_get_properties(XmaDecoderSession *dec_session, XmaFrameProperties *fprops)¶
This function returns the decoder properties such as width, height, output format, and bits per pixel.
-
int32_t xma_dec_session_recv_frame(XmaDecoderSession *session, XmaFrame *frame)¶
This function tries to fetch a decoded YUV frame from the hardware accelerator.
If the function returns XMA_SUCCESS, a valid YUV frame pointer is available in the buffer pointer of the XmaFrame argument. The decoder plugin only supports XmaFrame structures of the XMA_DEVICE_BUFFER_TYPE type, which are implemented using XVBM buffers. The function does not copy the output frame to the host buffer, but simply provides a pointer to the output frame containing an XVBM buffer. The application must use the XVBM APIs to read, forward or release the buffer as explained in the XVBM library section.
If the function returns XMA_TRY_AGAIN, then the decoder still needs some input data to produce a complete YUV output frame.
If the function returns XMA_EOS, then the decoder has flushed out all the frames.
For an example of how to read and release a YUV output frame using the XVBM xvbm_buffer_get_host_ptr(), xvbm_buffer_read() and xvbm_buffer_pool_entry_free() APIs, refer to the decoder/app/src/xlnx_decoder_app.c file of the sample XMA decoder app.
For an example of how to receive a YUV output frame and forward it to the scaler and to the encoder plugins using the XVBM xvbm_buffer_refcnt_inc() API, refer to the transcoder/lib/src/xlnx_transcoder.c file of the sample XMA transcoder application.
-
int32_t xma_dec_session_destroy(XmaDecoderSession *session)¶
This function destroys a decoder session that was previously created with the xma_dec_session_create() function.
Decoder Properties¶
The Xilinx video decoder is configured using a combination of standard XMA decoder properties and custom decoder parameters, both of which are specified using an XmaDecoderProperties data structure.
To facilitate application development, Xilinx recommends working with a simplified data structure from which the required XmaDecoderProperties can be populated using a specialized function. A reusable example of this can be found in the transcoder/lib/include/xlnx_transcoder_xma_props.h and transcoder/lib/src/xlnx_transcoder_xma_props.c files of the XMA transcoder example application.
-
struct XmaDecoderProperties¶
This data structure is used to configure the Xilinx video decoder. The declaration of XmaDecoderProperties can be found in the xmadecoder.h file.
Standard XMA Decoder Properties
When using the decoder plugin, the following members of the XmaDecoderProperties data structure must be set by the application:
- hwdecoder_type
Must be set to XMA_MULTI_DECODER_TYPE.
- hwvendor_string[MAX_VENDOR_NAME]
Vendor string used to identify the specific decoder requested. Must be set to “MPSoC”.
- params
Array of custom initialization parameters. See the next section for the list of custom parameters supported by the decoder plugin.
- param_cnt
Count of custom parameters.
- width
Width in pixels of incoming video stream/data. Valid values are even integers between 128 and 3840 for H264 and HEVC. Portrait mode is supported.
- height
Height in pixels of incoming video stream/data. Valid values are even integers between 128 and 2160 for H264 and HEVC.
- bits_per_pixel
Bits per pixel for primary plane of output video. Must be set to 8 bits per pixel.
- framerate
Framerate data structure specifying frame rate per second. Valid values can range from 1 to integer max.
- plugin_lib
The plugin library name to which the application wants to communicate. The value of this property is obtained as part of XRM resource allocation.
- dev_index
The device index number on which the decoder resource has been allocated. The value of this property is obtained as part of XRM resource allocation.
- cu_index
The index of the decoder compute unit (CU) that has been allocated. The value of this property is obtained as part of XRM resource allocation.
- channel_id
The channel number of the decoder that has been allocated. The value of this property is obtained as part of XRM resource allocation.
- ddr_bank_index
Required property. Must be set to -1 to let the hardware determine which DDR bank should be used for this channel.
Other members of XmaDecoderProperties are not applicable to the decoder plugin and should not be used.
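The width and height limits listed above can be checked before a session is created. The sketch below applies the stated ranges literally (even integers, 128 to 3840 for width and 128 to 2160 for height). The function names are made up for this example and are not part of the XMA API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limits quoted in the property list for H264 and HEVC. */
enum { DEC_MIN_DIM = 128, DEC_MAX_WIDTH = 3840, DEC_MAX_HEIGHT = 2160 };

/* True when value is an even integer within [min, max]. */
bool even_in_range(int32_t value, int32_t min, int32_t max)
{
    return value >= min && value <= max && value % 2 == 0;
}

/* True when both dimensions satisfy the documented decoder limits. */
bool decoder_dims_valid(int32_t width, int32_t height)
{
    return even_in_range(width, DEC_MIN_DIM, DEC_MAX_WIDTH) &&
           even_in_range(height, DEC_MIN_DIM, DEC_MAX_HEIGHT);
}
```

Rejecting unsupported dimensions early gives a clearer error than a failed session creation.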
Custom Decoder Parameters
In addition to the standard properties, the following XmaParameter custom parameters are supported by the decoder plugin:
- “bitdepth”
Bits per pixel for primary plane of output video. Valid value is 8. Should be set to the same value as the bits_per_pixel property. 10-bit support will be added in the future.
- “codec_type”
Codec type. For H264, set “codec_type” to 0. For HEVC, set “codec_type” to 1.
- “low_latency”
Setting this flag to 1 reduces decoding latency when splitbuff_mode is also enabled. IMPORTANT: This option should not be used with streams containing B frames. Valid values are 0 (disabled, default) and 1 (enabled).
- “splitbuff_mode”
The split buffer mode hands off buffers to the next pipeline stage earlier. Setting both splitbuff_mode and low_latency to 1 reduces decoding latency. IMPORTANT: Enable this mode only if you can always send a complete Access Unit in one shot to the decoder. Valid values are 0 (disabled, default) and 1 (enabled).
- “entropy_buffers_count”
Number of internal buffers to be used. Valid values are 2 to 10 and default is 2 (recommended).
- “zero_copy”
When enabled, the decoder plugin returns a pointer to the output frame buffer instead of copying the data back to host memory. This is useful in transcoding use cases where the decoder output is consumed by an encoder or scaler running on the same hardware. Currently the decoder supports only zero copy, therefore this parameter must always be set to 1.
- “profile”
Profile of the input stream. Supported profiles are Baseline, Main, and High for H264, and Main for HEVC.
- “level”
Level of the input stream. Supported levels range from 1.0 to 5.1.
- “chroma_mode”
Chroma mode with which the input has been encoded. The only supported mode is 420.
- “scan_type”
Scan type denotes field order. The decoder currently supports only progressive scan, so this parameter should be set to 1.
- “latency_logging”
Set to 1 to enable logging of latency information to syslog.
Scaler Plugin Reference¶
Scaler Interface¶
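The external interface to the scaler plugin consists of the following XMA application-level functions:
xma_scaler_session_create()
xma_scaler_session_send_frame()
xma_scaler_session_recv_frame_list()
xma_scaler_session_destroy()
xma_scaler_default_filter_coeff_set()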
The declaration of these functions can be found in the xmascaler.h file. General reference information about these functions can be found in the Scaler section of the XMA upper-edge API library documentation. Information specific to use with the Xilinx video codec units is provided below.
-
XmaScalerSession *xma_scaler_session_create(XmaScalerProperties *props)¶
This function creates a scaler session and must be called prior to sending input frames. The hardware resources required to run the session must be previously reserved using the XRM APIs and should not be released until after the session is destroyed. The number of sessions allowed depends on several factors that include: resolution, frame rate, bit depth, and the capabilities of the hardware accelerator.
-
int32_t xma_scaler_session_send_frame(XmaScalerSession *session, XmaFrame *frame)¶
This function sends a YUV frame to the underlying XMA plugin and eventually to the hardware, which scales the input frame to one or multiple resolutions. The application must read one frame of YUV data in semi-planar format at a time and update the buffer details in the XmaFrame argument.
The application can take further action depending upon the return value from this API.
If the function returns XMA_SUCCESS, then the application can proceed to fetch scaled output frames.
If the function returns XMA_SEND_MORE_DATA, then the application should proceed with sending the next YUV frame.
If the function returns XMA_FLUSH_AGAIN, then the application should keep flushing the scaler.
If the function returns XMA_EOS, then the scaler has been flushed and the pipeline can be exited.
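The mapping from return code to next step can be written as a small dispatch function. The enumerations below are hypothetical stand-ins defined only for this sketch. A real application would switch on the XMA_* values returned by xma_scaler_session_send_frame().

```c
/* Stand-in return codes; not the XMA definitions. */
typedef enum {
    SCAL_RC_SUCCESS,
    SCAL_RC_SEND_MORE_DATA,
    SCAL_RC_FLUSH_AGAIN,
    SCAL_RC_EOS,
    SCAL_RC_ERROR
} ScalerRc;

/* What the application does after a send call returns. */
typedef enum {
    ACTION_RECV_OUTPUT,   /* fetch scaled output frames */
    ACTION_SEND_NEXT,     /* send the next YUV frame */
    ACTION_KEEP_FLUSHING, /* keep flushing the scaler */
    ACTION_EXIT,          /* scaler flushed, leave the pipeline */
    ACTION_FAIL           /* unexpected code, treat as an error */
} NextAction;

NextAction scaler_next_action(ScalerRc rc)
{
    switch (rc) {
    case SCAL_RC_SUCCESS:        return ACTION_RECV_OUTPUT;
    case SCAL_RC_SEND_MORE_DATA: return ACTION_SEND_NEXT;
    case SCAL_RC_FLUSH_AGAIN:    return ACTION_KEEP_FLUSHING;
    case SCAL_RC_EOS:            return ACTION_EXIT;
    default:                     return ACTION_FAIL;
    }
}
```

Keeping the decision in one function makes it harder to forget a return code in the processing loop.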
Once the application has sent all the input frames to the scaler, it must notify the scaler by sending a null frame buffer in the XmaFrame structure and flush the scaler. To flush the scaler, send a null frame buffer and call the xma_scaler_session_recv_frame_list() function for as long as xma_scaler_session_send_frame() returns XMA_SUCCESS or XMA_FLUSH_AGAIN.
-
int32_t xma_scaler_session_recv_frame_list(XmaScalerSession *session, XmaFrame **frame_list)¶
This function is called after calling the xma_scaler_session_send_frame() function. It returns a list of output frames with every call until it reaches the end of scaling. The only possible return codes are XMA_SUCCESS and XMA_ERROR.
The scaler plugin supports both XMA_HOST_BUFFER_TYPE and XMA_DEVICE_BUFFER_TYPE output buffers. The application indicates the buffer type through the XmaFrameProperties of the XmaFrame specified in the frame list.
When using XMA_HOST_BUFFER_TYPE buffers, the application is responsible for allocating the host memory for each frame. An example of how to do this can be found in the scaler/lib/src/xlnx_scal_utils.c#L55 file of the sample XMA scaler app.
When using XMA_DEVICE_BUFFER_TYPE buffers, the scaler plugin takes care of allocating XVBM buffers. The application can then access the buffer, release it to the plugin or transfer it to another plugin using the XVBM APIs, as explained in the XVBM library section. An example of a scaler session using XMA_DEVICE_BUFFER_TYPE buffers can be found in the transcoder/lib/src/xlnx_scaler.c#L203 file of the sample XMA transcoder app.
-
int32_t xma_scaler_session_destroy(XmaScalerSession *session)¶
This function destroys a scaler session that was previously created with the xma_scaler_session_create() function.
Scaler Properties¶
The Xilinx scaler is configured using a combination of standard XMA scaler properties, standard XMA scaler input and output properties and custom scaler parameters, all of which are specified using XmaScalerFilterProperties and XmaScalerInOutProperties data structures.
To facilitate application development, Xilinx recommends working with a simplified data structure from which the required XmaScalerFilterProperties and XmaScalerInOutProperties can be populated using a specialized function. A reusable example of this can be found in the transcoder/lib/include/xlnx_transcoder_xma_props.h and transcoder/lib/src/xlnx_transcoder_xma_props.c files of the XMA transcoder example application.
-
struct XmaScalerFilterProperties¶
This data structure is used to configure the Xilinx scaler. The declaration of XmaScalerFilterProperties can be found in the xmascaler.h file.
-
struct XmaScalerInOutProperties¶
This data structure is used to configure the input and outputs of the video scaler. The XmaScalerFilterProperties data structure contains one XmaScalerInOutProperties for the scaler input and an array of 8 XmaScalerInOutProperties for the scaler outputs. The declaration of XmaScalerInOutProperties can be found in the xmascaler.h file.
Standard XMA Scaler Properties
When using the scaler plugin, the following members of the XmaScalerFilterProperties data structure must be set by the application:
- hwscaler_type
Vendor value used to identify the scaler type. Must be set to XMA_POLYPHASE_SCALER_TYPE.
- hwvendor_string[MAX_VENDOR_NAME]
Vendor string used to identify the specific scaler requested. Must be set to “Xilinx”.
- num_outputs
Number of scaler outputs.
- params
Array of custom initialization parameters. See the next section for the list of custom parameters supported by the scaler plugin.
- param_cnt
Count of custom parameters.
- plugin_lib
The plugin library name to which the application wants to communicate. The value of this property is obtained as part of XRM resource allocation.
- dev_index
The device index number on which the scaler resource has been allocated. The value of this property is obtained as part of XRM resource allocation.
- cu_index
The index of the scaler compute unit (CU) that has been allocated. The value of this property is obtained as part of XRM resource allocation.
- channel_id
The channel number of the scaler that has been allocated. The value of this property is obtained as part of XRM resource allocation.
- ddr_bank_index
Must be set to -1 to let the hardware determine which DDR bank should be used for this channel.
Other members of XmaScalerFilterProperties are not applicable to the scaler plugin and should not be used.
XMA Scaler Input and Output Properties
When configuring the scaler input and outputs, the following members of the XmaScalerInOutProperties data structure must be set by the application:
- format
Input video format. Must be set to XMA_VCU_NV12_FMT_TYPE.
- width
Width in pixels of incoming video stream/data. Valid values are integers between 128 and 3840, in multiples of 4. Portrait mode is supported.
- height
Height in pixels of incoming video stream/data. Valid values are integers between 128 and 2160, in multiples of 4.
- stride
Must be set to the input width aligned to 256, that is, rounded up to the next multiple of 256.
- framerate
Framerate data structure specifying frame rate per second of the input stream. To specify a lower output frame rate, refer to the Mix-Rate Support section below. This value is also used by the plugin to calculate the scaler load which determines how many hardware resources to allocate. Leaving the framerate undefined could lead to undefined behavior.
Other members of XmaScalerInOutProperties are not applicable to the scaler plugin and should not be used.
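The stride rule can be expressed with a standard round-up helper. The sketch below assumes that "aligned by 256" means rounding the width up to the next multiple of 256. The helper names are made up for this example.

```c
#include <stdint.h>

#define SCALER_STRIDE_ALIGN 256

/* Round value up to the next multiple of align (align must be a power of two). */
int32_t align_up(int32_t value, int32_t align)
{
    return (value + align - 1) & ~(align - 1);
}

/* Stride for a given input width, following the 256 alignment rule. */
int32_t scaler_stride(int32_t width)
{
    return align_up(width, SCALER_STRIDE_ALIGN);
}
```

For example, a 1920-pixel-wide input gets a stride of 2048, while a 1280-pixel-wide input is already aligned and keeps a stride of 1280.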
Custom Scaler Parameters
In addition to the standard properties, the following XmaParameter custom parameters are supported by the scaler plugin:
- “enable_pipeline”
Enable/Disable pipeline in scaler. Enabling pipeline increases the scaler speed.
- “logLevel”
Enables XMA logging in scaler module. Supported values are 0 to 3.
- “MixRate”
This parameter is used to configure mix-rate sessions where some scaler outputs are configured at the input frame rate and some other outputs will be configured at half the rate. For single-rate scaling, this parameter must be set to null. For mix-rate scaling, the application will need to create two different scaler sessions. The MixRate parameter of the first session must be set to null, and the MixRate parameter of the second session must be set to the address of the first session. See section below for more details on how to set-up mix-rate support in the scaler.
- “latency_logging”
Set to 1 to enable logging of latency information to syslog. Set to 0 to disable logging.
Mix-Rate Support with the Scaler Plugin¶
The application can configure the scaler to work at mixed rate, where some output channels will be produced at the full input frame rate and some output channels will be produced at half the input frame rate.
Mix-rate is achieved by creating two different scaler sessions: one for the full rate outputs and one for the full and half rate (all rate) outputs.
Enabling mixed rate outputs requires that the following conditions be met:
The first output channel must be full rate.
The full rate channels should be specified first, followed by the half rate channels. In other words, no full rate channel should be specified after a half rate channel during session creation. This simplifies output buffer handling.
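These two ordering conditions can be verified on the application side before the sessions are created. The function below is a sketch written for this guide, not part of the scaler plugin. It takes an array of flags, one per output channel, where true marks a half rate channel.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * half_rate[i] is true when output channel i runs at half the input rate.
 * Valid layouts start with a full rate channel and never place a full rate
 * channel after a half rate one.
 */
bool mixrate_order_valid(const bool *half_rate, size_t num_outputs)
{
    if (num_outputs == 0 || half_rate[0])
        return false;

    bool seen_half = false;
    for (size_t i = 0; i < num_outputs; i++) {
        if (half_rate[i])
            seen_half = true;
        else if (seen_half)
            return false; /* full rate channel after a half rate one */
    }
    return true;
}
```

A layout such as full, full, half, half passes the check, whereas full, half, full is rejected.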
Steps to implement full rate and half rate in application:
Create two scaler sessions, one for full rate channels and the other for full rate and half rate (all rate) channels.
Set the full rate session fps to half, since the full rate outputs will be received from both the sessions.
When creating the second session, use the address of the first session as value of the “MixRate” custom parameter. Based on this, the scaler plugin allocates more output buffers.
Call scaler send and receive with the full rate and all rate sessions alternately.
For an example of how to implement mix-rate scaling, refer to the transcoder/lib/src/xlnx_scaler.c#L244 file in the sample XMA transcoder application.
Encoder Plugin Reference¶
Encoder Interface¶
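The external interface to the encoder plugin consists of the following XMA application-level functions:
xma_enc_session_create()
xma_enc_session_send_frame()
xma_enc_session_recv_data()
xma_enc_session_destroy()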
The declaration of these functions can be found in the xmaencoder.h file. The API reference for these functions can be found in the Encoder section of the XMA upper-edge API Library documentation. Information specific to use with the Xilinx video codec units is provided below.
-
XmaEncoderSession *xma_enc_session_create(XmaEncoderProperties *enc_props)¶
This function creates an encoder session and must be called prior to encoding input YUV. The hardware resources required to run the session must be previously reserved using the XRM APIs and should not be released until after the session is destroyed. The number of sessions allowed depends on several factors that include: resolution, frame rate, bit depth, and the capabilities of the hardware accelerator.
-
int32_t xma_enc_session_send_frame(XmaEncoderSession *session, XmaFrame *frame)¶
This function sends a YUV frame to the hardware encoder by way of the plugin.
Each time the application calls this function, it must provide a valid pointer to an XmaFrame structure containing a YUV frame in semi-planar format (XmaBufferRef) and information about this frame (XmaFrameProperties).
If the function returns XMA_SUCCESS, then the application can proceed to fetch the encoded data using the xma_enc_session_recv_data() API.
If the function returns XMA_SEND_MORE_DATA, then the application must send the next YUV frame before calling xma_enc_session_recv_data().
Once the application has sent all the input frames to the encoder, it should notify the hardware by sending a null frame buffer with is_last_frame set to 1 in the XmaFrame structure. If the API returns XMA_FLUSH_AGAIN after a null frame is sent, then the application can call xma_enc_session_recv_data() but must send a null frame again. Once the null frame is sent successfully, the application does not need to send frames anymore and can simply call xma_enc_session_recv_data() to flush out all the remaining output frames.
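The two phases of the flush sequence can be sketched as follows. The code does not call the XMA library: the return codes, the callback type and the mock encoder are hypothetical stand-ins defined for this example, and a successful receive is simplified to mean that one output packet was obtained. A real application would call xma_enc_session_send_frame() with a null frame and xma_enc_session_recv_data(), and compare against the XMA_* return codes.

```c
#include <stdint.h>

/* Stand-in return codes with arbitrary values; not the XMA definitions. */
enum { ENC_SUCCESS, ENC_FLUSH_AGAIN, ENC_TRY_AGAIN, ENC_EOS, ENC_ERROR };

/* Hypothetical callback wrapping a plugin call on an encoder session. */
typedef int32_t (*enc_fn)(void *ctx);

/* Returns the number of packets received while flushing, or -1 on error. */
int flush_encoder(enc_fn send_null, enc_fn recv, void *ctx)
{
    int packets = 0;
    int32_t rc;

    /* Phase 1: resend the null frame until the encoder accepts it. */
    while ((rc = send_null(ctx)) == ENC_FLUSH_AGAIN) {
        if (recv(ctx) == ENC_SUCCESS)
            packets++;
    }
    if (rc != ENC_SUCCESS)
        return -1;

    /* Phase 2: only receive, until the encoder reports end of stream. */
    while ((rc = recv(ctx)) != ENC_EOS) {
        if (rc == ENC_SUCCESS)
            packets++;
        else if (rc != ENC_TRY_AGAIN)
            return -1;
    }
    return packets;
}

/* Mock encoder: asks for FLUSH_AGAIN a few times and holds pending packets. */
typedef struct { int flush_again_left; int pending; } MockEncoder;

int32_t mock_send_null(void *ctx)
{
    MockEncoder *e = ctx;
    if (e->flush_again_left > 0) {
        e->flush_again_left--;
        return ENC_FLUSH_AGAIN;
    }
    return ENC_SUCCESS;
}

int32_t mock_recv(void *ctx)
{
    MockEncoder *e = ctx;
    if (e->pending == 0)
        return ENC_EOS;
    e->pending--;
    return ENC_SUCCESS;
}
```

A production loop would also bound the number of retries so that a stalled encoder cannot block the application indefinitely.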
-
int32_t xma_enc_session_recv_data(XmaEncoderSession *session, XmaDataBuffer *data, int32_t *data_size)¶
This function is called after calling the xma_enc_session_send_frame() function. The application is the owner of the XmaDataBuffer. It is responsible for allocating it and for releasing it when done.
If the function returns XMA_SUCCESS and if data_size is greater than 0, then a valid output frame is available. The returned data (XmaDataBuffer.data) is valid until the next call to xma_enc_session_send_frame(), so the application must use or copy it before calling xma_enc_session_send_frame() again. The XMA encoder plugin is responsible for setting the fields of the XmaDataBuffer struct. That is, XmaDataBuffer.data is set by the XMA plugin, which does not transfer the ownership of this buffer to the application. The application must not attempt to free XmaDataBuffer.data. The encoder plugin will recycle the data buffers in the next call to the xma_enc_session_send_frame() function.
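Since the returned data is recycled by the plugin, an application that needs to keep an encoded frame beyond the next send call has to copy it. The helper below is a sketch of such a copy. Its name and signature are assumptions made for this example, and it takes a plain byte pointer rather than an XmaDataBuffer so that it does not depend on the SDK headers.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy data_size bytes out of a plugin-owned buffer into memory owned by the
 * application. Returns NULL if there is nothing to copy or allocation fails.
 * The caller frees the returned copy; the source buffer is never freed here.
 */
uint8_t *copy_encoded_payload(const uint8_t *plugin_data, int32_t data_size)
{
    if (plugin_data == NULL || data_size <= 0)
        return NULL;

    uint8_t *copy = malloc((size_t)data_size);
    if (copy != NULL)
        memcpy(copy, plugin_data, (size_t)data_size);
    return copy;
}
```

The copy can then be written to a file or queued for another thread while the encoder reuses its own buffer.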
If the function returns XMA_TRY_AGAIN, a data buffer is not ready to be returned and the length of the data buffer is set to 0.
If the function returns XMA_EOS, the encoder has flushed all the output frames.
NOTE: In version 2.0 of the Xilinx Video SDK, this function has been updated and made thread-safe. In earlier versions, the XmaDataBuffer was allocated by the plugin and the xma_enc_session_send_frame() and xma_enc_session_recv_data() functions had to be called in a serial manner by the application layer. Starting with version 2.0 of the Xilinx Video SDK, the application is responsible for allocating the XmaDataBuffer and the xma_enc_session_send_frame() and xma_enc_session_recv_data() functions can be called from different threads.
-
int32_t xma_enc_session_destroy(XmaEncoderSession *session)¶
This function destroys an encoder session that was previously created with the xma_enc_session_create() function.
Encoder Properties¶
The Xilinx video encoder is configured using a combination of standard XMA encoder properties and custom encoder parameters, both of which are specified using an XmaEncoderProperties data structure.
To facilitate application development, Xilinx recommends working with a simplified data structure from which the required XmaEncoderProperties can be populated using a specialized function. A reusable example of this can be found in the transcoder/lib/include/xlnx_transcoder_xma_props.h and transcoder/lib/src/xlnx_transcoder_xma_props.c files of the XMA transcoder example application.
-
struct XmaEncoderProperties¶
This data structure is used to configure the Xilinx video encoder. The declaration of XmaEncoderProperties can be found in the xmaencoder.h file.
Standard XMA Encoder Properties
When using the encoder plugin, the following members of the XmaEncoderProperties data structure must be set by the application:
- hwencoder_type
Vendor value used to identify the encoder type. Must be set to XMA_MULTI_ENCODER_TYPE.
- hwvendor_string[MAX_VENDOR_NAME]
Vendor string used to identify hardware type. Must be set to “MPSoC”.
- format
Input video format. Must be set to XMA_VCU_NV12_FMT_TYPE.
- bits_per_pixel
Bits per pixel for primary plane of input video. Must be set to 8 bits per pixel.
- width
Width in pixels of incoming video stream/data. Valid values are even integers between 128 and 3840. Portrait mode is supported.
- height
Height in pixels of incoming video stream/data. Valid values are even integers between 128 and 2160.
- framerate
Framerate data structure specifying frame rate per second
- lookahead_depth
Depth of the lookahead module, in frames, before it starts providing lookahead data. Supported values are 0 to 20.
- rc_mode
Rate control mode for custom rate control. Supported values are 0 (custom rate control disabled) and 1 (enabled).
- params
Array of custom initialization parameters. See the next section for the list of custom parameters supported by the encoder plugin.
- param_cnt
Count of custom parameters.
- plugin_lib
The plugin library name to which the application wants to communicate. The value of this property is obtained as part of XRM resource allocation.
- dev_index
The device index number on which the encoder resource has been allocated. The value of this property is obtained as part of XRM resource allocation.
- cu_index
The encoder compute unit (CU) index that has been allocated. The value of this property is obtained as part of XRM resource allocation.
- channel_id
The channel number of the encoder that has been allocated. The value of this property is obtained as part of XRM resource allocation.
- ddr_bank_index
Must be set to -1 to let the hardware determine which DDR bank should be used for this channel.
Other members of XmaEncoderProperties
are not applicable to the encoder plugin and should not be used.
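The width and height constraints listed above can be checked before the values are copied into XmaEncoderProperties. The following is a minimal sketch, not part of the SDK: the helper names and constants are hypothetical and only encode the ranges stated in this section (even values, width 128 to 3840, height 128 to 2160).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Limits taken from the property descriptions above. */
enum {
    DEMO_ENC_MIN_DIM    = 128,
    DEMO_ENC_MAX_WIDTH  = 3840,
    DEMO_ENC_MAX_HEIGHT = 2160
};

/* Hypothetical helper: true if v is an even integer in [lo, hi]. */
static bool demo_enc_dim_in_range(int32_t v, int32_t lo, int32_t hi)
{
    assert(lo <= hi);
    return v >= lo && v <= hi && (v % 2) == 0;
}

/* Hypothetical helper: validates the values intended for the width and
 * height members of XmaEncoderProperties. */
static bool demo_enc_resolution_is_valid(int32_t width, int32_t height)
{
    return demo_enc_dim_in_range(width, DEMO_ENC_MIN_DIM, DEMO_ENC_MAX_WIDTH) &&
           demo_enc_dim_in_range(height, DEMO_ENC_MIN_DIM, DEMO_ENC_MAX_HEIGHT);
}
```

Rejecting an out-of-range resolution in the application gives a clearer error than a failed session creation.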
Custom Encoder Parameters
In addition to the standard properties, the following XmaParameter
custom parameters are supported by the encoder plugin:
- “enc_options”
For the encoder, most of the parameters are specified using a stringified INI file which is then passed to the “enc_options”
XmaParameter
. Refer to the xlnx_enc_get_xma_props()
function in the transcoder/lib/src/xlnx_transcoder_xma_props.c file for the parameters which are sent as a string.
- “latency_logging”
When enabled, it logs latency information to syslog.
- “enable_hw_in_buf”
This parameter indicates whether the input buffer needs to be copied from the host or is already present on the device. If the YUV frame is already in device memory, set it to 1.
- “disable_pipeline”
This parameter is required in order to enable Ultra Low Latency (ULL) mode. Note that for AVC encoding avc_lowlat must be added to enc_options, above.
- “stride_align”
This parameter is considered experimental. It specifies the stride alignment of encoder buffers, measured in bytes. By default, buffers are 32-byte aligned. For optimal DMA performance, the buffer stride alignment should match the input line size alignment. The type of this custom parameter is XMA_UINT32 and the value must be a multiple of 32. The lookahead expects 32-byte aligned buffers, therefore this parameter should not be used if the lookahead is active.
Look-Ahead Plugin Reference¶
Look-Ahead Interface¶
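The Look-Ahead plugin is based on the Filter XMA plugin type. The external interface to the lookahead plugin consists of the following XMA application-level functions:
- xma_filter_session_create()
- xma_filter_session_send_frame()
- xma_filter_session_recv_frame()
- xma_filter_session_destroy()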
The declaration of these functions can be found in the xmafilter.h file. General reference information about these functions can be found in the Filter section of the XMA upper-edge API library documentation. Information specific to use with the Xilinx video codec units is provided below.
-
XmaFilterSession *xma_filter_session_create(XmaFilterProperties *props)¶
This function creates a filter session and must be called before sending YUV frames to the lookahead filter. The hardware resources required to run the session must be previously reserved using the XRM APIs and should not be released until after the session is destroyed. The number of sessions allowed depends on several factors, including resolution, frame rate, bit depth, and the capabilities of the hardware accelerator.
-
int32_t xma_filter_session_send_frame(XmaFilterSession *session, XmaFrame *frame)¶
This function sends a YUV frame to the underlying XMA plugin (lower-edge interface) and eventually to the lookahead module in hardware. The application must read one frame of YUV data in semi-planar format at a time and update the details of the buffer in the XmaFrame
argument.
The application can take further action depending upon the return value from this API.
If this function returns XMA_SUCCESS
, then the application can proceed to fetch lookahead side data along with the output frame.
If the function returns XMA_SEND_MORE_DATA
, then the application should proceed with sending the next YUV frame.
If this function returns XMA_TRY_AGAIN
, the input frame has not been consumed and the application must resend the same input frame after calling the receive frame function.
Once the application has sent all input frames to the lookahead module, it should continue sending null frames until all the frames have been flushed out of the lookahead.
-
int32_t xma_filter_session_recv_frame(XmaFilterSession *session, XmaFrame *frame)¶
This function is called after calling the function xma_filter_session_send_frame()
. If an output frame is not ready to be returned, this function returns XMA_TRY_AGAIN
. This function returns XMA_SUCCESS
if the output frame is available.
The lookahead plugin provides the output frame and the application needs to release the frame after successfully sending it to the encoder and before calling the next xma_filter_session_send_frame()
.
Once the lookahead flushes all the frames, it returns XMA_EOS
.
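The return-code handling described above can be sketched as a send/receive loop. The example below is a self-contained toy model, not SDK code: the DEMO_ status codes stand in for XMA_SUCCESS, XMA_SEND_MORE_DATA, XMA_TRY_AGAIN and XMA_EOS, and demo_send_frame() and demo_recv_frame() are hypothetical stand-ins for the XMA filter calls. The mock's buffering behavior is an assumption made only so that the loop can run.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the XMA return codes named above (values are arbitrary). */
typedef enum {
    DEMO_SUCCESS,
    DEMO_SEND_MORE_DATA,
    DEMO_TRY_AGAIN,
    DEMO_EOS
} DemoStatus;

/* Toy model of a lookahead session; it does not reproduce plugin behavior. */
typedef struct {
    int depth;    /* frames held back before output starts */
    int queued;   /* frames currently inside the lookahead */
    int ready;    /* 1 while an output frame waits to be received */
    int flushing; /* 1 once a null frame has been sent */
} DemoLookahead;

/* Stand-in for the send call; frame == NULL requests a flush. */
static DemoStatus demo_send_frame(DemoLookahead *la, const int *frame)
{
    assert(la != NULL);
    if (la->ready)
        return DEMO_TRY_AGAIN;          /* input not consumed */
    if (frame == NULL) {
        la->flushing = 1;
        la->ready = (la->queued > 0);
        return DEMO_SUCCESS;
    }
    la->queued++;
    if (la->queued <= la->depth)
        return DEMO_SEND_MORE_DATA;     /* still filling the lookahead */
    la->ready = 1;
    return DEMO_SUCCESS;
}

/* Stand-in for the receive call. */
static DemoStatus demo_recv_frame(DemoLookahead *la)
{
    assert(la != NULL);
    if (la->ready) {
        la->ready = 0;
        la->queued--;
        return DEMO_SUCCESS;
    }
    if (la->flushing && la->queued == 0)
        return DEMO_EOS;
    return DEMO_TRY_AGAIN;
}

/* Sends num_frames frames, then flushes with null frames.
 * Returns the number of output frames received. */
static int demo_run_lookahead(DemoLookahead *la, int num_frames)
{
    int sent = 0;
    int received = 0;

    for (;;) {
        int frame_id = sent;
        const int *frame = (sent < num_frames) ? &frame_id : NULL;
        DemoStatus rc = demo_send_frame(la, frame);

        if (rc == DEMO_SEND_MORE_DATA) {
            sent++;                     /* consumed, no output yet */
            continue;
        }
        if (rc == DEMO_SUCCESS && frame != NULL)
            sent++;                     /* consumed, output is available */
        /* On DEMO_TRY_AGAIN, sent is unchanged, so the same frame is
         * resent after the receive call below. */

        rc = demo_recv_frame(la);
        if (rc == DEMO_EOS)
            return received;
        if (rc == DEMO_SUCCESS)
            received++;  /* pass to the encoder, then release the frame */
    }
}
```

In a real application, the point marked "pass to the encoder" is where the output frame and its side data would be sent to the encoder session and then released before the next send call.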
-
int32_t xma_filter_session_destroy(XmaFilterSession *session)¶
This function destroys the filter session that was previously created with the xma_filter_session_create()
function.
Look-Ahead Properties¶
The Xilinx lookahead is configured using a combination of standard XMA filter properties, standard XMA filter input and output properties and custom lookahead parameters, all of which are specified using XmaFilterProperties
and XmaFilterPortProperties
data structures.
To facilitate application development, Xilinx recommends working with a simplified data structure from which the required XmaFilterProperties
and XmaFilterPortProperties
can be populated using a specialized function. A reusable example of this can be found in the transcoder/lib/include/xlnx_transcoder_xma_props.h and transcoder/lib/src/xlnx_transcoder_xma_props.c files of the XMA transcoder example application.
IMPORTANT: Xilinx recommends enabling custom rate-control when using the lookahead. This is done as follows:
When creating the lookahead session, set the custom
rate_control_mode
parameter to 1 in the XmaFilterProperties
When creating the encoder session, set the standard
rc_mode
property to 1 in the XmaEncoderProperties
-
struct XmaFilterProperties¶
This data structure is used to configure the Xilinx lookahead function. The declaration of XmaFilterProperties
can be found in the xmafilter.h file.
-
struct XmaFilterPortProperties¶
This data structure is used to configure the input and output of the lookahead. The XmaFilterProperties
data structure contains one XmaFilterPortProperties
for the lookahead input and one XmaFilterPortProperties
for the lookahead output. The declaration of XmaFilterPortProperties
can be found in the xmafilter.h file.
Standard XMA Lookahead Filter Properties
When using the lookahead plugin, the following members of the XmaFilterProperties
data structure must be set by the application:
- hwfilter_type
Vendor value used to identify the filter type. Must be set to
XMA_2D_FILTER_TYPE
.
- hwvendor_string[MAX_VENDOR_NAME]
Vendor string used to identify specific filter requested. Must be set to “Xilinx”
- params
Array of custom initialization parameters. See the next section for the list of custom parameters supported by the lookahead plugin.
- param_cnt
Count of custom parameters.
- plugin_lib
The plugin library name to which the application wants to communicate. The value of this property is obtained as part of XRM resource allocation.
- dev_index
The device index number on which the lookahead resource has been allocated. The value of this property is obtained as part of XRM resource allocation.
- cu_index
The lookahead compute unit (CU) index that has been allocated. The value of this property is obtained as part of XRM resource allocation.
- channel_id
The channel number of the lookahead that has been allocated. The value of this property is obtained as part of XRM resource allocation.
- ddr_bank_index
Required property. Must be set to -1 to let the hardware determine which DDR bank should be used for this channel.
Other members of XmaFilterProperties
are not applicable to the lookahead plugin and should not be used.
Standard XMA Lookahead Input Filter Properties
When configuring the lookahead input, the following members of the XmaFilterPortProperties
data structure must be set by the application:
- format
Input video format. Must be set to
XMA_VCU_NV12_FMT_TYPE
.
- bits_per_pixel
Bits per pixel for primary plane of input video. Must be set to 8 bits per pixel.
- width
Width in pixels of incoming video stream/data. Valid values are even integers between 128 and 1920. Portrait mode is supported.
- height
Height in pixels of incoming video stream/data. Valid values are even integers between 128 and 1080.
- stride
The stride value should be the width aligned to 256.
- framerate
Framerate data structure specifying frame rate per second.
Other members of XmaFilterPortProperties
are not applicable to the lookahead input and should not be used.
Standard XMA Lookahead Output Filter Properties
When configuring the lookahead output, the following members of the XmaFilterPortProperties
data structure must be set by the application:
- format
Input video format. Must be set to
XMA_VCU_NV12_FMT_TYPE
.
- bits_per_pixel
Bits per pixel for primary plane of input video. Only 8 bits per pixel is supported.
- width
Output width in pixels for output video frame. The value should be the input width aligned to 64 and then shifted right by 4.
- height
Output height in pixels for output video frame. The value should be the input height aligned to 64 and then shifted right by 4.
- framerate
Framerate data structure specifying frame rate per second.
Other members of XmaFilterPortProperties
are not applicable to the lookahead output and should not be used.
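The stride rule for the lookahead input and the size rule for the lookahead output are both simple integer arithmetic. The sketch below assumes that "aligned" means rounded up to the next multiple of the alignment; the helper names are hypothetical and are not part of the SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds value up to the next multiple of alignment.
 * alignment must be a non-zero power of two. */
static uint32_t demo_align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Input port: stride is the width aligned to 256 (assumed to round up). */
static uint32_t demo_la_input_stride(uint32_t width)
{
    return demo_align_up(width, 256);
}

/* Output port: width or height is the input value aligned to 64,
 * then shifted right by 4. */
static uint32_t demo_la_output_dim(uint32_t input_dim)
{
    return demo_align_up(input_dim, 64) >> 4;
}
```

Under these assumptions, a 1920x1080 input gives an input stride of 2048 and an output size of 120x68.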
Custom Lookahead Parameters
In addition to the standard properties, the following XmaParameter
custom parameters are supported by the lookahead plugin:
- “ip”
Intra period for the video stream.
- “lookahead_depth”
Lookahead depth for the module. Values range from 0 to 20.
- “enable_hw_in_buf”
This parameter indicates whether the input buffer needs to be copied from the host or is already present on the device. Set it to 1 if the YUV frame is already in device memory.
- “spatial_aq_mode”
Enable/Disable spatial aq mode.
- “temporal_aq_mode”
Enable/Disable temporal aq mode.
- “rate_control_mode”
Enable/Disable custom rate control mode.
- “spatial_aq_gain”
Spatial aq gain. Values range from 0 to 100; the default is 50.
- “num_b_frames”
Number of B frames in a sub GOP. Values range from 0 to the maximum integer value.
- “codec_type”
Set to 0 for the H264 encoder and to 1 for the HEVC encoder.
- “latency_logging”
Set to 1 to enable logging of latency information to syslog. Set to 0 to disable logging.
- “dynamic_gop”
This parameter enables content adaptive B frame insertion.
XVBM API Reference¶
The Xilinx Video Buffer Management (XVBM) library is used by the Xilinx Video SDK plugins to manage pools of video buffers. The XVBM API must be used to interact with the XVBM buffers associated with XmaFrame
frames of the XMA_DEVICE_BUFFER_TYPE
type.
The XMA_DEVICE_BUFFER_TYPE
frames and their XVBM buffers can either be directly passed to other hardware accelerators without being copied back to the host (zero-copy operation in a multistage pipeline) or copied back to the host for further processing in software.
If the application needs to access the content of a XVBM buffer, it must do so using the XVBM xvbm_buffer_get_host_ptr()
and xvbm_buffer_read()
APIs.
If a XVBM buffer is transferred to more than one other XMA plugin session, the xvbm_buffer_refcnt_inc()
API should be used to split the buffer instead of explicitly creating copies of that buffer. For an example of this, refer to the xlnx_tran_xvbm_buf_inc()
function in transcoder/lib/src/xlnx_transcoder.c#L45.
If a XVBM buffer is not transferred to another plugin, then the application must release the buffer with the xvbm_buffer_pool_entry_free()
API. This releases the buffer back to the plugin, allowing the plugin to reuse the buffer for a subsequent frame. Typically, if all the buffers managed by a plugin are used (not freed), then the plugin won’t be able to accept new data.
XVBM buffers are directly managed by the Xilinx Video SDK plugins. The user application can read and release XVBM buffers, but it should not create or destroy XVBM buffers.
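The sharing and release rules above can be illustrated with a toy reference-counted pool. This sketch does not use the XVBM library: every type and function in it is a hypothetical stand-in, and it assumes that a free call drops one reference and that the buffer returns to the pool only when the last reference is dropped.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_POOL_SIZE 2

typedef struct { int refcnt; } DemoEntry;   /* refcnt == 0 means free */
typedef struct { DemoEntry entries[DEMO_POOL_SIZE]; } DemoPool;

/* Plugin side: take a free entry, or NULL if every buffer is still held. */
static DemoEntry *demo_pool_acquire(DemoPool *pool)
{
    assert(pool != NULL);
    for (size_t i = 0; i < DEMO_POOL_SIZE; i++) {
        if (pool->entries[i].refcnt == 0) {
            pool->entries[i].refcnt = 1;
            return &pool->entries[i];
        }
    }
    return NULL;
}

/* Application side: share one buffer with an additional consumer. */
static void demo_refcnt_inc(DemoEntry *entry)
{
    assert(entry != NULL && entry->refcnt > 0);
    entry->refcnt++;
}

/* Application side: drop one reference; the entry becomes reusable
 * once the count reaches zero. */
static void demo_entry_free(DemoEntry *entry)
{
    assert(entry != NULL && entry->refcnt > 0);
    entry->refcnt--;
}
```

When demo_pool_acquire() returns NULL, the toy pool is in the situation described above: all buffers are held, so no new data can be accepted until one is freed.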
Functions for reading and writing buffers¶
-
int32_t xvbm_buffer_write(XvbmBufferHandle b_handle, const void *src, size_t size, size_t offset)¶
Write a buffer to device memory. Returns 0 on success.
-
int32_t xvbm_buffer_read(XvbmBufferHandle b_handle, void *dst, size_t size, size_t offset)¶
Read a buffer from device memory. Returns 0 on success.
XRM API Reference¶
Note
Version 2.0 of the Xilinx Video SDK uses XRM 2021.2 which provides new resource management APIs. These new APIs are identified with the “V2” suffix. They provide more flexible resource allocation capabilities and Xilinx recommends using these “V2” APIs for any new development and integration work. The original XRM APIs are still supported. Applications developed using the original APIs do not require any modification.
The Xilinx® FPGA Resource Manager (XRM) library is used to manage the hardware accelerators available in the system. XRM keeps track of total system capacity for each of the compute units such as the decoder, scaler, and encoder.
The XRM library includes a daemon, a command line tool and a C application programming interface (API). Using the library API, external applications can communicate with the XRM daemon and perform actions such as reserving, allocating and releasing resources; calculating resource load and max capacity.
More details on the XRM command line tool (xrmadm) and the XRM daemon (xrmd) can be found in the XRM Reference Guide section of the documentation.
The XRM C APIs are defined in xrm.h
. The detailed description of these APIs can be found in the XRM documentation. The XRM APIs listed below are the most commonly used to manage video acceleration resources:
Applications integrating the Xilinx Video SDK plugins (such as the 4 example XMA Apps included in this repository) use the XRM APIs for two kinds of tasks:
Resource reservation (optional)
Resource allocation (required)
Resource Reservation with XRM¶
The XRM library can optionally be used to identify a device with enough available resources to run the desired job and reserve the corresponding resources. Doing so involves three steps:
Calculate the channel load based on the job properties
Using the
xrmCheckCuPoolAvailableNumV2()
function, query XRM for the number of resources available based on the channel load. XRM checks all the devices available on the server and returns how many such channels can be accommodated.
Using the
xrmCuPoolReserveV2()
function, reserve the resources required for this channel load. XRM returns a reservation index.
For an example of how to perform resource reservation using XRM APIs, refer to the source code of the job slot reservation application. The job slot reservation tool reserves the maximum number of slots for a given job. This code can be adapted to reserve a single slot on one device with enough resources for the job of interest.
Once resources have been reserved, it is also possible to use the xrmReservationQueryV2()
API to obtain the ID of the device on which the resource has been allocated and the name of the xclbin. The device ID and xclbin information can then be used to initialize the XMA session.
Resources reserved with xrmCuPoolReserveV2()
must be relinquished with the xrmCuPoolRelinquishV2()
function once the application no longer needs them.
Resource Allocation with XRM¶
In order to create an XMA plugin session (encoder/decoder/scaler/lookahead), the necessary compute unit (CU) resources must first be successfully allocated with XRM using the xrmCuAllocV2()
function (or xrmCuListAllocV2()
to allocate multiple CUs at once).
Resources allocated with xrmCuAllocV2()
(or xrmCuListAllocV2()
) must be released with the xrmCuReleaseV2()
function (or xrmCuListReleaseV2()
) once the application no longer needs them.
The resource allocation procedure is different depending on whether resources were previously reserved or not.
Allocation of Pre-Reserved Resources¶
If resources were previously reserved using the xrmCuPoolReserveV2()
function, the application should perform CU allocation using the device ID and the reservation ID obtained during the resource reservation process. In this case, CU allocation will not fail, as the necessary resources have already been reserved.
Create a
xrmCuPropertyV2
data structure
Assign the reservation ID to the
poolId
field of the xrmCuPropertyV2
data structure
If resources were reserved across multiple devices, assign the device ID of these specific resources to the
deviceInfo
field of the xrmCuPropertyV2
data structure
Allocate the resources using the
xrmCuAllocV2()
function
Allocation of Non-Reserved Resources¶
If resources were not previously reserved using the xrmCuPoolReserveV2()
function, the application should first calculate the load of the current job and then attempt CU allocation for that particular load in a user-specified device. CU allocation will fail if there are not enough resources to support the specific channel load on that device.
Calculate the channel load based on the job properties
Create a
xrmCuPropertyV2
data structure
Assign the resource load to the
requestLoad
field of the xrmCuPropertyV2
data structure
Assign the user-specified device ID to the
deviceInfo
field of the xrmCuPropertyV2
data structure
Allocate the resources using the
xrmCuAllocV2()
function
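The difference between the two allocation paths comes down to which fields of the property structure are filled in before the allocation call. The sketch below models only the three fields named in the steps above, using a hypothetical stand-in structure instead of the declaration in xrm.h. How XRM encodes the load and the device ID into these fields is not modeled, and neither is any other field a real request may need.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the XRM CU property structure declared in xrm.h.
 * Only the three fields named in the steps above are modeled. */
typedef struct {
    uint64_t deviceInfo;
    int32_t  requestLoad;
    uint64_t poolId;
} DemoCuProperty;

/* Pre-reserved path: identify the reservation and, optionally, the device. */
static void demo_fill_reserved(DemoCuProperty *prop, uint64_t reservation_id,
                               const uint64_t *device_info)
{
    assert(prop != NULL);
    memset(prop, 0, sizeof(*prop));
    prop->poolId = reservation_id;
    if (device_info != NULL)   /* only when the reservation spans devices */
        prop->deviceInfo = *device_info;
}

/* Non-reserved path: request a computed load on a user-chosen device. */
static void demo_fill_unreserved(DemoCuProperty *prop, int32_t request_load,
                                 uint64_t device_info)
{
    assert(prop != NULL);
    memset(prop, 0, sizeof(*prop));
    prop->requestLoad = request_load;
    prop->deviceInfo = device_info;
}
```

In both cases the filled structure would then be passed to the allocation function, and the allocated resources released once the session has been destroyed.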
For a detailed example of how to allocate non-reserved resources, refer to the two following functions from the XMA sample applications:
xlnx_xrm_load_calc()
function in common/src/xlnx_xrm_utils.c#L108
xlnx_xrm_cu_alloc()
function in common/src/xlnx_xrm_utils.c#L240
The xlnx_xrm_load_calc()
function calculates the resource load for the given job, and the xlnx_xrm_cu_alloc()
function allocates the necessary resources in a specific device to support the given load.