Vitis Drivers API Documentation
axicdma Documentation

This is the driver API for the AXI CDMA engine. For a full description of the features of the AXI CDMA engine, please refer to the hardware specification. This driver supports the following features:

  • Simple DMA transfer
  • Scatter gather (SG) DMA transfer
  • Interrupt for error or completion of transfers
  • For SG DMA transfer:
    • Programmable interrupt coalescing
    • Programmable delay timer counter
    • Managing the buffer descriptors (BDs)

Two Hardware Building Modes

The hardware can be built in two modes:

  • Simple-only mode: the hardware supports only simple transfers. The functionality is similar to the XPS Central DMA; however, the driver API for the transfer is slightly different.
  • Hybrid mode: the hardware supports both simple and SG transfers. However, only one kind of transfer can be active at a time. If an SG transfer is ongoing in the hardware, submitting a simple transfer fails. If a simple transfer is ongoing in the hardware, submitting an SG transfer succeeds, but the SG transfer does not start until the simple transfer is done.
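
The hybrid-mode submission rules can be summarized as a small decision function. This is a minimal sketch that models the behavior described above, not driver code: every type and function name here is hypothetical. The two cases where the new request is of the same kind as the ongoing transfer are assumptions, inferred from the one-at-a-time rule for simple transfers and the queuing behavior of SG transfers.

```c
/* Hypothetical model of the hybrid-mode rules; none of these names
 * belong to the driver API. */
typedef enum { ENGINE_IDLE, ENGINE_SIMPLE_BUSY, ENGINE_SG_BUSY } EngineState;
typedef enum { REQ_SIMPLE, REQ_SG } RequestKind;
typedef enum { SUBMIT_STARTS, SUBMIT_QUEUED, SUBMIT_FAILS } SubmitOutcome;

SubmitOutcome hybrid_submit_outcome(EngineState State, RequestKind Req)
{
    if (State == ENGINE_IDLE) {
        return SUBMIT_STARTS;   /* nothing ongoing: the transfer starts */
    }
    if (Req == REQ_SIMPLE) {
        /* Simple submission while SG is ongoing fails (stated above).
         * Simple submission while another simple transfer is ongoing is
         * assumed to fail too, since only one can be submitted at a time. */
        return SUBMIT_FAILS;
    }
    /* SG submission while a simple transfer is ongoing is accepted but
     * deferred (stated above). SG while SG is ongoing is assumed to be
     * queued behind the BDs already submitted. */
    return SUBMIT_QUEUED;
}
```

For example, `hybrid_submit_outcome(ENGINE_SG_BUSY, REQ_SIMPLE)` yields `SUBMIT_FAILS`, so an application that mixes both kinds of transfers needs to wait for SG work to drain before issuing a simple transfer.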

Transactions

The hardware supports two types of transfers: the simple DMA transfer and the scatter gather (SG) DMA transfer.

A simple DMA transfer needs only a source buffer address, a destination buffer address, and a transfer length. Only one transfer can be submitted to the hardware at a time.

An SG DMA transfer requires setting up a buffer descriptor (BD), which holds the transfer information, including the source buffer address, the destination buffer address, and the transfer length. The hardware updates the BD with the completion status of the transfer. BDs that are connected to each other can be submitted to the hardware at once; therefore, the SG DMA transfer performs better when the application does multiple transfers each time.

Callback Function

Each transfer whose completion the application cares about should provide the driver with a callback function. The signature of the callback function is as follows:

void XAxiCdma_CallBackFn(void *CallBackRef, u32 IrqMask, int *NumPtr);

Here, CallBackRef is a reference pointer that the application passes to the driver along with the callback function. The driver passes IrqMask to the application when it calls the callback. NumPtr is used only in SG mode, to track how many BDs are still left for this callback function.
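
A callback matching this signature might look like the sketch below. It compiles stand-alone, so `u32` and the mask constants are local stand-ins: the mask values are assumed to mirror the completion, delay, and error bits in xaxicdma_hw.h and should be checked against that header. `TransferStatus` and `ExampleCallBack` are hypothetical names.

```c
#include <stdint.h>

typedef uint32_t u32;               /* stand-in for the Xilinx u32 type */

/* Stand-in mask values, assumed to match xaxicdma_hw.h; verify before use. */
#define EXAMPLE_IRQ_IOC_MASK   0x00001000u  /* transfer completed */
#define EXAMPLE_IRQ_DELAY_MASK 0x00002000u  /* delay timer expired */
#define EXAMPLE_IRQ_ERROR_MASK 0x00004000u  /* transfer error */

/* Hypothetical application context passed as CallBackRef. */
typedef struct {
    volatile int Done;
    volatile int Error;
} TransferStatus;

void ExampleCallBack(void *CallBackRef, u32 IrqMask, int *NumPtr)
{
    TransferStatus *Status = (TransferStatus *)CallBackRef;

    (void)NumPtr;                   /* only meaningful in SG mode */

    if (IrqMask & EXAMPLE_IRQ_ERROR_MASK) {
        Status->Error = 1;          /* record the error for the main loop */
    }
    if (IrqMask & EXAMPLE_IRQ_IOC_MASK) {
        Status->Done = 1;           /* record successful completion */
    }
}
```

The callback only records state and returns, which keeps the time spent in interrupt context short; the application can poll `Done` and `Error` from its main loop.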

The callback function is set upon transfer submission:

  • Simple transfer callback function setup:

    Only set the callback function if in interrupt mode.

    For simple transfers, the callback function along with the callback reference pointer is passed to the driver through the submission of the simple transfer:

     XAxiCdma_SimpleTransfer(...)
    
  • SG transfer callback function setup:

    For SG transfers, the callback function and the callback reference pointer are set through the transfer submission call:

     XAxiCdma_BdRingToHw(...)
    

Simple Transfers

Simple DMA transfer is sufficient for an application that does only one DMA transfer at a time and has exclusive use of the DMA engine.

Compared to SG DMA transfer, simple DMA transfer is easier to use. For an individual DMA transfer, simple DMA transfer is also faster because the software and hardware paths are simpler.

Scatter Gather (SG) Transfers

For an application that sometimes has multiple DMA transfers pending, or when the DMA engine is shared by multiple applications, SG DMA transfer yields better overall performance.

SG DMA transfer queues multiple transfers, so the hardware can work continuously on all submitted transfers without software intervention.

The downside of SG DMA transfer is that you have to manage the memory for the buffer descriptors (BDs) and set up BDs for the transfers.

Interrupts

The driver handles the interrupts.

When a transfer that has an associated callback function completes, the driver calls the callback function. The IrqMask passed to the callback function notifies the application about the completion status of the transfer.

Interrupt Coalescing for SG Transfers

For SG transfers, the application can program the interrupt coalescing threshold to reduce the frequency of interrupts. If the number of transfers is not a multiple of the interrupt coalescing threshold, the completion of the last transfer does not trigger the completion interrupt. However, after the specified delay count time, the delay interrupt fires.

By default, the interrupt threshold for the hardware is one, which is one interrupt per BD completion.
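
The effect of the threshold can be estimated with integer division. The sketch below is a simplified model, assuming the coalescing counter restarts after every completion interrupt and no delay interrupt fires in the middle of the batch; `CoalesceResult` and `coalesce_estimate` are hypothetical names, not driver functions.

```c
#include <assert.h>

typedef struct {
    unsigned CompletionIrqs;  /* completion interrupts raised by the threshold */
    unsigned LeftoverBds;     /* completed BDs that wait for the delay interrupt */
} CoalesceResult;

/* Estimate interrupt counts for NumBds completed BDs and a given threshold. */
CoalesceResult coalesce_estimate(unsigned NumBds, unsigned Threshold)
{
    CoalesceResult Result;

    assert(Threshold >= 1);   /* a threshold of zero is not meaningful */

    Result.CompletionIrqs = NumBds / Threshold;
    Result.LeftoverBds = NumBds % Threshold;
    return Result;
}
```

With 10 BDs and a threshold of 4, this model gives 2 completion interrupts and 2 leftover BDs, so the application would rely on the delay interrupt to learn about the last 2 completions.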

Delay Interrupt for SG Transfers

The delay interrupt signals the application about inactivity of transfers. If the delay interrupt is enabled, the delay timer starts counting down once a transfer has started. If the interval between transfers is longer than the delay counter, the delay interrupt fires.

By default, the delay counter is zero, which means the delay interrupt is disabled. To enable the delay interrupt, the delay interrupt enable bit must be set and the delay counter must be set to a value between 1 and 255.
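
Both conditions must hold for the delay interrupt to be effective, which is easy to get wrong when only one of them is configured. A minimal sketch of that check, using a hypothetical helper name:

```c
#include <stdbool.h>

/* Returns true only when the enable bit is set AND the counter is in 1..255.
 * Hypothetical helper; it does not touch any hardware register. */
bool delay_irq_effective(bool DelayIrqEnabled, unsigned DelayCount)
{
    return DelayIrqEnabled && DelayCount >= 1u && DelayCount <= 255u;
}
```

A counter of 0 with the enable bit set, or a nonzero counter with the enable bit clear, both leave the delay interrupt inactive.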

BD Management for SG DMA Transfers

BDs are shared by the software and the hardware. To use BDs for SG DMA transfers, the application needs to use the driver API to do the following:

  • Set up the BD ring:

    • XAxiCdma_BdRingCreate(...)

    Note that the memory for the BD ring is allocated and is later de-allocated by the application.

  • Request BDs from the BD ring; more than one BD can be requested at once:
    • XAxiCdma_BdRingAlloc(...)
  • Prepare BDs for the transfer, one BD at a time:
    • XAxiCdma_BdSetSrcBufAddr(...)
    • XAxiCdma_BdSetDstBufAddr(...)
    • XAxiCdma_BdSetLength(...)
  • Submit all prepared BDs to the hardware:
    • XAxiCdma_BdRingToHw(...)
  • Upon transfer completion, the application can request completed BDs from the hardware:
    • XAxiCdma_BdRingFromHw(...)
  • After the application has finished using the BDs, it should free the BDs back to the free pool:
    • XAxiCdma_BdRingFree(...)

The driver also provides API functions to get the status of a completed BD, along with get functions for other fields in the BD.

The following two diagrams show the correct flow of BDs:

The first diagram shows a complete cycle for BDs, starting from requesting the BDs to freeing the BDs.

         XAxiCdma_BdRingAlloc()                   XAxiCdma_BdRingToHw()
 Free ------------------------> Pre-process ----------------------> Hardware
                                                                    |
  /|\                                                               |
   |   XAxiCdma_BdRingFree()                XAxiCdma_BdRingFromHw() |
   +--------------------------- Post-process <----------------------+
 

The second diagram shows the case where a DMA transfer is cancelled before it is enqueued to the hardware: the application can return the requested BDs to the free group using XAxiCdma_BdRingUnAlloc().

         XAxiCdma_BdRingUnAlloc()
   Free <----------------------- Pre-process
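
The two diagrams together define a small state machine for each BD. The sketch below encodes those transitions so that an out-of-order call can be detected in a test or simulation. It is a model only: the state and operation names are hypothetical, and each operation stands for the driver call shown in the comment.

```c
#include <stdbool.h>

/* BD groups from the diagrams above. */
typedef enum { BD_FREE, BD_PRE_PROCESS, BD_HARDWARE, BD_POST_PROCESS } BdState;

/* One operation per driver call in the diagrams. */
typedef enum {
    OP_ALLOC,    /* XAxiCdma_BdRingAlloc()   */
    OP_UNALLOC,  /* XAxiCdma_BdRingUnAlloc() */
    OP_TO_HW,    /* XAxiCdma_BdRingToHw()    */
    OP_FROM_HW,  /* XAxiCdma_BdRingFromHw()  */
    OP_FREE      /* XAxiCdma_BdRingFree()    */
} BdOp;

/* Applies Op if it is legal in the current state.
 * Returns true and updates *State on success; leaves *State unchanged
 * and returns false for a transition the diagrams do not allow. */
bool bd_apply(BdState *State, BdOp Op)
{
    BdState From, To;

    switch (Op) {
    case OP_ALLOC:   From = BD_FREE;         To = BD_PRE_PROCESS;  break;
    case OP_UNALLOC: From = BD_PRE_PROCESS;  To = BD_FREE;         break;
    case OP_TO_HW:   From = BD_PRE_PROCESS;  To = BD_HARDWARE;     break;
    case OP_FROM_HW: From = BD_HARDWARE;     To = BD_POST_PROCESS; break;
    case OP_FREE:    From = BD_POST_PROCESS; To = BD_FREE;         break;
    default:         return false;
    }

    if (*State != From) {
        return false;
    }
    *State = To;
    return true;
}
```

Starting from `BD_FREE`, the sequence alloc, to-hardware, from-hardware, free returns the BD to `BD_FREE`, while submitting a BD to the hardware without allocating it first is rejected.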
 

Physical/Virtual Addresses

Addresses for the transfer buffers are physical addresses.

For SG transfers, the next BD pointer in a BD is also a physical address.

However, the application refers to BDs and to the transfer buffers through virtual addresses.

The application is responsible for translating the virtual addresses of its transfer buffers to physical addresses before handing them to the driver.

On systems where the MMU is not used, or where the MMU uses a direct mapping, the physical address and the virtual address are the same.

Cache Coherency

To prevent cache and memory inconsistency:

  • Flush the transmit buffer range before the transfer
  • Invalidate the receive buffer range before passing it to the hardware and before passing it to the application

For SG transfers:

  • Flush the BDs once the preparation setup is done
  • Invalidate the memory region for BDs when BDs are retrieved from the hardware.
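
Cache maintenance operates on whole cache lines, so a buffer that does not start and end on a line boundary shares lines with neighboring data. The sketch below computes the line-aligned range that a flush or invalidate of a buffer actually covers, which helps when sizing and placing DMA buffers. The helper and struct names are hypothetical, and the line size is a parameter because it depends on the processor; the resulting range would then be handed to the platform's range-based cache routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t Start;   /* address rounded down to a cache-line boundary */
    size_t Length;     /* length rounded up to whole cache lines */
} CacheRange;

/* Expand [Addr, Addr + Length) to whole cache lines.
 * LineSize must be a nonzero power of two. */
CacheRange cache_align_range(uintptr_t Addr, size_t Length, size_t LineSize)
{
    CacheRange Range;
    uintptr_t Mask, End;

    assert(LineSize != 0 && (LineSize & (LineSize - 1)) == 0);

    Mask = (uintptr_t)LineSize - 1;
    Range.Start = Addr & ~Mask;               /* round start down */
    End = (Addr + Length + Mask) & ~Mask;     /* round end up */
    Range.Length = (size_t)(End - Range.Start);
    return Range;
}
```

If the aligned range is larger than the buffer itself, other data lives in the same cache lines, and allocating the buffer with cache-line alignment and padding avoids surprises when the range is invalidated.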

BD Alignment for SG Transfers

The hardware requires a minimum alignment for the BDs, XAXICDMA_BD_MINIMUM_ALIGNMENT. An alignment larger than the required minimum is acceptable, but it must be a multiple of the minimum alignment. The alignment is passed into XAxiCdma_BdRingCreate().
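
The alignment rule can be checked before the ring is created. In this sketch, `EXAMPLE_BD_MINIMUM_ALIGNMENT` is a local stand-in for XAXICDMA_BD_MINIMUM_ALIGNMENT with an assumed value of 0x40; use the constant from the driver headers in real code. The check that the ring base address is itself aligned is also an assumption about what the ring memory needs, and `bd_alignment_ok` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for XAXICDMA_BD_MINIMUM_ALIGNMENT; the value is an assumption. */
#define EXAMPLE_BD_MINIMUM_ALIGNMENT 0x40u

/* Returns true if Alignment is a nonzero multiple of the minimum alignment
 * and RingBase is aligned to it. */
bool bd_alignment_ok(uintptr_t RingBase, uint32_t Alignment)
{
    if (Alignment < EXAMPLE_BD_MINIMUM_ALIGNMENT) {
        return false;                        /* below the hardware minimum */
    }
    if ((Alignment % EXAMPLE_BD_MINIMUM_ALIGNMENT) != 0u) {
        return false;                        /* not a multiple of the minimum */
    }
    return (RingBase % Alignment) == 0u;     /* ring memory must be aligned too */
}
```

Under the assumed minimum, an alignment of 0x80 is accepted, while 0x60 is rejected because it is not a multiple of 0x40.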

Error Handling

The hardware halts on any error condition. The driver resets the hardware when an error occurs.

The IrqMask argument in the callback function notifies the application about error conditions for the transfer.

Mutual Exclusion

The driver does not provide mutual exclusion mechanisms; it is up to the upper layer to handle this.

Hardware Defaults & Exclusive Use

The hardware is in the following condition on start or after a reset:

  • All interrupts are disabled.
  • The engine is in simple mode.
  • Interrupt coalescing counter is one.
  • Delay counter is 0.

The driver has exclusive use of the hardware registers and BDs. Accessing the hardware registers or the BDs should always go through the driver API functions.

Hardware Features That Users Should Be Aware Of

For performance reasons, the driver does not check transfer submissions at run time. It is the user's responsibility to submit appropriate transfers to the hardware. Consider the following hardware features when submitting a transfer:

  • Whether the hardware supports unaligned transfers, as reflected by C_INCLUDE_DRE in the system.mhs file. Submitting unaligned transfers when the hardware does not support them causes errors upon transfer submission. Alignment is with respect to the word length, which is defined by the build parameter XPAR_AXI_CDMA_0_M_AXI_DATA_WIDTH.

  • Memory range of the transfer addresses. Transferring data to executable memory can crash the system.

  • Lite mode. To save hardware resources (drastically), you may select the "lite" mode build of the hardware. However, in lite mode, the following features are not supported:

    • Cross-page-boundary transfers. Each transfer must lie strictly inside one page; otherwise, a slave error occurs.
    • Unaligned transfers.
    • Data widths larger than 64 bits.
    • Transfers longer than data_width * burst_len, which is the maximum length of each transfer.
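
Because the driver does not validate submissions, an application may want its own pre-submission check. The sketch below covers the lite-mode restrictions listed above. Several points are assumptions to confirm against the hardware specification: the page size (4096 bytes here), the reading of data_width * burst_len as data width in bytes times burst length, and the choice to test only the source and destination addresses for word alignment. All names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SIZE 4096u   /* assumed page size; confirm in the HW spec */

/* True if both addresses are multiples of the word length in bytes. */
bool addrs_word_aligned(uintptr_t Src, uintptr_t Dst, unsigned DataWidthBits)
{
    uintptr_t WordBytes = DataWidthBits / 8u;

    return WordBytes != 0u && (Src % WordBytes) == 0u && (Dst % WordBytes) == 0u;
}

/* Maximum lite-mode transfer length, assuming data width is taken in bytes. */
size_t lite_max_length(unsigned DataWidthBits, unsigned BurstLen)
{
    return (size_t)(DataWidthBits / 8u) * BurstLen;
}

/* True if [Addr, Addr + Length) stays inside one page. */
bool within_one_page(uintptr_t Addr, size_t Length)
{
    if (Length == 0u) {
        return true;
    }
    return (Addr / EXAMPLE_PAGE_SIZE) == ((Addr + Length - 1u) / EXAMPLE_PAGE_SIZE);
}

/* Combined lite-mode check for one transfer. */
bool lite_transfer_ok(uintptr_t Src, uintptr_t Dst, size_t Length,
                      unsigned DataWidthBits, unsigned BurstLen)
{
    return DataWidthBits <= 64u
        && addrs_word_aligned(Src, Dst, DataWidthBits)
        && Length <= lite_max_length(DataWidthBits, BurstLen)
        && within_one_page(Src, Length)
        && within_one_page(Dst, Length);
}
```

For a 32-bit data width and a burst length of 16, this model allows at most 64 bytes per transfer, so a larger copy has to be split into several submissions.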
 MODIFICATION HISTORY:

  . Updated the debug print on type casting to avoid warnings on u32. Cast
    u32 to (unsigned int) to use the %x format.

 Ver   Who  Date     Changes
 ----- ---- -------- ------------------------------------------------------
 1.00a jz   07/08/10 First release
 2.01a rkv  01/25/11 Added TCL script to generate Test App code for
                     peripheral tests.
                     Replaced with "\r\n" in place on "\n\r" in printf
                     statements.
                     Made some minor modifications for Doxygen
 2.02a srt  01/18/13 Added support for Key Hole feature (CR: 687217).
                     Updated DDR base address for IPI designs (CR 703656).
 2.03a srt  04/13/13 Removed Warnings (CR 705006).
                     Added logic to check if DDR is present in the test app
                     tcl file. (CR 700806)
 3.0   adk  19/12/13 Updated as per the New Tcl API's
 4.0   adk  27/07/15 Added support for 64-bit Addressing.
 4.1   sk   11/10/15 Used UINTPTR instead of u32 for Baseaddress CR# 867425.
                     Changed the prototype of XAxiCdma_CfgInitialize API.
 4.3   mi   09/21/16 Fixed compilation warnings.
       ms   01/22/17 Modified xil_printf statement in main function for all
                     examples to ensure that "Successfully ran" and "Failed"
                     strings are available in all examples. This is a fix
                     for CR-965028.
       ms   03/17/17 Added readme.txt file in examples folder for doxygen
                     generation.
       ms   04/05/17 Modified Comment lines in functions of axicdma
                     examples to recognize it as documentation block
                     for doxygen generation of examples.
 4.10  sa   08/12/22 Updated the examples to use latest MIG canonical define
                     i.e XPAR_MIG_0_C0_DDR4_MEMORY_MAP_BASEADDR.