Use Cases
This document outlines the operational use cases for the MDB5-DMA client driver, describing data transfer flows and architectural patterns. The MDB5-DMA Controller supports up to 8 read channels and 8 write channels per controller, with transfer sizes ranging from 1 byte to 4GB.
Host-to-Card (H2C)
Host-to-Card (H2C) transfers move data from the host system’s memory to the PCIe device’s memory. This is the primary method for sending configuration data, commands, or data payloads to the device.
Scatter-Gather (Linked-list) DMA
Scatter-Gather (Linked-list) DMA mode utilizes linked-list operation to handle fragmented buffers efficiently. The driver creates scatter-gather lists from user buffers and programs multiple descriptors as needed.
Scatter-Gather (Linked-list) DMA Data Flow
Application opens the H2C channel device node (e.g., /dev/mdb5_write00)
Application issues a write() system call with the user-space buffer
MDB5-DMA client driver receives the buffer and validates parameters
Driver pins user-space memory pages and creates Scatter-Gather (Linked-list) lists
DMA descriptors are prepared and submitted to the underlying dw-edma controller via the DMA Engine API
H2C MM Engine reads data from host memory locations and writes to device memory through the PCIe interface
Upon transfer completion, the DMA engine updates completion status and generates an interrupt if required
Driver processes completion status and returns the number of bytes transferred to the application
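From the application side, the flow above reduces to an open() followed by write(). The sketch below is a minimal illustration rather than the driver's reference client: h2c_write_all is a hypothetical helper name, and the loop assumes the driver may return fewer bytes than requested, which this document does not state either way.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write `len` bytes from `buf` to `fd`, retrying on short writes and EINTR.
 * Returns the number of bytes written, or -1 with errno set on error.
 * (Hypothetical helper; short writes from the driver are an assumption.)
 */
ssize_t h2c_write_all(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    size_t done = 0;

    assert(buf != NULL || len == 0);

    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved: retry */
            return -1;
        }
        if (n == 0)
            break;              /* no progress: report what was written */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

A caller would open /dev/mdb5_write00 with O_WRONLY, pass the descriptor to h2c_write_all(), and close it once the call returns.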
Buffer Segmentation
When a user buffer is submitted for a scatter-gather transfer, the MDB5-DMA client driver breaks it down into a scatter-gather list for efficient DMA processing. The driver’s segmentation process follows these steps:
The driver pins user-space memory pages using get_user_pages_fast() to prevent the operating system from moving or swapping them during the transfer
This driver implementation uses PAGE_SIZE as the standard size for each scatter-gather entry (typically 4KB on x86 systems, though this varies by architecture)
While PAGE_SIZE alignment is not a hardware requirement and entry sizes can be configured differently if needed, this driver follows the PAGE_SIZE pattern
Entry sizes vary based on buffer alignment relative to page boundaries
The first entry may be partial if the buffer starts mid-page
Middle entries typically span complete pages when properly aligned
The final entry contains any remaining data that doesn’t fill a complete page
Example: A 10KB (10,240-byte) buffer starting at a 100-byte offset within a page would create:
Entry 0: 3,996 bytes (from offset 100 to end of first page)
Entry 1: 4,096 bytes (complete second page)
Entry 2: 2,148 bytes (remaining data on third page)
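The page-boundary arithmetic behind this segmentation can be sketched in a few lines of user-space C. This is an illustration of the rule described above, not the driver's code; sg_entry_lengths and its parameters are hypothetical names. Called with offset 100, length 10,240 and a 4,096-byte page, it reports three entries of 3,996, 4,096 and 2,148 bytes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Split a buffer that starts `offset` bytes into a page and is `len` bytes
 * long into page-bounded entry lengths. Stores at most `max` lengths in
 * `out` and returns the number of entries the buffer needs.
 */
size_t sg_entry_lengths(size_t offset, size_t len, size_t page_size,
                        size_t *out, size_t max)
{
    size_t count = 0;

    assert(page_size > 0 && offset < page_size);

    while (len > 0) {
        size_t room = page_size - offset;       /* bytes left in this page */
        size_t chunk = len < room ? len : room;

        if (count < max)
            out[count] = chunk;
        count++;
        len -= chunk;
        offset = 0;                             /* later pages start at 0 */
    }
    return count;
}
```

Only the first entry depends on the starting offset; every later entry is either a full page or the final remainder.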
Configuration
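Channels default to Scatter-Gather (Linked-list) Mode, so no mode change is required for this use case. The aperture size can also be configured through the control device interface to optimize performance for specific transfer patterns. To select the mode explicitly, use the same interface: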
struct ctrl_mode mode = {
    .name = "/dev/mdb5_write00",
    .mode = MDB5_MODE_SG
};
ioctl(ctrl_fd, IOCTL_MDB5_SET_TRANSFER_MODE, &mode);
Simple (Non Linked-list) Mode
In Simple (Non Linked-list) Mode, the MDB5-DMA driver performs direct single-buffer transfers without building a descriptor list. This mode is optimal for a single large chunk because it avoids the overhead of setting up the linked-list pointer (LLP).
The channel must be configured for Simple (Non Linked-list) Mode using the control device interface before performing transfers.
struct ctrl_mode mode = {
    .name = "/dev/mdb5_write00",
    .mode = MDB5_MODE_SIMPLE
};
ioctl(ctrl_fd, IOCTL_MDB5_SET_TRANSFER_MODE, &mode);
Simple (Non Linked-list) Mode Data Flow
Application opens the H2C channel device node
Channel is configured for Simple (Non Linked-list) Mode operation
Application issues a write() system call with a contiguous buffer
Driver maps the buffer directly without Scatter-Gather (Linked-list) list creation
Single DMA descriptor is prepared and submitted
H2C engine performs direct transfer from host to device memory
Completion notification is provided upon transfer completion
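One way an application might prepare a single buffer for a Simple-mode write() is to allocate it page-aligned with posix_memalign(). This is a sketch under assumptions: alloc_page_aligned is a hypothetical helper, and page alignment is a choice made here, not a requirement stated in this document. Such a buffer is contiguous in virtual address space only; how the driver obtains a single DMA-addressable region from it is not covered here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Allocate a zeroed, page-aligned buffer of at least `len` bytes, rounded
 * up to a whole number of pages. Stores the rounded size in `*alloc_len`
 * when it is non-NULL. Returns NULL on failure; release with free().
 */
void *alloc_page_aligned(size_t len, size_t *alloc_len)
{
    long ps = sysconf(_SC_PAGESIZE);
    void *buf = NULL;
    size_t page, rounded;

    if (ps <= 0 || len == 0)
        return NULL;
    page = (size_t)ps;
    if (len > SIZE_MAX - (page - 1))
        return NULL;                        /* rounding would overflow */
    rounded = (len + page - 1) / page * page;

    if (posix_memalign(&buf, page, rounded) != 0)
        return NULL;
    assert((uintptr_t)buf % page == 0);     /* guaranteed by posix_memalign */

    memset(buf, 0, rounded);
    if (alloc_len != NULL)
        *alloc_len = rounded;
    return buf;
}
```

The returned pointer can be filled with payload data and passed to a single write() call on the channel device node.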
ASYNC IO
Asynchronous H2C operations enable non-blocking transfers using the Linux asynchronous I/O (AIO) interfaces.
Async Data Flow
AIO context is established using io_setup()
I/O Control Blocks (iocb) are prepared for each transfer operation using io_prep_pwrite()
Transfers are submitted using io_submit(), returning immediately to allow continued application execution
Driver and DMA hardware process transfers asynchronously in the background
Application uses io_getevents() to check for completed operations and retrieve results
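The sequence above can be sketched with the raw Linux AIO system calls, so the example builds without linking against libaio. The control block is filled by hand, which is what the io_prep_pwrite() helper does in libaio. aio_write_once is a hypothetical name, and the meaning of the offset argument for an H2C channel is an assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Open `path`, queue one asynchronous write of `len` bytes at `offset`,
 * and wait for it to complete. Returns the number of bytes written, or a
 * negative value on failure.
 */
long long aio_write_once(const char *path, const void *buf, size_t len,
                         long long offset)
{
    aio_context_t ctx = 0;
    struct iocb cb;
    struct iocb *list[1] = { &cb };
    struct io_event ev;
    long long result = -1;
    int fd;

    assert(path != NULL && buf != NULL);

    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    /* io_setup(): create a context that can hold one in-flight request. */
    if (syscall(SYS_io_setup, 1, &ctx) < 0) {
        close(fd);
        return -1;
    }

    /* Fill the control block by hand, as io_prep_pwrite() would. */
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = (uint32_t)fd;
    cb.aio_lio_opcode = IOCB_CMD_PWRITE;
    cb.aio_buf = (uint64_t)(uintptr_t)buf;
    cb.aio_nbytes = len;
    cb.aio_offset = offset;

    /* io_submit(): hand the request to the kernel. */
    if (syscall(SYS_io_submit, ctx, 1L, list) == 1) {
        /* The application is free to do other work at this point. */

        /* io_getevents(): block until the single request completes. */
        if (syscall(SYS_io_getevents, ctx, 1L, 1L, &ev, NULL) == 1)
            result = ev.res;    /* byte count, or a negative errno */
    }

    syscall(SYS_io_destroy, ctx);
    close(fd);
    return result;
}
```

A real application would normally keep the AIO context alive across many transfers instead of creating and destroying it for each one.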
Card-to-Host (C2H)
Card-to-Host (C2H) transfers move data from the PCIe device’s memory to the host system’s memory. This is essential for reading results, status information, or streaming data from the device.
Scatter-Gather (Linked-list) DMA
Scatter-Gather (Linked-list) DMA C2H mode efficiently handles large or fragmented read operations using linked-list descriptors. The driver manages Scatter-Gather (Linked-list) list creation and DMA mapping automatically.
Buffer Segmentation
When a user provides a destination buffer for a scatter-gather read transfer, the MDB5-DMA client driver segments it into a scatter-gather list for efficient DMA processing. The driver’s segmentation process for receiving data follows these steps:
The driver pins the destination buffer’s memory pages using get_user_pages_fast() to prevent the operating system from moving or swapping them during the transfer
This driver implementation uses PAGE_SIZE as the standard size for each scatter-gather entry (typically 4KB on x86 systems, though this varies by architecture)
While PAGE_SIZE alignment is not a hardware requirement and entry sizes can be configured differently if needed, this driver follows the PAGE_SIZE pattern for optimal memory management
Entry sizes vary based on buffer alignment relative to page boundaries
The first entry may be partial if the destination buffer starts mid-page
Middle entries typically span complete pages when properly aligned
The final entry contains space for any remaining data that doesn’t fill a complete page
Example: A 10KB (10,240-byte) destination buffer starting at a 100-byte offset within a page would create:
Entry 0: 3,996 bytes (from offset 100 to end of first page)
Entry 1: 4,096 bytes (complete second page)
Entry 2: 2,148 bytes (remaining space on third page)
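Before any entries can be built, the destination pages have to be pinned, and the number of pages to pin follows directly from the buffer address and length. The sketch below shows that calculation in plain C. pages_spanned is a hypothetical helper, not a driver function; a kernel driver would derive the same count before calling get_user_pages_fast().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of pages touched by the byte range [addr, addr + len).
 * This is the page count that would have to be pinned for the buffer.
 */
size_t pages_spanned(uintptr_t addr, size_t len, size_t page_size)
{
    uintptr_t first, last;

    assert(page_size > 0);
    if (len == 0)
        return 0;

    first = addr / page_size;               /* index of the first page */
    last = (addr + len - 1) / page_size;    /* index of the last page  */
    return (size_t)(last - first + 1);
}
```

A 10,240-byte buffer that starts 100 bytes into a 4,096-byte page touches three pages, one per scatter-gather entry.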
Scatter-Gather (Linked-list) DMA Data Flow
The complete flow between the host components and hardware components follows this sequence:
Application opens the C2H channel device node (e.g., /dev/mdb5_read00)
Application issues a read() system call with a destination buffer
MDB5-DMA client driver validates the read request and buffer parameters
Driver pins user buffer pages and creates Scatter-Gather (Linked-list) lists
C2H DMA descriptors are prepared and submitted to the DMA engine
C2H MM Engine reads data from device memory and writes to host memory locations
Upon transfer completion, the DMA engine updates completion status and generates an interrupt if required
Driver processes completion status, unmaps pages, and returns the number of bytes transferred
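On the application side, this flow is an open() followed by read(). The sketch below is illustrative only: c2h_read_all is a hypothetical helper, and it assumes the driver may return fewer bytes than requested and signals the end of available data with a zero-length read, neither of which is stated in this document.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to `len` bytes from `fd` into `buf`, retrying on short reads and
 * EINTR. Stops early if read() returns 0. Returns the number of bytes read,
 * or -1 with errno set on error.
 */
ssize_t c2h_read_all(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t done = 0;

    assert(buf != NULL || len == 0);

    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved: retry */
            return -1;
        }
        if (n == 0)
            break;              /* no more data available */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

A caller would open /dev/mdb5_read00 with O_RDONLY and pass the descriptor together with the destination buffer.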
Simple (Non Linked-list) Mode
Simple (Non Linked-list) Mode C2H operations perform direct buffer transfers without building a descriptor list. This mode is optimal for a single large chunk because it avoids the overhead of setting up the linked-list pointer (LLP).
Similar to H2C operations, C2H channels must be explicitly configured for Simple (Non Linked-list) Mode.
struct ctrl_mode mode = {
    .name = "/dev/mdb5_read00",
    .mode = MDB5_MODE_SIMPLE
};
ioctl(ctrl_fd, IOCTL_MDB5_SET_TRANSFER_MODE, &mode);
Simple (Non Linked-list) Mode Data Flow
Application opens the C2H channel device node
Channel is configured for Simple (Non Linked-list) Mode operation
Application issues a read() system call with a contiguous buffer
Driver maps the destination buffer directly
Single DMA descriptor is prepared for the transfer
C2H engine reads from device memory and writes to host buffer
Completion status is returned with the number of bytes transferred
ASYNC IO
Asynchronous C2H operations enable non-blocking reads, particularly useful for streaming data applications where continuous data flow from the device is required without blocking the application.
Async Data Flow
Application opens the C2H channel with direct I/O flags
AIO context is created for managing asynchronous operations
Read operations are prepared using io_prep_pread() and submitted via io_submit()
C2H engine performs transfers while application continues execution
Application receives completion events via io_getevents() when data is available
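For streaming reads, several requests are typically kept in flight at once. The sketch below queues a small batch of reads with one io_submit() call and then reaps all completions. It uses the raw Linux AIO system calls so that it builds without libaio, filling each control block the way io_prep_pread() would. aio_read_batch and QUEUE_DEPTH are hypothetical names, and the use of consecutive file offsets for a C2H channel is an assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#define QUEUE_DEPTH 4   /* hypothetical number of reads kept in flight */

/*
 * Open `path` with `flags`, queue QUEUE_DEPTH reads of `chunk` bytes each
 * from consecutive offsets into `buf`, and wait for all of them.
 * Returns the total number of bytes read, or -1 on failure.
 */
long long aio_read_batch(const char *path, int flags, void *buf, size_t chunk)
{
    aio_context_t ctx = 0;
    struct iocb cbs[QUEUE_DEPTH];
    struct iocb *list[QUEUE_DEPTH];
    struct io_event evs[QUEUE_DEPTH];
    long long total = 0;
    long submitted, done, i;
    int fd;

    assert(path != NULL && buf != NULL && chunk > 0);

    fd = open(path, flags);
    if (fd < 0)
        return -1;
    if (syscall(SYS_io_setup, QUEUE_DEPTH, &ctx) < 0) {
        close(fd);
        return -1;
    }

    /* Fill one control block per read, as io_prep_pread() would. */
    memset(cbs, 0, sizeof(cbs));
    for (i = 0; i < QUEUE_DEPTH; i++) {
        cbs[i].aio_fildes = (uint32_t)fd;
        cbs[i].aio_lio_opcode = IOCB_CMD_PREAD;
        cbs[i].aio_buf = (uint64_t)(uintptr_t)((char *)buf + (size_t)i * chunk);
        cbs[i].aio_nbytes = chunk;
        cbs[i].aio_offset = (long long)((size_t)i * chunk);
        list[i] = &cbs[i];
    }

    /* io_submit() may accept fewer requests than offered. */
    submitted = syscall(SYS_io_submit, ctx, (long)QUEUE_DEPTH, list);
    if (submitted <= 0) {
        total = -1;
    } else {
        /* The application could do other work here before reaping. */
        done = syscall(SYS_io_getevents, ctx, submitted, submitted, evs, NULL);
        if (done != submitted)
            total = -1;
        for (i = 0; total >= 0 && i < done; i++) {
            if (evs[i].res < 0)
                total = -1;             /* res holds a negative errno */
            else
                total += evs[i].res;
        }
    }

    syscall(SYS_io_destroy, ctx);
    close(fd);
    return total;
}
```

For the device node, the caller would pass flags such as O_RDONLY | O_DIRECT, matching the direct I/O step above. With O_DIRECT the buffer and chunk size generally need to be suitably aligned.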