This is a dialect for experimental work related to AIEngine processors. The expectation is that new ideas can be developed here before migration to the more mature AIE dialect.
[TOC]
aiex.bp_dest (::xilinx::AIEX::BPDestOp)
A destination port
Syntax:
operation ::= `aiex.bp_dest` `<` $tile `,` $bundle `:` $channel `>` attr-dict
An object representing the destination of a broadcast packet. This must exist within an [AIE.bp_id] operation. See [AIE.broadcast_packet] for an example.
Traits: HasParent<BPIDOp>
| Attribute | MLIR Type | Description |
|---|---|---|
| bundle | xilinx::AIE::WireBundleAttr | Bundle of wires |
| channel | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Operand | Description |
|---|---|
| tile | index |
aiex.bp_id (::xilinx::AIEX::BPIDOp)
A set of packets that share the same ID
Syntax:
operation ::= `aiex.bp_id` `(` $ID `)` regions attr-dict
A set of destination packets that share the same source and ID. This must exist within an [AIE.broadcast_packet] operation. See [AIE.broadcast_packet] for an example.
Traits: SingleBlockImplicitTerminator<AIE::EndOp>, SingleBlock
| Attribute | MLIR Type | Description |
|---|---|---|
| ID | ::mlir::IntegerAttr | 8-bit signless integer attribute |
aiex.broadcast_packet (::xilinx::AIEX::BroadcastPacketOp)
Combination of broadcast and packet-switch
Syntax:
operation ::= `aiex.broadcast_packet` `(` $tile `,` $bundle `:` $channel `)` regions attr-dict
An abstraction of broadcast and packet-switched flow. During place and route, it will be replaced by packet-switched flow and further replaced by MasterSets and PacketRules inside switchboxes.
Example:
%70 = AIE.tile(7, 0)
%73 = AIE.tile(7, 3)
%74 = AIE.tile(7, 4)
%63 = AIE.tile(6, 3)
%64 = AIE.tile(6, 4)
AIE.broadcast_packet(%70, "DMA" : 0){
  AIE.bp_id(0x0){
    AIE.bp_dest<%73, "DMA" : 0>
    AIE.bp_dest<%63, "DMA" : 0>
  }
  AIE.bp_id(0x1){
    AIE.bp_dest<%74, "DMA" : 0>
    AIE.bp_dest<%64, "DMA" : 0>
  }
}
Traits: SingleBlockImplicitTerminator<AIE::EndOp>, SingleBlock
| Attribute | MLIR Type | Description |
|---|---|---|
| bundle | xilinx::AIE::WireBundleAttr | Bundle of wires |
| channel | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Operand | Description |
|---|---|
| tile | index |
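The structure of the broadcast_packet example can be modelled outside the compiler. The following Python sketch is only an illustration, not the place-and-route lowering: it represents each bp_id region as a packet ID with a list of destinations and produces one record per ID. The names `Port` and `expand_broadcast_packet` are hypothetical.

```python
from typing import NamedTuple, Tuple


class Port(NamedTuple):
    tile: Tuple[int, int]  # (column, row), as in AIE.tile(col, row)
    bundle: str            # wire bundle name, e.g. "DMA"
    channel: int


def expand_broadcast_packet(source, ids):
    """Return one (packet_id, source, destinations) record per bp_id region."""
    return [(pkt_id, source, list(dests)) for pkt_id, dests in sorted(ids.items())]


# Mirrors the example: tile (7, 0) broadcasts to two groups of destinations.
src = Port((7, 0), "DMA", 0)
flows = expand_broadcast_packet(src, {
    0x0: [Port((7, 3), "DMA", 0), Port((6, 3), "DMA", 0)],
    0x1: [Port((7, 4), "DMA", 0), Port((6, 4), "DMA", 0)],
})
```

Each record corresponds to one packet ID sharing the same source, which is the grouping that a bp_id region expresses.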
aiex.connection (::xilinx::AIEX::ConnectionOp)
A logical circuit-switched connection between cores
Syntax:
operation ::= `aiex.connection` `(` $source `,` $sourceBundle `:` $sourceChannel `,` $dest `,` $destBundle `:` $destChannel `)` attr-dict
The “aie.connection” operation represents a circuit switched connection between two endpoints, usually “aie.core” operations. During routing, this is replaced by “aie.connect” operations which represent the programmed connections inside a switchbox, along with “aie.wire” operations which represent physical connections between switchboxes and other components. Note that while “aie.flow” operations can express partial routes between tiles, this is not possible with “aie.connection” operations.
Example:
%22 = aie.tile(2, 2)
%c22 = aie.core(%22)
%11 = aie.tile(1, 1)
%c11 = aie.core(%11)
aie.flow(%c22, "Core" : 0, %c11, "Core" : 1)
| Attribute | MLIR Type | Description |
|---|---|---|
| sourceBundle | xilinx::AIE::WireBundleAttr | Bundle of wires |
| sourceChannel | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| destBundle | xilinx::AIE::WireBundleAttr | Bundle of wires |
| destChannel | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Operand | Description |
|---|---|
| source | index |
| dest | index |
aiex.control_packet (::xilinx::AIEX::NpuControlPacketOp)
AIE control packet
Syntax:
operation ::= `aiex.control_packet` attr-dict
The control_packet operation represents a low-level AIE control packet header and payload.
| Attribute | MLIR Type | Description |
|---|---|---|
| address | ::mlir::IntegerAttr | 32-bit unsigned integer attribute |
| length | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| opcode | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| stream_id | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| data | ::mlir::DenseI32ArrayAttr | i32 dense array attribute |
aiex.dma_await_task (::xilinx::AIEX::DMAAwaitTaskOp)
Await Completion of a Previously Submitted DMA Task
Syntax:
operation ::= `aiex.dma_await_task` `(` $task `)` attr-dict
This operation will block execution of the runtime sequence until the referenced previously started DMA task has completed.
DMA tasks can be started either with aiex.dma_start_bd_chain, which materializes an abstract BD chain declared using aie.bd_chain, or with aiex.dma_start_task, which submits a task configured manually using aiex.dma_configure_task.
To be able to wait on a task, it must issue a task completion token (TCT). Tasks only emit these tokens if the attribute issue_token is set to true.
Traits: HasParent<RuntimeSequenceOp>
| Operand | Description |
|---|---|
| task | index |
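The precondition for awaiting a task can be written down as a small check. This Python sketch is a toy model and not part of the dialect: `Task` and `check_awaitable` are hypothetical names, and the only rule it encodes is the one stated for aiex.dma_await_task, namely that a task must have been started and must issue a task completion token (TCT).

```python
from dataclasses import dataclass


@dataclass
class Task:
    issue_token: bool = False  # mirrors the issue_token attribute
    started: bool = False      # set once the task has been submitted


def check_awaitable(task: Task) -> None:
    """Raise ValueError if awaiting this task could not succeed."""
    if not task.started:
        raise ValueError("task was never started")
    if not task.issue_token:
        raise ValueError("task does not issue a TCT and cannot be awaited")


# A started task with issue_token = true passes the check.
check_awaitable(Task(issue_token=True, started=True))
```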
aiex.dma_configure_task (::xilinx::AIEX::DMAConfigureTaskOp)
Concrete Instantiation of a Buffer Descriptor Chain as a Task on a Channel and Direction on a Tile
Syntax:
operation ::= `aiex.dma_configure_task` `(` $tile `,` $direction `,` $channel `)` regions attr-dict
Encapsulates the DMA configuration of one task, that is the (chain of) buffer descriptors to be executed on a given channel and direction on a tile.
Such configurations are generated by materializing abstract aie.bd_chains using aiex.dma_start_bd_chain, or can be created manually using this op.
Once configured, a task can be submitted for execution using aiex.dma_start_task, after which its completion can be awaited using aiex.dma_await_task.
Traits: HasParent<RuntimeSequenceOp>
Interfaces: OpAsmOpInterface, TileElement
| Attribute | MLIR Type | Description |
|---|---|---|
| direction | xilinx::AIE::DMAChannelDirAttr | DMA Channel direction |
| channel | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| issue_token | ::mlir::BoolAttr | bool attribute |
| repeat_count | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Operand | Description |
|---|---|
| tile | index |
| Result | Description |
|---|---|
| result | index |
aiex.dma_configure_task_for (::xilinx::AIEX::DMAConfigureTaskForOp)
As dma_configure_task, but specify tile, direction and channel by reference to a Shim DMA allocation op
Syntax:
operation ::= `aiex.dma_configure_task_for` $alloc regions attr-dict
Traits: HasParent<RuntimeSequenceOp>
| Attribute | MLIR Type | Description |
|---|---|---|
| alloc | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
| issue_token | ::mlir::BoolAttr | bool attribute |
| repeat_count | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Result | Description |
|---|---|
| result | index |
aiex.dma_free_task (::xilinx::AIEX::DMAFreeTaskOp)
Free all Buffer Descriptor IDs Associated with the Given Task
Syntax:
operation ::= `aiex.dma_free_task` `(` $task `)` attr-dict
This operation informs the static buffer descriptor allocator pass in the compiler that the buffer descriptor IDs it has allocated to the BDs inside the referenced task can be reused thereafter.
Potential future implementations of dynamic buffer descriptor allocators may lower this to a free instruction.
Traits: HasParent<RuntimeSequenceOp>
| Operand | Description |
|---|---|
| task | index |
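The effect of freeing a task on a static allocator can be pictured with a small pool of IDs. The Python sketch below is a hypothetical model, not the compiler pass: the class `BdIdAllocator`, its methods, and the pool size of 4 are all illustrative assumptions.

```python
class BdIdAllocator:
    """Toy static allocator handing out buffer descriptor IDs per task."""

    def __init__(self, num_ids):
        self.free_ids = list(range(num_ids))
        self.by_task = {}  # task name -> IDs currently held by that task

    def allocate(self, task, count):
        if count > len(self.free_ids):
            raise RuntimeError("out of buffer descriptor IDs")
        ids = [self.free_ids.pop(0) for _ in range(count)]
        self.by_task.setdefault(task, []).extend(ids)
        return ids

    def free(self, task):
        # Models dma_free_task: the task's IDs become reusable afterwards.
        self.free_ids.extend(self.by_task.pop(task, []))
        self.free_ids.sort()


alloc = BdIdAllocator(num_ids=4)       # arbitrary illustrative pool size
first = alloc.allocate("task_a", 3)    # takes IDs 0, 1, 2
alloc.free("task_a")                   # IDs 0, 1, 2 return to the pool
second = alloc.allocate("task_b", 4)   # succeeds only because task_a was freed
```

Without the call to `free`, the second allocation would fail in this model because only one ID would remain in the pool.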
aiex.dma_start_bd_chain (::xilinx::AIEX::DMAStartBdChainOp)
Materialize an Abstract BD Chain as a DMA Task on the Given Tile, Channel and Direction and Immediately Start It
Syntax:
operation ::= `aiex.dma_start_bd_chain` $symbol `(` $args `)` `:` `(` type($args) `)` ` ` `on` ` ` `(` $tile `,` $direction `,` $channel `)` attr-dict
This operation will configure a new DMA task on the given tile, channel and direction by concretizing an abstract BD chain, previously defined using aie.bd_chain, with the given input arguments.
Completion of the DMA task, i.e. the data transfer, can be awaited using aiex.dma_await_task if the attribute issue_token is set to true.
Traits: HasParent<RuntimeSequenceOp>
Interfaces: OpAsmOpInterface, TileElement
| Attribute | MLIR Type | Description |
|---|---|---|
| symbol | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
| direction | xilinx::AIE::DMAChannelDirAttr | DMA Channel direction |
| channel | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| issue_token | ::mlir::BoolAttr | bool attribute |
| repeat_count | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Operand | Description |
|---|---|
| args | variadic of any type |
| tile | index |
| Result | Description |
|---|---|
| result | index |
aiex.dma_start_bd_chain_for (::xilinx::AIEX::DMAStartBdChainForOp)
As dma_start_bd_chain, but specify tile, direction and channel by reference to a Shim DMA allocation op
Syntax:
operation ::= `aiex.dma_start_bd_chain_for` $symbol `(` $args `)` `:` `(` type($args) `)` ` ` `for` ` ` $alloc attr-dict
Traits: HasParent<RuntimeSequenceOp>
| Attribute | MLIR Type | Description |
|---|---|---|
| symbol | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
| alloc | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
| issue_token | ::mlir::BoolAttr | bool attribute |
| repeat_count | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Operand | Description |
|---|---|
| args | variadic of any type |
| Result | Description |
|---|---|
| result | index |
aiex.dma_start_task (::xilinx::AIEX::DMAStartTaskOp)
Submit a Preconfigured Task to the Task Queue
Syntax:
operation ::= `aiex.dma_start_task` `(` $task `)` attr-dict
Submits the referenced task for execution on the tile, channel and direction it has been configured to run on.
Once submitted, if the task is configured to issue a token, you can await completion of the task using aiex.dma_await_task.
Traits: HasParent<RuntimeSequenceOp>
| Operand | Description |
|---|---|
| task | index |
aiex.getTile (::xilinx::AIEX::GetTileOp)
Get a reference to an AIE tile
Syntax:
operation ::= `aiex.getTile` `(` $col `,` $row `)` attr-dict
Return a reference to an AIE tile, given the column and the row of the tile.
| Operand | Description |
|---|---|
| col | index |
| row | index |
| Result | Description |
|---|---|
| result | index |
aiex.herd (::xilinx::AIEX::HerdOp)
Declare a herd, which is a bundle of cores organized in a rectangular shape
Syntax:
operation ::= `aiex.herd` `[` $width `]` `[` $height `]` attr-dict
This operation creates a group of AIE tiles in 2D shape.
Example:
%herd0 = AIE.herd[1][1] // a single AIE tile. location unknown
%herd1 = AIE.herd[4][1] // a row of four AIE tiles
The operation can be used in replacement of a TileOp – in case we want to select a group of hardware entities (cores, mems, switchboxes) instead of individual entity, and we don’t want to specify their locations just yet. This can be useful if we want to generate parameterizable code (the column and row values are parameterized).
Example:
%herd = AIE.herd[2][2] // a herd of 2x2 AIE tiles
AIE.core(%herd) {
  // all the cores belonging to this herd run the same code
}
| Attribute | MLIR Type | Description |
|---|---|---|
| width | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| height | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Result | Description |
|---|---|
| «unnamed» | index |
aiex.iter (::xilinx::AIEX::IterOp)
An iter operation
Syntax:
operation ::= `aiex.iter` `(` $start `,` $end `,` $stride `)` attr-dict
This operation generates index values that can be used with the SelectOp to select a group of tiles from a herd.
Example:
%iter0 = AIE.iter(0, 15, 1) // 0, 1, 2, … , 14
%iter1 = AIE.iter(2, 8, 2) // 2, 4, 6
| Attribute | MLIR Type | Description |
|---|---|---|
| start | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| end | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| stride | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Result | Description |
|---|---|
| «unnamed» | index |
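The index values produced by an iter can be enumerated with a one-line Python helper. This is a sketch under the assumption that the end value is exclusive, which is what the `AIE.iter(2, 8, 2)` example (2, 4, 6) suggests; `iter_values` is a hypothetical name.

```python
def iter_values(start, end, stride):
    """Index values of an iter, assuming a half-open range [start, end)."""
    return list(range(start, end, stride))


stepped = iter_values(2, 8, 2)  # same arguments as AIE.iter(2, 8, 2)
```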
aiex.memcpy (::xilinx::AIEX::MemcpyOp)
A memcpy op
Syntax:
operation ::= `aiex.memcpy` $tokenName `(` $acqValue `,` $relValue `)` `(`
$srcTile `:` `<` $srcBuf `,` $srcOffset `,` $srcLen `>` `,`
$dstTile `:` `<` $dstBuf `,` $dstOffset `,` $dstLen `>` `)`
attr-dict `:` `(` type($srcBuf) `,` type($dstBuf) `)`
This operation defines a logical data transfer of a buffer from a source tile to another buffer from a destination tile.
This operation should be lowered to Mem ops with DMA setup and Flow ops for routing data from the source tile to the dest. tile.
| Attribute | MLIR Type | Description |
|---|---|---|
| tokenName | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
| acqValue | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| relValue | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| srcOffset | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| srcLen | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| dstOffset | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| dstLen | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Operand | Description |
|---|---|
| srcTile | index |
| srcBuf | memref of any type values |
| dstTile | index |
| dstBuf | memref of any type values |
aiex.multi_dest (::xilinx::AIEX::MultiDestOp)
A destination port of multicast flow
Syntax:
operation ::= `aiex.multi_dest` `<` $tile `,` $bundle `:` $channel `>` attr-dict
An object representing the destination of a multicast flow. This must exist within an [aiex.multicast] operation. There can be multiple destinations within an aiex.multicast Op.
See [aiex.multicast] for an example.
Traits: HasParent<MulticastOp>
| Attribute | MLIR Type | Description |
|---|---|---|
| bundle | xilinx::AIE::WireBundleAttr | Bundle of wires |
| channel | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Operand | Description |
|---|---|
| tile | index |
aiex.multicast (::xilinx::AIEX::MulticastOp)
An abstraction of multicast
Syntax:
operation ::= `aiex.multicast` `(` $tile `,` $bundle `:` $channel `)` regions attr-dict
An abstraction of broadcast. During place and route, it will be replaced by multiple flows.
Example:
%70 = AIE.tile(7, 0)
%73 = AIE.tile(7, 3)
%74 = AIE.tile(7, 4)
%63 = AIE.tile(6, 3)
%64 = AIE.tile(6, 4)
aiex.multicast(%70, "DMA" : 0){
  aiex.multi_dest<%73, "DMA" : 0>
  aiex.multi_dest<%74, "DMA" : 0>
  aiex.multi_dest<%63, "DMA" : 0>
  aiex.multi_dest<%64, "DMA" : 0>
}
Traits: SingleBlockImplicitTerminator<AIE::EndOp>, SingleBlock
| Attribute | MLIR Type | Description |
|---|---|---|
| bundle | xilinx::AIE::WireBundleAttr | Bundle of wires |
| channel | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Operand | Description |
|---|---|
| tile | index |
aiex.npu.address_patch (::xilinx::AIEX::NpuAddressPatchOp)
Address patch operator
Syntax:
operation ::= `aiex.npu.address_patch` attr-dict
address patch operator
| Attribute | MLIR Type | Description |
|---|---|---|
| addr | ::mlir::IntegerAttr | 32-bit unsigned integer attribute |
| arg_idx | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| arg_plus | ::mlir::IntegerAttr | 32-bit signless integer attribute |
aiex.npu.blockwrite (::xilinx::AIEX::NpuBlockWriteOp)
Blockwrite operator
Syntax:
operation ::= `aiex.npu.blockwrite` `(` $data `)` attr-dict `:` type($data)
blockwrite operator writes the data from the memref ‘data’ to the AIE array. If ‘buffer’ is present then ‘address’ is interpreted as an offset into the aie.buffer with symbol name ‘buffer’. If ‘column’ and ‘row’ are present then ‘address’ is interpreted as an offset into the memory space of aie.tile(column, row). If ‘buffer’ is not present and ‘column’ and ‘row’ are not present then ‘address’ is interpreted as a full 32-bit address in the AIE array.
| Attribute | MLIR Type | Description |
|---|---|---|
| address | ::mlir::IntegerAttr | 32-bit unsigned integer attribute |
| buffer | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
| column | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| row | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Operand | Description |
|---|---|
| data | memref of any type values |
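The three ways of interpreting 'address' described for blockwrite can be summarized as a decision function. This Python sketch is illustrative only: `describe_address` is a hypothetical helper, and checking 'buffer' before 'column' and 'row' simply follows the order in which the description lists the cases.

```python
from typing import Optional


def describe_address(address: int, buffer: Optional[str] = None,
                     column: Optional[int] = None,
                     row: Optional[int] = None) -> str:
    """Say how 'address' is interpreted for a given attribute combination."""
    if buffer is not None:
        return f"offset {address:#x} into aie.buffer @{buffer}"
    if column is not None and row is not None:
        return f"offset {address:#x} into the memory space of aie.tile({column}, {row})"
    return f"absolute 32-bit address {address:#x}"


by_buffer = describe_address(0x10, buffer="buf")       # "buf" is a made-up symbol
by_tile = describe_address(0x10, column=1, row=2)
absolute = describe_address(0x10)
```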
aiex.npu.dma_memcpy_nd (::xilinx::AIEX::NpuDmaMemcpyNdOp)
Half DMA operator
Syntax:
operation ::= `aiex.npu.dma_memcpy_nd` `(` $memref ``
custom<DynamicIndexList>($offsets, $static_offsets) ``
custom<DynamicIndexList>($sizes, $static_sizes) ``
custom<DynamicIndexList>($strides, $static_strides) ``
(`,` `packet` `=` $packet^)? `)`
attr-dict `:` type($memref)
An n-dimensional half DMA operator.
Programs a DMA to access a memory memref with an access pattern specified by offsets,
sizes and strides or static_offsets, static_sizes and static_strides. The operator
references the target DMA coordinates (x, y) and channel through the metadata
symbol and specifies a descriptor id to be used, which will become the bd_id to be used
when lowered further. The issue_token attribute specifies whether the execution of this
operation should issue a token which can be received and read for synchronization purposes.
This issue_token attribute is set to false by default for MM2S for backward compatibility
and is always set to true for S2MM channels.
The burst length attribute specifies the burst length in bytes for the DMA operation. A value
of 0 indicates that the burst length is not specified and the maximal burst length is used.
metadata – Specifying Tile, Channel, Direction and Linking a dma_memcpy_nd to its Other Half
The metadata attribute must point to a symbol referencing an
aie.shim_dma_allocation operation.
The tile coordinates of the DMA to configure, the channel number and the direction (MM2S or S2MM) are taken from this operation.
To connect the DMA to its other half (i.e. a MM2S DMA to its receiving end and a S2MM to the sending end),
the user must configure a flow (aie.flow) between the tile and channel referenced in the aie.shim_dma_allocation and the corresponding other end.
When using ObjectFIFOs, the aie.shim_dma_allocation operations and the aie.flows are generated automatically.
The symbol of the aie.objectfifo (create) operation can be used directly in metadata in this case.
When the dma_memcpy_nd operation executes, it immediately reprograms the buffer descriptor with ID bd_id on tile (x, y), even if that buffer descriptor is currently executing.
Without proper synchronization, this inevitably leads to nondeterministic results.
Programming a buffer descriptor that is not currently executing is harmless.
Thus, the first dma_memcpy_nd call for each bd_id requires no synchronization.
However, if you wish to later re-use a bd_id on the same tile, you must wait for the previous buffer descriptor to complete.
The sync or dma_wait operations can be used for this.
sync blocks until it receives a task completion token (TCT).
To properly synchronize, you must thus configure your BD to issue a TCT using the issue_token attribute, then wait on that token before reusing the BD.
dma_wait is a convenience operation that lowers to the corresponding sync operation for the referenced symbol.
Note that if you have multiple concurrently running BDs and you can reason one BD will always complete after all others, it is not strictly necessary to issue and wait on the TC token for every BD. For example, if you have input and output BDs on the shim, and you know the cores will only push output onto the output BD after the input BDs have completed, it may be sufficient to synchronize only on the output BD before reusing input BDs.
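The reuse rule for buffer descriptor IDs can be expressed as a small state tracker. The Python sketch below is a deliberately simplified model with hypothetical names (`ShimBdTracker`, `program`, `sync`): it keys synchronization by bd_id, whereas the real sync operation waits for a token on a channel, and it treats every unsynchronized reprogramming as an error even in cases where ordering could be reasoned about by other means.

```python
class ShimBdTracker:
    """Tracks which bd_ids on one tile may still be executing."""

    def __init__(self):
        self.in_flight = {}  # bd_id -> whether that BD issues a TCT

    def program(self, bd_id, issue_token):
        # Reprogramming a BD that may still be running is the hazard.
        if bd_id in self.in_flight:
            raise RuntimeError(f"bd_id {bd_id} reused without waiting for its TCT")
        self.in_flight[bd_id] = issue_token

    def sync(self, bd_id):
        # Waiting only works if the BD was configured to issue a token.
        if not self.in_flight.get(bd_id, False):
            raise RuntimeError(f"bd_id {bd_id} issues no TCT to wait on")
        del self.in_flight[bd_id]


tracker = ShimBdTracker()
tracker.program(0, issue_token=True)  # first use needs no synchronization
tracker.sync(0)                       # wait for the task completion token
tracker.program(0, issue_token=True)  # reuse is safe after the wait
```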
The sizes and strides attributes describe a data layout transformation to be performed by the DMA.
These transformations are described in more depth in the documentation for the
aie.dma_bd operation.
Note that the syntax here differs from that of the dma_bd operation:
offsets and strides are given as separate arrays instead of tuples.
The offsets array is used to calculate a static offset into the memref.
Each offset in the array is understood in relation to the shape of the memref;
the lowest-dimension offset is a direct offset in units of memref element type, and the higher dimensions are multiplied by the size of the memref in those dimensions.
Note that this is for convenience of the user only.
The hardware only supports a single static offset, and this offset is calculated at compile time.
Thus, all offsets can be equivalently expressed with the lowest dimension only.
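One reading of the offset rule is a row-major linearization over the memref shape. The Python sketch below follows that reading; `linear_offset` is a hypothetical helper, it assumes the offsets array has the same rank as the memref, and the actual compile-time computation in the compiler may differ.

```python
from math import prod


def linear_offset(offsets, shape):
    """Fold per-dimension offsets into one offset in units of elements."""
    if len(offsets) != len(shape):
        raise ValueError("offsets and shape must have the same rank")
    # Each offset is scaled by the sizes of all lower (faster-varying) dimensions.
    return sum(off * prod(shape[i + 1:]) for i, off in enumerate(offsets))


# For a 32x64 memref, offset [1, 2] and offset [0, 66] name the same element.
two_dim = linear_offset([1, 2], [32, 64])
lowest_only = linear_offset([0, 66], [32, 64])
```

The two calls give the same result, which illustrates why any offset can be expressed using the lowest dimension alone.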
The optional packet attribute defines the packet header and packet type that gets issued per DMA BD.
If the attribute is set, then every time the DMA BD gets issued, a packet header is generated prior to the transmission of data.
The packet header is used to guide arbitration throughout a packet-routed data flow, where each switch box arbitrates the data packet to stream to a successor based on the packet header.
Traits: AttrSizedOperandSegments
Interfaces: MyOffsetSizeAndStrideOpInterface
| Attribute | MLIR Type | Description |
|---|---|---|
| static_offsets | ::mlir::DenseI64ArrayAttr | i64 dense array attribute with exactly 4 elements |
| static_sizes | ::mlir::DenseI64ArrayAttr | i64 dense array attribute with exactly 4 elements |
| static_strides | ::mlir::DenseI64ArrayAttr | i64 dense array attribute with exactly 4 elements |
| packet | ::xilinx::AIE::PacketInfoAttr | Tuple encoding the type and header of a packet |
| metadata | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
| id | ::mlir::IntegerAttr | 64-bit signless integer attribute |
| issue_token | ::mlir::BoolAttr | bool attribute |
| d0_zero_before | ::mlir::IntegerAttr | 64-bit signless integer attribute |
| d1_zero_before | ::mlir::IntegerAttr | 64-bit signless integer attribute |
| d2_zero_before | ::mlir::IntegerAttr | 64-bit signless integer attribute |
| d0_zero_after | ::mlir::IntegerAttr | 64-bit signless integer attribute |
| d1_zero_after | ::mlir::IntegerAttr | 64-bit signless integer attribute |
| d2_zero_after | ::mlir::IntegerAttr | 64-bit signless integer attribute |
| burst_length | ::mlir::IntegerAttr | 64-bit signless integer attribute |
| Operand | Description |
|---|---|
| memref | ranked or unranked memref of any type values |
| offsets | variadic of 64-bit signless integer |
| sizes | variadic of 64-bit signless integer |
| strides | variadic of 64-bit signless integer |
aiex.npu.dma_wait (::xilinx::AIEX::NpuDmaWaitOp)
Blocking operation to wait for a DMA to complete execution.
Syntax:
operation ::= `aiex.npu.dma_wait` attr-dict
The NpuDmaWaitOp blocks until the DMA referenced through symbol completes execution
and issues a task-complete-token (TCT).
symbol is a reference to an aie.shim_dma_allocation, which contains information about the column, channel and channel direction on which to wait for a TCT.
The aie.shim_dma_allocation may be generated from an ObjectFIFO, in which case you can directly pass the ObjectFIFO symbol reference.
npu.dma_wait will be lowered to the corresponding npu.sync operation using the information from symbol.
Example:
...
aie.objectfifo @out0(%tile_0_1, {%tile_0_0}, 4 : i32) : !aie.objectfifo<memref<32x32xi32>>
...
aiex.npu.dma_memcpy_nd(%arg2[1, 1, 0, 0][1, 1, 32, 32][1, 1, 64, 1]) {id = 0 : i64, issue_token = true, metadata = @out0} : memref<32x64xi32>
...
aiex.npu.dma_wait { symbol = @out0 }
Here, we have an objectfifo with symbol name out0, which is then referenced in the
npu.dma_memcpy_nd operation as the target for the respective DMA operation. Afterwards,
an npu.dma_wait operation references the same symbol to block until the respective DMA
has executed all of its tasks.
| Attribute | MLIR Type | Description |
|---|---|---|
| symbol | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
aiex.npu.maskwrite32 (::xilinx::AIEX::NpuMaskWrite32Op)
Write a masked 32-bit value to the AIE array
Syntax:
operation ::= `aiex.npu.maskwrite32` attr-dict
NPU mask write32 operator writes a masked 32bit value to the AIE array. If ‘buffer’ is present then ‘address’ is interpreted as an offset into the aie.buffer with symbol name ‘buffer’. If ‘column’ and ‘row’ are present then ‘address’ is interpreted as an offset into the memory space of aie.tile(column, row). If ‘buffer’ is not present and ‘column’ and ‘row’ are not present then ‘address’ is interpreted as a full 32-bit address in the AIE array.
| Attribute | MLIR Type | Description |
|---|---|---|
| address | ::mlir::IntegerAttr | 32-bit unsigned integer attribute |
| value | ::mlir::IntegerAttr | 32-bit unsigned integer attribute |
| mask | ::mlir::IntegerAttr | 32-bit unsigned integer attribute |
| buffer | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
| column | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| row | ::mlir::IntegerAttr | 32-bit signless integer attribute |
aiex.npu.preempt (::xilinx::AIEX::NpuPreemptOp)
Preempt transaction operation
Syntax:
operation ::= `aiex.npu.preempt` attr-dict
Yield to higher priority task(s). Indicates to the transaction processor that the instruction stream can be interrupted at this point. Levels: 0: Noop. 1: Mem tile. 2: AIE tile. 3: AIE registers.
| Attribute | MLIR Type | Description |
|---|---|---|
| level | ::mlir::IntegerAttr | 8-bit unsigned integer attribute |
aiex.npu.push_queue (::xilinx::AIEX::NpuPushQueueOp)
Bd queue push operator
Syntax:
operation ::= `aiex.npu.push_queue` `(` $column `,` $row `,` $direction `:` $channel `)` attr-dict
bd queue push operator
| Attribute | MLIR Type | Description |
|---|---|---|
| column | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| row | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| direction | xilinx::AIE::DMAChannelDirAttr | DMA Channel direction |
| channel | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| issue_token | ::mlir::BoolAttr | bool attribute |
| repeat_count | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| bd_id | ::mlir::IntegerAttr | 32-bit signless integer attribute |
aiex.npu.rtp_write (::xilinx::AIEX::NpuWriteRTPOp)
Rtp write operator
Syntax:
operation ::= `aiex.npu.rtp_write` `(` $buffer `,` $index `,` $value `)` attr-dict
rtp write operator
| Attribute | MLIR Type | Description |
|---|---|---|
| buffer | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
| index | ::mlir::IntegerAttr | 32-bit unsigned integer attribute |
| value | ::mlir::IntegerAttr | 32-bit signless integer attribute |
aiex.npu.sync (::xilinx::AIEX::NpuSyncOp)
Sync operator
Syntax:
operation ::= `aiex.npu.sync` attr-dict
The sync operation blocks execution of the instruction stream until a task-complete token (TCT) is received on the specified column, row, channel and direction (where direction 0 is S2MM and 1 is MM2S).
If this operation appears to deadlock, ensure that at least one buffer descriptor is configured to issue a TCT on the channel you expect.
By default, dma_memcpy_nd operations only issue tokens for S2MM channels, and issue_token must be set to true to issue tokens for MM2S channels.
| Attribute | MLIR Type | Description |
|---|---|---|
| column | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| row | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| direction | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| channel | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| column_num | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| row_num | ::mlir::IntegerAttr | 32-bit signless integer attribute |
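The direction encoding and the default token behaviour described for sync can be captured in a few lines. This Python sketch is a hypothetical model (`Direction` and `issues_token` are illustrative names) of the stated rule: dma_memcpy_nd always issues a token on S2MM channels and only issues one on MM2S channels when issue_token is set to true.

```python
from enum import IntEnum


class Direction(IntEnum):
    S2MM = 0  # encoding used by the direction attribute of sync
    MM2S = 1


def issues_token(direction, issue_token=False):
    """Whether a dma_memcpy_nd on this direction emits a TCT to sync on."""
    return direction == Direction.S2MM or issue_token


default_mm2s = issues_token(Direction.MM2S)  # no token unless requested
```

In this model, a sync on an MM2S channel whose transfers all use the default would never receive a token, which matches the deadlock hint in the description.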
aiex.npu.write32 (::xilinx::AIEX::NpuWrite32Op)
Write32 operator
Syntax:
operation ::= `aiex.npu.write32` attr-dict
NPU write32 operator writes a 32bit value to the AIE array. If ‘buffer’ is present then ‘address’ is interpreted as an offset into the aie.buffer with symbol name ‘buffer’. If ‘column’ and ‘row’ are present then ‘address’ is interpreted as an offset into the memory space of aie.tile(column, row). If ‘buffer’ is not present and ‘column’ and ‘row’ are not present then ‘address’ is interpreted as a full 32-bit address in the AIE array.
| Attribute | MLIR Type | Description |
|---|---|---|
| address | ::mlir::IntegerAttr | 32-bit unsigned integer attribute |
| value | ::mlir::IntegerAttr | 32-bit unsigned integer attribute |
| buffer | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
| column | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| row | ::mlir::IntegerAttr | 32-bit signless integer attribute |
aiex.npu.writebd (::xilinx::AIEX::NpuWriteBdOp)
Dma operator
Syntax:
operation ::= `aiex.npu.writebd` attr-dict
writebd operator
| Attribute | MLIR Type | Description |
|---|---|---|
| column | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| bd_id | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| buffer_length | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| buffer_offset | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| enable_packet | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| out_of_order_id | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| packet_id | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| packet_type | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| d0_size | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| d0_stride | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| d1_size | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| d1_stride | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| d2_size | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| d2_stride | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| iteration_current | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| iteration_size | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| iteration_stride | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| next_bd | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| row | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| use_next_bd | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| valid_bd | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| lock_rel_val | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| lock_rel_id | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| lock_acq_enable | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| lock_acq_val | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| lock_acq_id | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| d0_zero_before | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| d1_zero_before | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| d2_zero_before | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| d0_zero_after | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| d1_zero_after | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| d2_zero_after | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| burst_length | ::mlir::IntegerAttr | 32-bit signless integer attribute |
aiex.place (::xilinx::AIEX::PlaceOp)
A place operation that specifies the relative placement (XY) of one herd to another
Syntax:
operation ::= `aiex.place` `(` $sourceHerd `,` $destHerd `,` $distX `,` $distY `)` attr-dict
A place operation that specifies the relative placement (XY) of one herd to another.
| Attribute | MLIR Type | Description |
|---|---|---|
| distX | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| distY | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Operand | Description |
|---|---|
| sourceHerd | index |
| destHerd | index |
aiex.route (::xilinx::AIEX::RouteOp)
A route operation that routes one herd to another
Syntax:
operation ::= `aiex.route` `(` `<` $sourceHerds `,` $sourceBundle `:` $sourceChannel `>` `,`
`<` $destHerds `,` $destBundle `:` $destChannel `>` `)` attr-dict
A route operation that routes one herd to another.
| Attribute | MLIR Type | Description |
|---|---|---|
| sourceBundle | xilinx::AIE::WireBundleAttr | Bundle of wires |
| sourceChannel | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| destBundle | xilinx::AIE::WireBundleAttr | Bundle of wires |
| destChannel | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| Operand | Description |
|---|---|
| sourceHerds | index |
| destHerds | index |
aiex.runtime_sequence (::xilinx::AIEX::RuntimeSequenceOp)
Program the configuration co-processor of the AI Engine array
Instructions in this operation allow for runtime (re-)configuration of the AI Engine array, such as configuring data movement buffer descriptors. These instructions will execute on the configuration co-processor of the AI Engine array.
Typically, these instructions include configuring the data transfers between host and AIE array on the shims. The input arguments are arguments passed in from the host at kernel invocation time. This may include buffers on the host.
Traits: HasParent<AIE::DeviceOp>, NoTerminator
| Attribute | MLIR Type | Description |
|---|---|---|
| sym_name | ::mlir::StringAttr | string attribute |
aiex.select (::xilinx::AIEX::SelectOp)
A select operation
Syntax:
operation ::= `aiex.select` `(` $startHerd `,` $iterX `,` $iterY `)` attr-dict
This operation selects a group of tiles based on the selected indices.
Example:
%herd = AIE.herd[4][4] // a herd of 4x4 tiles
%ix = AIE.iter(0, 4, 1) // 0, 1, 2, 3
%iy = AIE.iter(0, 1, 1) // 0
%sub_herd = AIE.select(%herd, %ix, %iy)
The SelectOp in the above example will select the tiles %herd[0][0], %herd[1][0], %herd[2][0], %herd[3][0] (the first column of the herd).
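The selection in the example above can be modelled outside MLIR. The following is a minimal Python sketch, not the dialect's implementation: it assumes `AIE.iter(start, stop, step)` behaves like a half-open range (consistent with the `0, 1, 2, 3` comment in the example), and the helper names `aie_iter` and `select_tiles` are hypothetical.

```python
def aie_iter(start, stop, step):
    """Indices visited by an iterator, modelled as a half-open range (assumption)."""
    return list(range(start, stop, step))


def select_tiles(iter_x, iter_y):
    """Return the (x, y) herd indices picked by combining both iterators."""
    return [(x, y) for y in iter_y for x in iter_x]


ix = aie_iter(0, 4, 1)  # 0, 1, 2, 3
iy = aie_iter(0, 1, 1)  # 0
selected = select_tiles(ix, iy)
print(selected)  # [(0, 0), (1, 0), (2, 0), (3, 0)]
```

Each pair `(x, y)` corresponds to `%herd[x][y]` in the notation used above.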
| Operand | Description |
|---|---|
| startHerd | index |
| iterX | index |
| iterY | index |

| Result | Description |
|---|---|
| «unnamed» | index |
aiex.set_lock (::xilinx::AIEX::SetLockOp)
Set the value of a lock
Syntax:
operation ::= `aiex.set_lock` `(` $lock `,` $value `)` attr-dict
This operation sets the value of a lock from within a RuntimeSequenceOp.
The operation is non-blocking and offers no synchronization guarantees.
It should be used in combination with blocking operations.
Example:
%tile22 = aie.tile(2, 2)
%lock22_0 = aie.lock(%tile22, 0)
...
aiex.set_lock(%lock22_0, 5)
Traits: HasParent<RuntimeSequenceOp>, SkipAccessibilityCheckTrait
| Attribute | MLIR Type | Description |
|---|---|---|
| value | ::mlir::IntegerAttr | 32-bit signless integer attribute |

| Operand | Description |
|---|---|
| lock | index |
aiex.token (::xilinx::AIEX::TokenOp)
Declare a token (a logical lock)
Syntax:
operation ::= `aiex.token` `(` $value `)` attr-dict
This operation creates a logical lock. It implements the Symbol interface so that it can be referenced globally. Unlike physical locks, logical locks are unlimited in number, and any integer value can be associated with a lock. A logical lock is used to manually specify dependences between tasks or core executions.
The operation can also be generated automatically when Dependence Analysis can be leveraged.
Example:
AIE.token(0) {sym_name = "token0"} // Declare token0 with initial value of 0
...
AIE.useToken @token0("Acquire", 0) // acquire token0 if its value is 0
...
AIE.useToken @token0("Release", 5) // release token0 and set its value to 5
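The acquire/release behaviour described in the example comments can be illustrated with a small model. This is a hedged Python sketch, not how the dialect lowers tokens: the `Token` class and its method names are hypothetical, acquisition is modelled as a non-blocking check that a caller would retry, and the sketch assumes that acquiring does not itself change the token's value (the text above does not say either way).

```python
class Token:
    """Hypothetical model of a logical lock: a named integer value."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def try_acquire(self, expected):
        """Succeed only if the token currently holds `expected` (assumed semantics)."""
        return self.value == expected

    def release(self, new_value):
        """Release the token and set its value to `new_value`."""
        self.value = new_value


token0 = Token("token0", 0)       # declare token0 with initial value 0
acquired = token0.try_acquire(0)  # succeeds because the value is 0
token0.release(5)                 # release token0 and set its value to 5
print(acquired, token0.value)     # True 5
```

A consumer waiting on value 5 would now be able to acquire the token, which is how a producer/consumer dependence between two tasks can be expressed.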
Interfaces: Symbol
| Attribute | MLIR Type | Description |
|---|---|---|
| value | ::mlir::IntegerAttr | 32-bit signless integer attribute |
aiex.useToken (::xilinx::AIEX::UseTokenOp)
Acquire/release a logical lock
Syntax:
operation ::= `aiex.useToken` $tokenName `(` $action `,` $value `)` attr-dict
This operation uses a token (a logical lock). A logical lock can be acquired or released with a value. Like UseLockOp, this operation can be understood as a "blocking" op.
| Attribute | MLIR Type | Description |
|---|---|---|
| tokenName | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
| value | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| action | xilinx::AIE::LockActionAttr | lock acquire/release |
AIEX type representing a block floating point type.
Syntax:
!aiex.bfp<
::llvm::StringRef # block_type
>
This type represents a block floating point value. It is meant to eventually be lowered into a standard type further down the pipeline. In the meantime, it can be used for block-floating-point-related dataflow adaptations. Available types are v8bfp16ebs8 and v16bfp16ebs16.
| Parameter | C++ type | Description |
|---|---|---|
| block_type | ::llvm::StringRef | |
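As an illustration of the `block_type` parameter, the sketch below checks a name against the two block types listed above and formats the corresponding type string. It is a minimal Python sketch under assumptions: the helper `bfp_type` is hypothetical, and the quoted form `!aiex.bfp<"...">` is assumed from the syntax shown above rather than confirmed by this document.

```python
# Block types named in the description above.
KNOWN_BLOCK_TYPES = ("v8bfp16ebs8", "v16bfp16ebs16")


def bfp_type(block_type):
    """Format an !aiex.bfp type string for a listed block type (assumed spelling)."""
    if block_type not in KNOWN_BLOCK_TYPES:
        raise ValueError(f"unknown block type: {block_type!r}")
    return f'!aiex.bfp<"{block_type}">'


print(bfp_type("v8bfp16ebs8"))
```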
AIE Architecture
| Symbol | Value | String |
|---|---|---|
| AIE1 | 1 | AIE1 |
| AIE2 | 2 | AIE2 |
| AIE2p | 3 | AIE2p |
AIE Device
| Symbol | Value | String |
|---|---|---|
| xcvc1902 | 1 | xcvc1902 |
| xcve2302 | 2 | xcve2302 |
| xcve2802 | 3 | xcve2802 |
| npu1 | 4 | npu1 |
| npu1_1col | 5 | npu1_1col |
| npu1_2col | 6 | npu1_2col |
| npu1_3col | 7 | npu1_3col |
| npu2 | 8 | npu2 |
| npu2_1col | 9 | npu2_1col |
| npu2_2col | 10 | npu2_2col |
| npu2_3col | 11 | npu2_3col |
| npu2_4col | 12 | npu2_4col |
| npu2_5col | 13 | npu2_5col |
| npu2_6col | 14 | npu2_6col |
| npu2_7col | 15 | npu2_7col |
Directions for cascade
| Symbol | Value | String |
|---|---|---|
| South | 3 | South |
| West | 4 | West |
| North | 5 | North |
| East | 6 | East |
DMA Channel direction
| Symbol | Value | String |
|---|---|---|
| S2MM | 0 | S2MM |
| MM2S | 1 | MM2S |
Lock acquire/release
| Symbol | Value | String |
|---|---|---|
| Acquire | 0 | Acquire |
| AcquireGreaterEqual | 2 | AcquireGreaterEqual |
| Release | 1 | Release |
Lock operation is blocking
| Symbol | Value | String |
|---|---|---|
| NonBlocking | 0 | NonBlocking |
| Blocking | 1 | Blocking |
Ports of an object FIFO
| Symbol | Value | String |
|---|---|---|
| Produce | 0 | Produce |
| Consume | 1 | Consume |
Bundle of wires
| Symbol | Value | String |
|---|---|---|
| Core | 0 | Core |
| DMA | 1 | DMA |
| FIFO | 2 | FIFO |
| South | 3 | South |
| West | 4 | West |
| North | 5 | North |
| East | 6 | East |
| PLIO | 7 | PLIO |
| NOC | 8 | NOC |
| Trace | 9 | Trace |
| TileControl | 10 | TileControl |
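The symbol/value pairs in the "Bundle of wires" table can be mirrored in host-side tooling. The following Python sketch is an illustration only, not generated binding code: the class names `WireBundle` and `CascadeDir` are hypothetical, and the values are copied from the "Bundle of wires" and "Directions for cascade" tables. It also checks that the four cascade directions use the same numeric values as the like-named wire bundles.

```python
from enum import IntEnum


class WireBundle(IntEnum):
    """Values copied from the "Bundle of wires" table."""
    Core = 0
    DMA = 1
    FIFO = 2
    South = 3
    West = 4
    North = 5
    East = 6
    PLIO = 7
    NOC = 8
    Trace = 9
    TileControl = 10


class CascadeDir(IntEnum):
    """Values copied from the "Directions for cascade" table."""
    South = 3
    West = 4
    North = 5
    East = 6


# Look up a bundle by its string form, and a name by its numeric value.
print(WireBundle["DMA"].value)  # 1
print(WireBundle(10).name)      # TileControl

# Every cascade direction shares its value with the like-named wire bundle.
shared = all(WireBundle[d.name].value == d.value for d in CascadeDir)
print(shared)  # True
```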