-aie-assign-bd-ids

Assign bd ids to aie.dma_bd ops.

-aie-assign-buffer-addresses

Assign memory locations for buffers in each tile

Buffers in a tile generally have an address that does not significantly matter in the design. Hence, most of the time we can instantiate aie.buffer operations without an address. This pass updates each aie.buffer operation without an address to have a well-defined address. This enables later passes to have a consistent view of the memory map of a system.

Options

-basic-alloc : Flag to enable the basic sequential allocation scheme (not bank-aware).
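
As a rough illustration of what a basic sequential scheme could look like, the sketch below places every buffer that lacks an address after the highest preassigned buffer. It is not the pass's implementation: the Buffer class, the assign_addresses helper, the 32-byte alignment and the starting offset are all assumptions made for the example, and bank-aware placement is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Buffer:
    name: str
    size: int                      # size in bytes
    address: Optional[int] = None  # None means "no address given"


def assign_addresses(buffers: List[Buffer], base: int = 0, align: int = 32) -> List[Buffer]:
    """Sequentially place every buffer that has no address (hypothetical policy)."""
    # Start past the end of the highest preassigned buffer to avoid overlap.
    cursor = base
    for b in buffers:
        if b.address is not None:
            cursor = max(cursor, b.address + b.size)
    # Give each remaining buffer the next aligned address.
    for b in buffers:
        if b.address is None:
            cursor = (cursor + align - 1) // align * align
            b.address = cursor
            cursor += b.size
    return buffers


bufs = assign_addresses([Buffer("a", 100), Buffer("b", 64, 0x400), Buffer("c", 40)])
for b in bufs:
    print(b.name, hex(b.address))
```

Buffers that already carry an address are left untouched, which mirrors the description above: only aie.buffer operations without an address are updated.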

-aie-assign-lock-ids

Assigns the lockIDs of locks that do not have IDs.
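
One plausible way to picture this is "lowest free ID per tile". The sketch below is an assumption-laden model, not the pass itself: the dictionary representation of a lock, the assign_lock_ids helper and the limit of 16 locks per tile are all hypothetical.

```python
def assign_lock_ids(locks, locks_per_tile=16):
    """Fill in missing lock IDs with the lowest free ID on the same tile.

    locks: list of dicts with a 'tile' key and an 'id' key (None if unassigned).
    locks_per_tile is an assumed hardware limit, used only for illustration.
    """
    used = {}
    # First pass: record IDs that are already fixed by the design.
    for lock in locks:
        if lock["id"] is not None:
            used.setdefault(lock["tile"], set()).add(lock["id"])
    # Second pass: hand out the lowest unused ID on each tile.
    for lock in locks:
        if lock["id"] is None:
            taken = used.setdefault(lock["tile"], set())
            free = next((i for i in range(locks_per_tile) if i not in taken), None)
            if free is None:
                raise ValueError(f"no free lock ID left on tile {lock['tile']}")
            lock["id"] = free
            taken.add(free)
    return locks


locks = [
    {"tile": (1, 1), "id": 0},
    {"tile": (1, 1), "id": None},
    {"tile": (1, 1), "id": 2},
    {"tile": (1, 1), "id": None},
    {"tile": (2, 1), "id": None},
]
print([l["id"] for l in assign_lock_ids(locks)])
```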

-aie-assign-tile-controller-ids

Assign controller id per aie.tile_op

For each aie.tile_op used in the design, assign a unique controller ID.

Options

-column-wise-unique-ids : Flag to generate controller ids only unique within each column. Otherwise globally unique.
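
The difference between the two numbering modes can be sketched as follows. This is only a model under assumptions: the (col, row) tuple representation, the assign_controller_ids helper and the traversal order are hypothetical and may not match the order the pass uses.

```python
def assign_controller_ids(tiles, column_wise_unique=False):
    """Return {(col, row): controller_id} for the given tiles.

    With column_wise_unique=True, IDs restart at 0 in every column;
    otherwise a single counter runs across the whole design.
    """
    ids = {}
    next_global = 0
    next_in_column = {}
    for col, row in sorted(tiles):
        if column_wise_unique:
            ids[(col, row)] = next_in_column.get(col, 0)
            next_in_column[col] = ids[(col, row)] + 1
        else:
            ids[(col, row)] = next_global
            next_global += 1
    return ids


tiles = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(assign_controller_ids(tiles))
print(assign_controller_ids(tiles, column_wise_unique=True))
```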

-aie-canonicalize-device

Canonicalize Designs to include a toplevel device

This pass inserts a toplevel device operation in designs that do not have one. This allows us to support backwards compatibility for older models targeting the VC1902 device without explicit device operations.

-aie-create-pathfinder-flows

Route aie.flow and aie.packetflow operations through switchboxes

Uses the Pathfinder congestion-aware algorithm. Each aie.flow is replaced with aie.connect operations. Each aie.packetflow is replaced with a set of aie.amsel, aie.masterset and aie.packet_rules operations.

Options

-route-circuit : Flag to enable aie.flow lowering.
-route-packet  : Flag to enable aie.packetflow lowering.

-aie-find-flows

Recover flows from switchbox configuration

Under normal circumstances, every configured aie.switchbox operation should contribute to describing an end-to-end flow from one point to another. These flows may be circuit-switched flows (represented by aie.flow) or packet-switched flows (represented by aie.packetflow). This pass recovers those flows from the switchbox configuration and is primarily used for testing automatic routing.
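
The idea of recovering a circuit-switched flow can be sketched by following connections from switchbox to switchbox until a non-directional port is reached. The model below is hypothetical: the trace_flow helper, the dictionary encoding of a switchbox and the port names without channel numbers are simplifications chosen for the example, and packet-switched flows are not covered.

```python
# Moving out of a directional port lands on the neighboring tile,
# entering through the opposite port.
STEP = {"East": (1, 0), "West": (-1, 0), "North": (0, 1), "South": (0, -1)}
OPPOSITE = {"East": "West", "West": "East", "North": "South", "South": "North"}


def trace_flow(switchboxes, tile, in_port):
    """Follow connections from (tile, in_port) to the flow's endpoint.

    switchboxes: {tile: {in_port: out_port}}, one entry per configured connection.
    Returns (end_tile, end_port).
    """
    while True:
        out_port = switchboxes[tile][in_port]
        if out_port not in STEP:
            # A non-directional port such as "DMA" or "Core" ends the flow.
            return tile, out_port
        dx, dy = STEP[out_port]
        tile = (tile[0] + dx, tile[1] + dy)
        in_port = OPPOSITE[out_port]


switchboxes = {
    (0, 0): {"DMA": "East"},
    (1, 0): {"West": "North"},
    (1, 1): {"South": "Core"},
}
print(trace_flow(switchboxes, (0, 0), "DMA"))
```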

-aie-generate-column-control-overlay

Spawns streaming interconnect network for column control

For each column of AIE tiles being employed in the design, spawn a network of control packet streaming interconnects which overlay on top of the design.

Options

-route-shim-to-tct       : Flag to generate TCT routing between tile CTRL and shim SOUTH ports. Available options: ['shim-only', 'all-tiles', 'disable'].
-route-shim-to-tile-ctrl : Flag to generate routing between shim DMA and tile CTRL ports, for configuration.

-aie-localize-locks

Convert global locks to a core-relative index

An individual lock can be referenced by 4 different AIE cores. However, each individual core accesses the lock through a different ‘lock address space’. This pass converts a lock in the conceptual global address space into a local index. e.g.:

%lock = AIE.lock(%tile, 2)
AIE.core(%tile) {
  AIE.useLock(%lock, "Acquire", 1)
}

becomes

AIE.core(%tile) {
  %lockindex = arith.constant ? : index
  AIE.useLock(%lockindex, "Acquire", 1)
}
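
The conversion can be pictured as adding a per-neighbor base offset to the lock's ID within its tile. The sketch below assumes a layout in which a core sees four banks of 16 locks, one per neighboring memory; the bank size, the base offsets and the localize_lock helper are assumptions for illustration and should be checked against the target device model before relying on them.

```python
LOCKS_PER_TILE = 16  # assumed number of locks in one tile's memory

# Assumed base of each neighboring lock bank as seen from a core.
BANK_BASE = {"south": 0, "west": 16, "north": 32, "east": 48}


def localize_lock(lock_id, direction):
    """Map a lock ID within its tile to a core-relative index.

    direction names where the lock's memory sits relative to the accessing core.
    """
    if not 0 <= lock_id < LOCKS_PER_TILE:
        raise ValueError(f"lock ID {lock_id} out of range")
    return BANK_BASE[direction] + lock_id


# The same lock gets a different local index depending on which core uses it.
for direction in ("south", "west", "north", "east"):
    print(direction, localize_lock(3, direction))
```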

-aie-lower-cascade-flows

Lower aie.cascade_flow operations through aie.configure_cascade operations

Replace each aie.cascade_flow operation with an equivalent set of aie.configure_cascade operations.

-aie-normalize-address-spaces

Remove non-default address spaces

Early in the flow, it is convenient to represent multiple memories using different address spaces. However, after outlining code for AIE engines, the core itself only has access to a single address space. To avoid confusion, this pass normalizes any address spaces remaining in the code, converting them to the default address space.

-aie-objectFifo-stateful-transform

Instantiate the buffers and locks of aie.objectFifo.createObjectFifo operations

Replace each aie.objectFifo.createObjectFifo operation with aie.buffer and aie.lock operations in the producer tile. Convert aie.objectFifo.acquire, aie.objectFifo.release and aie.objectFifo.subviewAccess operations into useLock operations by keeping track of acquire/release operations on each objectFifo by each process.

If the producer and consumer tiles of an aie.objectFifo.createObjectFifo operation are not adjacent, the pass also establishes aie.flow and aie.dma operations to enable communication between the tiles. The body of each loop that contains operations on objectFifos is extended so that the loop is unrolled based on the number of elements in the objectFifos. If the number of iterations of the loop is not evenly divisible by the unrolling factor, the pass duplicates the loop body after the original loop.

Options

-dynamic-objFifos : Flag to enable dynamic object fifo lowering in cores instead of loop unrolling.
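
The arithmetic behind the unrolling can be sketched as follows. The sketch assumes the unrolling factor is the least common multiple of the objectFifo sizes touched by the loop; the text above only says the factor is based on the number of elements, so treat that choice, and the unroll_plan helper itself, as hypothetical.

```python
from functools import reduce
from math import gcd


def unroll_plan(trip_count, fifo_sizes):
    """Return (factor, unrolled_iterations, remainder) for a loop.

    factor: assumed to be the LCM of the objectFifo sizes used in the loop.
    unrolled_iterations: trips of the unrolled loop, each running `factor` bodies.
    remainder: copies of the body duplicated after the loop.
    """
    factor = reduce(lambda a, b: a * b // gcd(a, b), fifo_sizes, 1)
    return factor, trip_count // factor, trip_count % factor


print(unroll_plan(10, [2, 3]))
print(unroll_plan(8, [2, 4]))
```

In the first call the trip count is not a multiple of the factor, so the leftover iterations correspond to the duplicated loop bodies placed after the original loop.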

-aie-register-objectFifos

Generate acquire/release patterns for producer/consumer processes registered to an objectFifo

Generate acquire/release patterns in the CoreOps of associated cores for each aie.objectfifo.register_process operation. Patterns are generated as for loops of different sizes depending on input patterns.

-aie-standard-lowering

Lowering operations in AIE cores’ regions to Standard

Outline code inside AIE.core operations into the llvm dialect. BufferOp operations are converted to a GlobalMemrefOp and references to those buffers are converted to GetGlobalMemrefOp. Other AIE operations inside the cores are generally lowered to appropriate function intrinsics. Other AIE operations (e.g. CoreOp, TileOp, LockOp) outside the core are removed.

Optionally, tileCol and tileRow can specify a single core to export.

Options

-tilecol : X coordinate of tile to generate code for
-tilerow : Y coordinate of tile to generate code for

-aie-vector-opt

Optimize vector instructions for AIE

After super-vectorization, some additional optimizations are important for improving QOR and enabling lowering to LLVM.