-aie-assign-bd-ids

Assign buffer descriptor (BD) IDs to aie.dma_bd operations.

-aie-assign-buffer-addresses

Assign memory locations for buffers in each tile

Buffers in a tile generally have an address that does not significantly matter in the design. Hence, most of the time we can instantiate aie.buffer operations without an address. This pass updates each aie.buffer operation that lacks an address so that it has a well-defined address. This gives later passes a consistent view of the memory map of the system.
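The idea can be illustrated with a minimal Python sketch. It is not the pass's actual allocator: the `Buffer` class, the `assign_addresses` helper, the base address, and the 4-byte alignment are all assumptions made for illustration. Buffers that already carry an address keep it, and every other buffer receives the next free aligned address that does not overlap an occupied region.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Buffer:
    """Hypothetical stand-in for an aie.buffer operation."""
    name: str
    size: int                      # size in bytes
    address: Optional[int] = None  # None means "no address given"


def assign_addresses(buffers, base=0, alignment=4):
    """Give every buffer without an address the next free aligned address.

    The base and alignment values are illustrative assumptions.
    """
    # Regions already claimed by buffers with an explicit address.
    taken = [(b.address, b.address + b.size)
             for b in buffers if b.address is not None]
    cursor = base
    for buf in buffers:
        if buf.address is not None:
            continue  # keep user-specified addresses untouched
        while True:
            cursor = -(-cursor // alignment) * alignment  # round up
            clash = next(((s, e) for s, e in taken
                          if cursor < e and s < cursor + buf.size), None)
            if clash is None:
                break
            cursor = clash[1]  # skip past the occupied region and retry
        buf.address = cursor
        taken.append((cursor, cursor + buf.size))
        cursor += buf.size
    return buffers


if __name__ == "__main__":
    bufs = [Buffer("a", 16), Buffer("b", 8, address=16), Buffer("c", 10)]
    for b in assign_addresses(bufs):
        print(b.name, hex(b.address))
```

After such a step every buffer has a concrete address, which is the property later passes rely on.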

-aie-assign-lock-ids

Assigns the lockIDs of locks that do not have IDs.
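As a rough illustration, the assignment can be sketched in Python as follows. This is a sketch under assumptions, not the pass's implementation: the dictionary representation of locks, the `assign_lock_ids` name, the lowest-free-ID policy, and the default of 16 locks per tile are all hypothetical.

```python
def assign_lock_ids(locks, locks_per_tile=16):
    """Fill in missing lock IDs, tile by tile.

    `locks` is a list of dicts with a 'tile' key and an 'id' key, where
    an 'id' of None means the lock has no ID yet. The per-tile limit is
    an illustrative assumption.
    """
    # First pass: record the IDs that are already in use on each tile.
    used = {}
    for lock in locks:
        if lock["id"] is not None:
            used.setdefault(lock["tile"], set()).add(lock["id"])

    # Second pass: hand out the lowest free ID to each unassigned lock.
    for lock in locks:
        if lock["id"] is not None:
            continue
        taken = used.setdefault(lock["tile"], set())
        free = next((i for i in range(locks_per_tile) if i not in taken), None)
        if free is None:
            raise ValueError(f"no free lock ID left on tile {lock['tile']}")
        lock["id"] = free
        taken.add(free)
    return locks


if __name__ == "__main__":
    demo = [
        {"tile": (1, 1), "id": None},
        {"tile": (1, 1), "id": 0},
        {"tile": (2, 2), "id": None},
    ]
    for lock in assign_lock_ids(demo):
        print(lock)
```

Collecting the explicit IDs before assigning any new ones ensures that a lock with a user-chosen ID never collides with an automatically assigned one.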

-aie-canonicalize-device

Canonicalize designs to include a top-level device

This pass inserts a top-level device operation in designs that do not have one. This allows us to support backwards compatibility for older models targeting the VC1902 device without explicit device operations.

-aie-create-packet-flows

Route aie.packetflow operations through switchboxes

Replace each aie.packetflow operation with an equivalent set of aie.switchbox and aie.wire operations.

-aie-create-pathfinder-flows

Route aie.flow operations through switchboxes using the Pathfinder algorithm

Replace each aie.flow operation with an equivalent set of aie.switchbox and aie.wire operations. Routing uses the congestion-aware Pathfinder algorithm.
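To give a feel for congestion-aware routing, here is a simplified Python sketch of negotiated-congestion routing in the spirit of Pathfinder. It is not the code used by this pass: the graph representation, the cost formula, the function names, and the iteration limit are all assumptions chosen for brevity. Each net is routed by shortest path, edges used beyond their capacity accumulate a history penalty, and routing repeats until no edge is overused.

```python
import heapq


def shortest_path(edges, src, dst, usage, history, capacity):
    """Dijkstra over a directed graph with congestion-dependent edge costs."""
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v in edges.get(u, []):
            e = (u, v)
            # Present congestion: how far this use would exceed capacity.
            overuse = max(0, usage.get(e, 0) + 1 - capacity)
            # Illustrative cost: history penalty times present penalty.
            cost = (1 + history.get(e, 0)) * (1 + overuse)
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                prev[v] = u
                heapq.heappush(heap, (d + cost, v))
    if dst not in dist:
        raise ValueError(f"no path from {src} to {dst}")
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def route(edges, nets, capacity=1, max_iters=20):
    """Route all nets; return a list of paths, or None if congestion remains."""
    history = {}
    for _ in range(max_iters):
        usage, paths = {}, []
        for src, dst in nets:
            path = shortest_path(edges, src, dst, usage, history, capacity)
            paths.append(path)
            for e in zip(path, path[1:]):
                usage[e] = usage.get(e, 0) + 1
        overused = [e for e, n in usage.items() if n > capacity]
        if not overused:
            return paths
        for e in overused:
            history[e] = history.get(e, 0) + 1  # remember contested edges
    return None


if __name__ == "__main__":
    graph = {"S1": ["M"], "S2": ["M", "X"], "M": ["N"], "N": ["T1", "T2"],
             "X": ["Y"], "Y": ["Z"], "Z": ["T2"]}
    print(route(graph, [("S2", "T2"), ("S1", "T1")]))
```

In the demo graph both nets initially want the edge from M to N. The growing history penalty eventually pushes the net that has an alternative onto its longer detour, leaving the shared edge to the net that has no other option.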

-aie-find-flows

Recover flows from switchbox configuration

Under normal circumstances, every configured aie.switchbox operation should contribute to describing an end-to-end flow from one point to another. These flows may be circuit-switched (represented by aie.flow) or packet-switched (represented by aie.packetflow). This pass recovers those flows from the switchbox configuration and is primarily used for testing automatic routing.
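The recovery of circuit-switched flows can be pictured with the following Python sketch. It is a simplified model under stated assumptions, not the pass's implementation: switchboxes are represented as a hypothetical dictionary from (column, row) to a list of (source port, destination port) pairs, channels and fan-out are ignored, and North is assumed to mean the next higher row. Starting from each non-directional source port such as a core or DMA, the sketch follows connections from tile to tile until it reaches another endpoint.

```python
# Assumed geometry: East increases the column, North increases the row.
STEP = {"North": (0, 1), "South": (0, -1), "East": (1, 0), "West": (-1, 0)}
OPPOSITE = {"North": "South", "South": "North", "East": "West", "West": "East"}


def find_flows(switchboxes):
    """Return ((src_tile, src_port), (dst_tile, dst_port)) for each full flow."""
    flows = []
    for tile, conns in switchboxes.items():
        for src, dst in conns:
            if src in STEP:
                continue  # arrives from a neighbor, so not the start of a flow
            cur_tile, cur_dst = tile, dst
            seen = set()
            while cur_dst in STEP:
                if (cur_tile, cur_dst) in seen:
                    break  # routing loop, give up on this flow
                seen.add((cur_tile, cur_dst))
                dc, dr = STEP[cur_dst]
                nxt = (cur_tile[0] + dc, cur_tile[1] + dr)
                arrive = OPPOSITE[cur_dst]
                # Find where the neighbor forwards traffic arriving on `arrive`.
                nxt_dst = next((d for s, d in switchboxes.get(nxt, [])
                                if s == arrive), None)
                if nxt_dst is None:
                    break  # dangling connection, no end-to-end flow
                cur_tile, cur_dst = nxt, nxt_dst
            else:
                # Loop ended without a break: we reached a non-directional port.
                flows.append(((tile, src), (cur_tile, cur_dst)))
    return flows


if __name__ == "__main__":
    boxes = {
        (0, 1): [("DMA", "East")],
        (1, 1): [("West", "North")],
        (1, 2): [("South", "Core")],
    }
    print(find_flows(boxes))
```

Comparing the recovered endpoints against the flows that were originally requested is one way such a result could be used to check a router.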

-aie-localize-locks

Convert global locks to a core-relative index

An individual lock can be referenced by four different AIE cores. However, each individual core accesses the lock through a different ‘lock address space’. This pass converts a lock in the conceptual global address space into a local index. For example:

%lock = AIE.lock(%tile, 2)
AIE.core(%tile) {
  AIE.useLock(%lock, "Acquire", 1)
}

becomes

AIE.core(%tile) {
  %lockindex = arith.constant ? : index
  AIE.useLock(%lockindex, "Acquire", 1)
}

-aie-lower-cascade-flows

Lower aie.cascade_flow operations through aie.configure_cascade operations

Replace each aie.cascade_flow operation with an equivalent set of aie.configure_cascade operations.

-aie-normalize-address-spaces

Remove non-default address spaces

Early in the flow, it is convenient to represent multiple memories using different address spaces. However, after outlining code for AIE engines, the core itself only has access to a single address space. To avoid confusion, this pass normalizes any address spaces remaining in the code, converting them to the default address space.

-aie-objectFifo-stateful-transform

Instantiate the buffers and locks of aie.objectFifo.createObjectFifo operations

Replace each aie.objectFifo.createObjectFifo operation with aie.buffer and aie.lock operations in the producer tile. Convert aie.objectFifo.acquire, aie.objectFifo.release and aie.objectFifo.subviewAccess operations into useLock operations by keeping track of acquire/release operations on each objectFifo by each process.

If the producer and consumer tiles of an aie.objectFifo.createObjectFifo operation are not adjacent, the pass also establishes aie.flow and aie.dma operations to enable communication between the tiles. The pass extends the body of each loop that contains operations on objectFifos so that the loop is unrolled based on the number of elements in the objectFifos. If the number of loop iterations is not evenly divisible by the unrolling factor, the pass duplicates the loop body after the original loop to cover the remaining iterations.
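The unrolling scheme can be sketched in Python as follows. This is only an illustration of unrolling with a remainder, not the transformation the pass applies to the IR: the `run_unrolled` helper and its `slot` argument, which stands for the objectFifo element used by each copy of the body, are hypothetical.

```python
def run_unrolled(trip_count, factor, body):
    """Execute `body` trip_count times using a loop unrolled by `factor`.

    `body(iteration, slot)` receives the original iteration number and the
    position of this copy inside the unrolled body.
    """
    main_iters = trip_count // factor
    remainder = trip_count % factor

    # Unrolled loop: each trip runs `factor` copies of the original body.
    for i in range(main_iters):
        for slot in range(factor):
            body(i * factor + slot, slot)

    # Leftover iterations: body copies placed after the loop.
    for slot in range(remainder):
        body(main_iters * factor + slot, slot)


if __name__ == "__main__":
    # Seven iterations over an objectFifo assumed to hold three elements.
    run_unrolled(7, 3, lambda it, slot: print(f"iteration {it} uses element {slot}"))
```

With seven iterations and a factor of three, the unrolled loop runs twice and a single extra copy of the body handles the last iteration.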

-aie-register-objectFifos

Generate acquire/release patterns for producer/consumer processes registered to an objectFifo

Generate acquire/release patterns in the CoreOps of associated cores for each aie.objectfifo.register_process operation. Patterns are generated as for loops of different sizes depending on input patterns.

-aie-standard-lowering

Lower operations in AIE cores’ regions to Standard

Outline code inside AIE.core operations into the llvm dialect. BufferOp operations are converted to a GlobalMemrefOp and references to those buffers are converted to GetGlobalMemrefOp. Other AIE operations inside the cores are generally lowered to appropriate function intrinsics. Other AIE operations (e.g. CoreOp, TileOp, LockOp) outside the core are removed.

Optionally, the tilecol and tilerow options can specify a single core to export.

Options

-tilecol : X coordinate of tile to generate code for
-tilerow : Y coordinate of tile to generate code for

-aie-vector-opt

Optimize vector instructions for AIE

After super-vectorization, some additional optimizations are important for improving quality of results (QoR) and enabling lowering to LLVM.