transform.air.herd_vectorize (transform::AIRHerdVectorizeOp)

Vectorize operations inside air.herd operations

Syntax:

operation ::= `transform.air.herd_vectorize` $target attr-dict

This transform takes a handle to air.herd operations and vectorizes the operations inside their bodies using the same logic as the AIRHerdVectorizePass. It walks the body of each herd operation and applies vectorization patterns to linalg operations and other vectorizable operations.

The transform supports the same options as the AIRHerdVectorizePass:

  • vectorize_nd_extract: Controls whether to vectorize tensor.extract when the input tensor is rank >= 2
  • flatten_1d_depthwise_conv: Controls whether to “flatten” the channel dimension when vectorizing 1D depthwise convolutions
  • disable_transfer_permutation_map_lowering_patterns: Disables vector transfer permutation map lowering patterns
  • disable_multi_reduction_to_contract_patterns: Disables multi-reduction to contract patterns
  • vectorize_padding: Enables vectorization of padding operations

Example:

%herd = transform.structured.match ops{["air.herd"]} in %f : (!pdl.operation) -> !pdl.operation
%vectorized = transform.air.herd_vectorize %herd {
  vectorize_nd_extract = false,
  flatten_1d_depthwise_conv = false,
  vectorize_padding = true
} : (!pdl.operation) -> !pdl.operation

Returns a handle to the transformed air.herd operations.

Traits: FunctionalStyleTransformOpTrait, TransformEachOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Attributes:

Attribute MLIR Type Description
vectorize_nd_extract ::mlir::BoolAttr bool attribute
flatten_1d_depthwise_conv ::mlir::BoolAttr bool attribute
disable_transfer_permutation_map_lowering_patterns ::mlir::BoolAttr bool attribute
disable_multi_reduction_to_contract_patterns ::mlir::BoolAttr bool attribute
vectorize_padding ::mlir::BoolAttr bool attribute

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
result PDL handle to an mlir::Operation *

transform.air.hoist_static_alloc (transform::AIRHoistStaticAllocOp)

Hoist static allocations.

Syntax:

operation ::= `transform.air.hoist_static_alloc` $target attr-dict `:` functional-type(operands, results)

Moves certain statically-sized memref.alloc operations from inner blocks to the entry block of the target function. This shortens and unifies buffer lifetimes, which can unlock reuse and downstream optimizations.

Notes / limitations

  • Currently targets memref.alloc buffers with static shapes.
  • Uses that require exact type equality across region boundaries (e.g., scf.yield, func.return) are not rewritten; such allocations are skipped.
  • Hoisting increases the buffer’s lifetime; apply with care on large buffers.
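
The eligibility rules in the notes above can be modelled with a small Python sketch. This is a toy illustration, not the pass implementation: the `Alloc` class, the `BOUNDARY_USERS` set, and `is_hoistable` are hypothetical names, and the real transform may apply further checks.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


# Hypothetical stand-in for a memref.alloc; not an MLIR API.
@dataclass
class Alloc:
    shape: Tuple[Optional[int], ...]                # None marks a dynamic dimension
    users: List[str] = field(default_factory=list)  # names of ops that use the buffer


# Users assumed to require exact type equality across a region boundary.
BOUNDARY_USERS = {"scf.yield", "func.return"}


def is_hoistable(alloc: Alloc) -> bool:
    """Return True if the toy allocation passes both checks from the notes."""
    static = all(dim is not None for dim in alloc.shape)
    crosses_boundary = any(user in BOUNDARY_USERS for user in alloc.users)
    return static and not crosses_boundary
```

In this model, `Alloc((64,), ["linalg.fill", "memref.dealloc"])` is hoistable, while an allocation with a dynamic dimension or an `scf.yield` user is skipped.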

Example

Before:

func.func @foo(%arg0: memref<64xi32>) {
  scf.for %i = %c0 to %c4 step %c1 {
    %tmp = memref.alloc() : memref<64xi32>
    linalg.fill ins(%cst : i32) outs(%tmp : memref<64xi32>)
    memref.dealloc %tmp : memref<64xi32>
  }
  return
}

After:

func.func @foo(%arg0: memref<64xi32>) {
  %tmp.hoisted = memref.alloc() : memref<64xi32>
  scf.for %i = %c0 to %c4 step %c1 {
    linalg.fill ins(%cst : i32) outs(%tmp.hoisted : memref<64xi32>)
  }
  memref.dealloc %tmp.hoisted : memref<64xi32>
  return
}

Usage (Transform dialect)

transform.sequence %arg0 : !pdl.operation failures(propagate) {
^bb0(%f: !pdl.operation):
  transform.air.hoist_static_alloc %f
    : (!pdl.operation) -> ()
}

Traits: ReportTrackingListenerFailuresOpTrait, TransformEachOpTrait

Interfaces: MemoryEffectOpInterface, TransformOpInterface

Operands:

Operand Description
target PDL handle to an mlir::Operation *

transform.air.convert_memref_copy_to_linalg_copy (transform::ConvertMemrefCopyToLinalgCopyOp)

Convert memref.copy operations to linalg.copy operations

Syntax:

operation ::= `transform.air.convert_memref_copy_to_linalg_copy` $target attr-dict

This transform converts memref.copy operations to linalg.copy operations. This can be useful for enabling further linalg-based optimizations and transformations.

The transformation replaces:

memref.copy %source, %dest : memref<...> to memref<...>

With:

linalg.copy ins(%source : memref<...>) outs(%dest : memref<...>)

Returns a handle to the modified operation containing the transformed copies.

Traits: FunctionalStyleTransformOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
result PDL handle to an mlir::Operation *

transform.air.copy_to_dma (transform::CopyToDmaOp)

Syntax:

operation ::= `transform.air.copy_to_dma` $target attr-dict

Transform a memref.copy operation into an air.dma_memcpy_nd operation. Returns the new air.dma_memcpy_nd operation.

Traits: FunctionalStyleTransformOpTrait, TransformEachOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
result PDL handle to an mlir::Operation *

transform.air.eliminate_cascade_memcpy (transform::EliminateCascadeMemcpyOp)

Eliminate intermediate memref buffers in cascaded DMA operations

Syntax:

operation ::= `transform.air.eliminate_cascade_memcpy` $target attr-dict

This transform identifies and eliminates intermediate memref buffers in cascaded air.dma_memcpy_nd operations. It looks for the pattern where an intermediate buffer is used exactly twice: once as the destination of a DMA operation and once as the source of another DMA operation, with both operations using default access patterns (empty offsets, sizes, and strides).

The transformation replaces:

air.dma_memcpy_nd (%intermediate[] [] [], %source[] [] []) : (memref<...>, memref<...>)
air.dma_memcpy_nd (%dest[] [] [], %intermediate[] [] []) : (memref<...>, memref<...>)

With:

air.dma_memcpy_nd (%dest[] [] [], %source[] [] []) : (memref<...>, memref<...>)

This optimization eliminates unnecessary intermediate memory allocations and reduces memory traffic, which is particularly beneficial for cascade patterns in AIR programs.
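
The matching rule described above (an intermediate buffer used exactly twice, once as a DMA destination and once as a DMA source, both with default access patterns) can be sketched as a toy Python model. The `Dma` tuple and `eliminate_cascade` function are hypothetical, and the sketch counts uses only among the listed DMA ops, whereas the real transform inspects all uses in the IR.

```python
from collections import Counter
from typing import List, NamedTuple


class Dma(NamedTuple):
    dst: str
    src: str
    default_access: bool = True  # True when offsets, sizes and strides are all empty


def eliminate_cascade(dmas: List[Dma]) -> List[Dma]:
    """Forward the source of each writer/reader pair that shares an intermediate buffer."""
    uses = Counter()
    for d in dmas:
        uses[d.dst] += 1
        uses[d.src] += 1
    result = list(dmas)
    for buf, count in uses.items():
        if count != 2:
            continue
        writers = [d for d in result if d.dst == buf]
        readers = [d for d in result if d.src == buf]
        if len(writers) != 1 or len(readers) != 1:
            continue
        w, r = writers[0], readers[0]
        if w is r or not (w.default_access and r.default_access):
            continue
        if result.index(w) > result.index(r):
            continue  # the write must come before the read
        # Replace the reader with a direct copy and drop the writer.
        result[result.index(r)] = Dma(r.dst, w.src, True)
        result.remove(w)
    return result
```

Applied to `[Dma("intermediate", "source"), Dma("dest", "intermediate")]`, the model returns the single copy `Dma("dest", "source")`; a pair in which either DMA uses a non-default access pattern is left unchanged.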

Returns a handle to the modified operation.

Traits: FunctionalStyleTransformOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
result PDL handle to an mlir::Operation *

transform.air.forall_with_reduce_to_parallel (transform::ForallWithReduceToParallelOp)

Converts a pattern of scf.forall and linalg.reduce to scf.parallel

Syntax:

operation ::= `transform.air.forall_with_reduce_to_parallel` $target attr-dict `:` functional-type(operands, results)

Traits: FunctionalStyleTransformOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
transformed variadic of PDL handle to an mlir::Operation *

transform.air.fuse_extf_linalg (transform::FuseExtfLinalgOp)

Fuse a linalg operation containing only arith.extf with its consumer

Syntax:

operation ::= `transform.air.fuse_extf_linalg` $first_op `,` $second_op attr-dict

This transform fuses two linalg operations where:

  1. The first operation contains only an arith.extf operation in its body (apart from terminator)
  2. The second operation directly consumes the result of the first operation

The fusion is performed by:

  1. Removing the arith.extf from the first operation
  2. Updating the input type in the second operation to use the original (narrower) type
  3. Adding arith.extf operations as needed to maintain type consistency
  4. Erasing the first operation

This optimization folds the arithmetic extensions into the linalg ops and enables the use of native intrinsics on narrower datatypes on targets such as AMD AIEs.

Example:

// Before fusion:
%0 = linalg.generic {
  ^bb0(%arg0: f16):
    %1 = arith.extf %arg0 : f16 to f32
    linalg.yield %1 : f32
} ins(%input : tensor<16xf16>) outs(%temp : tensor<16xf32>)

%result = linalg.generic {
  ^bb0(%arg0: f32, %arg1: f32):
    %2 = arith.addf %arg0, %arg1 : f32
    linalg.yield %2 : f32
} ins(%0, %other : tensor<16xf32>, tensor<16xf32>) outs(%output : tensor<16xf32>)

// After fusion:
%result = linalg.generic {
  ^bb0(%arg0: f16, %arg1: f32):
    %1 = arith.extf %arg0 : f16 to f32
    %2 = arith.addf %1, %arg1 : f32
    linalg.yield %2 : f32
} ins(%input, %other : tensor<16xf16>, tensor<16xf32>) outs(%output : tensor<16xf32>)

Returns a handle to the fused operation (the second operation after modification).

Traits: FunctionalStyleTransformOpTrait

Interfaces: MemoryEffectOpInterface, TransformOpInterface

Operands:

Operand Description
first_op PDL handle to an mlir::Operation *
second_op PDL handle to an mlir::Operation *

Results:

Result Description
fused_op PDL handle to an mlir::Operation *

transform.air.fuse_into_containing_op (transform::FuseIntoContainingMemrefOp)

Fuse a producer into a containing operation.

Syntax:

operation ::= `transform.air.fuse_into_containing_op` $producer_op `into` $containing_op attr-dict

Fuses the producer_op into the containing_op. Returns a handle to the fused ops.

The producer is a subview slice of a tiled op. This transform computes the accessed producer slice inside of the containing op (“tile and fuse”).

The containing op handle must be associated with exactly one payload op. The producer op handle may be associated with multiple payload ops. This transform fuses exactly one producer.

Return modes

If the producer could not be fused, this operation fails silently. This is the case when tiling fails or when the producer op has zero uses within the containing op. I.e., “producers” that are not consumed within the containing op are rejected by this operation.

This operation reads and frees the producer handle. This operation reads the containing op handle.

Interfaces: MemoryEffectOpInterface, TransformOpInterface

Operands:

Operand Description
producer_op PDL handle to an mlir::Operation *
containing_op PDL handle to an mlir::Operation *

Results:

Result Description
fused_op PDL handle to an mlir::Operation *

transform.air.fuse_truncf_linalg (transform::FuseTruncfLinalgOp)

Fuse a linalg operation containing only arith.truncf into its producer

Syntax:

operation ::= `transform.air.fuse_truncf_linalg` $truncf_op `,` $producer_op attr-dict

This transform fuses two linalg operations where:

  1. The truncf operation contains only an arith.truncf operation in its body (apart from terminator)
  2. The producer operation produces a result that is consumed by the truncf operation

The fusion is performed by:

  1. Taking the producer operation’s body
  2. Adding arith.truncf operation before the terminator
  3. Updating the output type to use the truncated (narrower) type
  4. Erasing both the original truncf operation and producer operation

This optimization folds the arithmetic truncations into the producer linalg ops, enabling the use of native intrinsics on narrower datatypes, such as AMD AIEs, and reducing intermediate memory storage requirements.

Example:

// Before fusion:
%0 = linalg.generic {
  ^bb0(%arg0: f32, %arg1: f32):
    %1 = arith.addf %arg0, %arg1 : f32
    linalg.yield %1 : f32
} ins(%input1, %input2 : tensor<16xf32>, tensor<16xf32>) outs(%temp : tensor<16xf32>)

%result = linalg.generic {
  ^bb0(%arg0: f32):
    %2 = arith.truncf %arg0 : f32 to f16
    linalg.yield %2 : f16
} ins(%0 : tensor<16xf32>) outs(%output : tensor<16xf16>)

// After fusion:
%result = linalg.generic {
  ^bb0(%arg0: f32, %arg1: f32):
    %1 = arith.addf %arg0, %arg1 : f32
    %2 = arith.truncf %1 : f32 to f16
    linalg.yield %2 : f16
} ins(%input1, %input2 : tensor<16xf32>, tensor<16xf32>) outs(%output : tensor<16xf16>)

Returns a handle to the fused operation (the producer operation after modification).

Traits: FunctionalStyleTransformOpTrait

Interfaces: MemoryEffectOpInterface, TransformOpInterface

Operands:

Operand Description
truncf_op PDL handle to an mlir::Operation *
producer_op PDL handle to an mlir::Operation *

Results:

Result Description
fused_op PDL handle to an mlir::Operation *

transform.air.get_segment_for (transform::GetSegmentForOp)

Gets a handle to the parent ‘air.segment’ of the given operation

Syntax:

operation ::= `transform.air.get_segment_for` $target attr-dict

Produces a handle to the parent air.segment op for each payload IR operation associated with the operand. Fails if a segment cannot be found. The operations associated with the result handle are the parent segments, in the same order as the operations associated with the operand, except that a segment that is the parent of more than one input appears only once.
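
The ordering and de-duplication rule can be illustrated with a short Python sketch. It is a toy model under stated assumptions: operations and segments are plain strings, and `parent_of` is a hypothetical lookup table standing in for walking up the payload IR.

```python
from typing import Dict, List, Optional


def parent_segments(ops: List[str], parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Map each op to its enclosing segment, keeping first-seen order without repeats."""
    parents: List[str] = []
    for op in ops:
        segment = parent_of.get(op)
        if segment is None:
            # Mirrors "Fails if a segment cannot be found".
            raise ValueError(f"no parent air.segment found for {op}")
        if segment not in parents:
            parents.append(segment)
    return parents
```

For inputs `["a", "b", "c"]` where `a` and `c` share `seg0` and `b` lives in `seg1`, the result is `["seg0", "seg1"]`.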

Traits: NavigationTransformOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
parent PDL handle to an mlir::Operation *

transform.air.linalg_promote (transform::LinalgPromoteOp)

Syntax:

operation ::= `transform.air.linalg_promote` $target attr-dict

Promotes the specified operands of the target into a separate memory buffer using the mlir::linalg::promoteSubViews utility.

This operation applies to Linalg ops that satisfy the mlir::linalg::promoteSubviewsPrecondition, otherwise it fails.

When successful, several optimization passes are run on the resulting IR. The return handle points to the target operation, which is modified in place.

The operation accepts as attributes the fields in mlir::linalg::LinalgPromotionOptions. In addition, the memory space of the allocated buffers can be specified with the memory_space attribute as “L1”, “L2” or “L3”. The default memory space is L1.

Example:

%0 = transform.structured.match ops{["linalg.matmul"]} in %code  : (!pdl.operation) -> !pdl.operation
%1 = transform.air.linalg_promote %0 {memory_space="L2", operands_to_promote=[0]}

The group_size attribute is used to apply promotion to multiple linalg ops. When group_size=N, the operands_to_promote attribute refers to N payload operations at a time and the operand indices apply to the operands of the N operations in the order they appear in the target handle.

For example,

%m = transform.structured.match ops{["linalg.matmul"]} in %f : (!pdl.operation) -> !pdl.operation
%fill = transform.structured.match ops{["linalg.fill"]} in %f : (!pdl.operation) -> !pdl.operation
%h = transform.merge_handles %fill, %m : !pdl.operation
// promote the output of the fill operation (index 1) and the output of the matmul operation (index 4) to L1 memory
transform.air.linalg_promote %h {"group_size"=2, "operands_to_promote"=[1,4], "memory_space"="L1"}
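
One way to read the group-wide indexing is sketched below in Python. This is an interpretation of the description above, not the implementation: `resolve_group_indices` is a hypothetical helper, and the operand counts 2 and 3 are assumptions for a group made of a fill followed by a matmul.

```python
from typing import List, Tuple


def resolve_group_indices(
    operand_counts: List[int], operands_to_promote: List[int]
) -> List[Tuple[int, int]]:
    """Translate flat group-wide operand indices into (op position, local operand index)."""
    resolved = []
    for flat in operands_to_promote:
        remaining = flat
        for op_pos, count in enumerate(operand_counts):
            if remaining < count:
                resolved.append((op_pos, remaining))
                break
            remaining -= count
        else:
            raise IndexError(f"operand index {flat} is out of range for the group")
    return resolved
```

With operand counts `[2, 3]` and `operands_to_promote = [1, 4]`, the sketch yields operand 1 of the first op in the group and operand 2 of the second op.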

Interfaces: MemoryEffectOpInterface, TransformOpInterface

Attributes:

Attribute MLIR Type Description
operands_to_promote ::mlir::ArrayAttr 64-bit integer array attribute
group_size ::mlir::IntegerAttr 64-bit signless integer attribute
use_full_tile_buffers ::mlir::ArrayAttr 1-bit boolean array attribute
use_full_tiles_by_default ::mlir::UnitAttr unit attribute
use_alloca ::mlir::UnitAttr unit attribute
alignment ::mlir::IntegerAttr 64-bit signless integer attribute
memory_space ::mlir::StringAttr string attribute

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
transformed PDL handle to an mlir::Operation *

transform.air.linalg_tile (transform::LinalgTileOp)

Tile a linalg operation with the given sizes. The new linalg operation and the generated loop are returned. Tiling is performed with transform::tileToForallOpImpl so that an scf.forall loop is generated whenever possible.

This is a variant of transform.structured.tile_using_forall.

Interfaces: MemoryEffectOpInterface, TransformOpInterface

Attributes:

Attribute MLIR Type Description
static_sizes ::mlir::DenseI64ArrayAttr i64 dense array attribute

Operands:

Operand Description
target PDL handle to an mlir::Operation *
dynamic_sizes variadic of PDL handle to an mlir::Operation *

Results:

Result Description
tiled_linalg_op PDL handle to an mlir::Operation *
loops PDL handle to an mlir::Operation *

transform.air.linalg_to_library_call (transform::LinalgToLibraryCallOp)

Convert a linalg op to a function call (library call)

Syntax:

operation ::= `transform.air.linalg_to_library_call` $target attr-dict `:` functional-type(operands, results)

Replaces a linalg op with a call to a function. If the function_name attribute is provided, it is used as the function name. Otherwise, the linalg op’s library_call attribute is used. The function is created if it does not exist. If the link_with attribute is provided, it is used to link the function call to a prebuilt object that contains the implementation of the function. If the linalg op is inside a herd, the link_with attribute is propagated to the herd.

Example:

%matmul = transform.structured.match ops{["linalg.matmul"]} in %f : (!pdl.operation) -> !pdl.operation
%call = transform.air.linalg_to_library_call %matmul { function_name = "my_matmul", link_with = "extern_func.o" } : (!pdl.operation) -> !pdl.operation

Traits: FunctionalStyleTransformOpTrait, TransformEachOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Attributes:

Attribute MLIR Type Description
function_name ::mlir::StringAttr string attribute
link_with ::mlir::StringAttr string attribute

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
result PDL handle to an mlir::Operation *

transform.air.par_to_herd (transform::ParToHerdOp)

Syntax:

operation ::= `transform.air.par_to_herd` $target attr-dict

Transform an scf.parallel operation into an air.herd operation. If the scf.parallel operation has more than two dimensions, then only the last two are used and a new scf.parallel is created outside of the herd. Returns the new air.herd operation.

Traits: FunctionalStyleTransformOpTrait, TransformEachOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Attributes:

Attribute MLIR Type Description
first_dim ::mlir::IntegerAttr 64-bit signless integer attribute

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
result PDL handle to an mlir::Operation *

transform.air.par_to_launch (transform::ParToLaunchOp)

Syntax:

operation ::= `transform.air.par_to_launch` $target attr-dict

Transform an scf.parallel operation into an air.launch operation. Returns the new air.launch operation.

Traits: FunctionalStyleTransformOpTrait, TransformEachOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Attributes:

Attribute MLIR Type Description
has_air_segment ::mlir::BoolAttr bool attribute

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
result PDL handle to an mlir::Operation *

transform.air.par_to_segment (transform::ParToSegmentOp)

Syntax:

operation ::= `transform.air.par_to_segment` $target attr-dict

Transform an scf.parallel operation into an air.segment operation. Returns the new air.segment operation.

Traits: FunctionalStyleTransformOpTrait, TransformEachOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Attributes:

Attribute MLIR Type Description
has_air_segment ::mlir::BoolAttr bool attribute

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
result PDL handle to an mlir::Operation *

transform.air.pipeline_reduce (transform::PipelineReduceOp)

Syntax:

operation ::= `transform.air.pipeline_reduce` $target attr-dict

Experimental

Traits: FunctionalStyleTransformOpTrait, TransformEachOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Attributes:

Attribute MLIR Type Description
tile_size ::mlir::ArrayAttr 64-bit integer array attribute
pipeline_depth ::mlir::IntegerAttr 64-bit signless integer attribute
direction ::mlir::StringAttr string attribute
promote ::mlir::UnitAttr unit attribute

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
result PDL handle to an mlir::Operation *

transform.air.remove_uninitialized_copy (transform::RemoveUninitializedCopyOp)

Remove copy operations that copy from uninitialized memrefs

Syntax:

operation ::= `transform.air.remove_uninitialized_copy` $target attr-dict

This transform walks through a func.func operation and identifies memref.copy and linalg.copy operations where the source is an uninitialized memref (allocated but not written to). Such copy operations are erased as they copy undefined data.

The transform detects the pattern where:

  1. A memref is allocated with memref.alloc
  2. A subview of that memref is created (optional)
  3. The memref/subview is used as source in memref.copy or linalg.copy before any write operations

Returns a handle to the modified function.

Examples:

// memref.copy case
%alloc = memref.alloc() : memref<2x16x8xi32, 1>
%subview = memref.subview %alloc[0, 0, 0] [1, 16, 8] [1, 1, 1] : ...
%target = memref.alloc() : memref<1x16x8xi32, 2>
memref.copy %subview, %target  // <- This copy will be erased

// linalg.copy case
%alloc2 = memref.alloc() : memref<16x8xi32, 1>
%target2 = memref.alloc() : memref<16x8xi32, 2>
linalg.copy ins(%alloc2 : memref<16x8xi32, 1>) outs(%target2 : memref<16x8xi32, 2>)  // <- This copy will be erased
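
The detection pattern can be modelled as a single forward scan, sketched here in Python. This is a toy illustration only: ops are encoded as hypothetical tuples, every value is assumed to originate from an `alloc` in the same list, and the real transform works on MLIR operations and may treat more cases.

```python
from typing import Dict, List, Set, Tuple

# ("alloc", name) | ("subview", name, base) | ("write", name) | ("copy", src, dst)
Op = Tuple[str, ...]


def remove_uninitialized_copies(ops: List[Op]) -> List[Op]:
    """Drop copies whose source allocation has not been written earlier in the list."""
    root: Dict[str, str] = {}   # value name -> underlying allocation
    written: Set[str] = set()   # allocations written so far
    kept: List[Op] = []
    for op in ops:
        kind = op[0]
        if kind == "alloc":
            root[op[1]] = op[1]
        elif kind == "subview":
            root[op[1]] = root[op[2]]
        elif kind == "write":
            written.add(root[op[1]])
        elif kind == "copy":
            src, dst = op[1], op[2]
            if root[src] not in written:
                continue            # source holds undefined data: erase the copy
            written.add(root[dst])  # a kept copy initializes its destination
        kept.append(op)
    return kept
```

Feeding the model the memref.copy case above (alloc, subview, alloc, copy) returns the list without the copy; inserting a `("write", "alloc")` before the copy keeps it.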

Traits: FunctionalStyleTransformOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
result PDL handle to an mlir::Operation *

transform.air.segment_to_aie (transform::SegmentToAIEOp)

Syntax:

operation ::= `transform.air.segment_to_aie` $target attr-dict

Lower air.segment operations to mlir-aie modules.

Traits: FunctionalStyleTransformOpTrait, TransformEachOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
transformed PDL handle to an mlir::Operation *

transform.air.transpose_reduce (transform::TransposeReduceOp)

Transpose inputs of linalg.reduce ops to make reduction dimensions innermost

Syntax:

operation ::= `transform.air.transpose_reduce` $target attr-dict

This transform takes a handle to linalg.reduce operations and checks if the reduction dimensions are at the innermost (last/lowest) dimensions. If any reduction dimension has non-reduction dimensions to the right, it transposes the corresponding inputs to ensure all reduction dimensions are innermost.

For example, if a linalg.reduce operation reduces along dimension 1 in a 3D tensor (shape [M, N, K] reducing along N), this transform will transpose the input to [M, K, N] so that the reduction dimension N becomes innermost.

This optimization is beneficial for hardware accelerators that perform more efficient reductions when the reduction dimensions are contiguous and innermost.

The transformation:

  1. Analyzes each linalg.reduce operation’s reduction dimensions
  2. Determines if any reduction dimension has non-reduction dimensions to its right
  3. If so, creates a transpose operation to move reduction dimensions to the end
  4. Updates the linalg.reduce operation to work with the transposed input
  5. Optionally transposes the output back to the original layout if needed
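
Steps 1 to 3 amount to computing an input permutation, which can be sketched in Python as follows. The function names are hypothetical, and keeping both the non-reduction and the reduction dimensions in ascending order is an assumption of this sketch rather than documented behaviour.

```python
from typing import List, Optional, Sequence


def reduction_innermost_permutation(
    rank: int, reduction_dims: Sequence[int]
) -> Optional[List[int]]:
    """Return the input permutation, or None if the reduction dims are already innermost."""
    reduced = sorted(set(reduction_dims))
    kept = [d for d in range(rank) if d not in reduced]
    perm = kept + reduced
    if perm == list(range(rank)):
        return None  # nothing to transpose
    return perm


def permute(shape: Sequence, perm: Sequence[int]) -> list:
    """Apply a permutation to a shape, e.g. to preview the transposed input."""
    return [shape[i] for i in perm]
```

For the [M, N, K] example reducing along N, `reduction_innermost_permutation(3, [1])` gives `[0, 2, 1]`, and `permute(["M", "N", "K"], [0, 2, 1])` gives `["M", "K", "N"]`.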

Returns a handle to the transformed linalg.reduce operations.

Traits: FunctionalStyleTransformOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
result PDL handle to an mlir::Operation *

transform.air.vector_type_cast (transform::VectorTypeCastOp)

Cast vector operands and results of vector operations to a user-provided datatype

Syntax:

operation ::= `transform.air.vector_type_cast` $target attr-dict

This transform takes a handle to vector dialect operations and casts input operands and/or results of vector type to a user-provided datatype. By default, if none of input_indices or output_indices are specified, all vector operands and results are cast.

The transformation works by:

  1. Finding vector dialect operations within the target
  2. For each vector operation, examining its operands and results
  3. Creating cast operations to convert selected vector operands to the target element type
  4. Updating the operation to work with the new vector types
  5. Creating cast operations to convert selected results back to the original types

This optimization is useful for hardware accelerators that can perform vector operations natively on specific data types (e.g., bf16, f16) while maintaining compatibility with the original precision through selective casting.

Example 1 - Cast all inputs and outputs (default behavior):

// Before:
%result = vector.fma %a, %b, %c : vector<8xf32>

// After (with target_element_type = f16):
%a_cast = arith.truncf %a : vector<8xf32> to vector<8xf16>
%b_cast = arith.truncf %b : vector<8xf32> to vector<8xf16>
%c_cast = arith.truncf %c : vector<8xf32> to vector<8xf16>
%result_f16 = vector.fma %a_cast, %b_cast, %c_cast : vector<8xf16>
%result = arith.extf %result_f16 : vector<8xf16> to vector<8xf32>

Example 2 - Cast only specific inputs:

// Before:
%result = vector.fma %a, %b, %c : vector<8xf32>

// After (with target_element_type = f16, input_indices = [0, 1]):
%a_cast = arith.truncf %a : vector<8xf32> to vector<8xf16>
%b_cast = arith.truncf %b : vector<8xf32> to vector<8xf16>
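// Schematic only: vector.fma itself requires matching operand types; %c stays vector<8xf32> here.
%result_f16 = vector.fma %a_cast, %b_cast, %c : vector<8xf16>, vector<8xf16>, vector<8xf32>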
%result = arith.extf %result_f16 : vector<8xf16> to vector<8xf32>

Example 3 - Cast only outputs:

// Transform only the output
transform.air.vector_type_cast %op {
  target_element_type = f16,
  output_indices = [0]
}

Attributes:

  • target_element_type: The element type to cast to (required). Supported types include f16, bf16, f32, f64, i8, i16, i32, i64.
  • input_indices: Optional array of input operand indices to cast. If empty, all vector inputs are cast.
  • output_indices: Optional array of output result indices to cast. If empty, all vector results are cast.

Returns a handle to the modified operations containing the transformed vector operations.

Traits: FunctionalStyleTransformOpTrait

Interfaces: MemoryEffectsOpInterface, TransformOpInterface

Attributes:

Attribute MLIR Type Description
target_element_type ::mlir::TypeAttr any type attribute
input_indices ::mlir::ArrayAttr 64-bit integer array attribute
output_indices ::mlir::ArrayAttr 64-bit integer array attribute

Operands:

Operand Description
target PDL handle to an mlir::Operation *

Results:

Result Description
result PDL handle to an mlir::Operation *