IRON dda37d5934b525fc5832d3d5e94037a8931956aa
iron.algorithms.transform Namespace Reference

Functions

 _transform_gen (func, list inputs, output, *params, tile_size=16)
 
 _transform_parallel_gen (func, list inputs, output, *params, tile_size=16)
 
 transform (func, input, output, *params, tile_size=16)
 
 transform_binary (func, first, second, output, *params, tile_size=16)
 
 transform_parallel (func, input, output, *params, tile_size=16)
 
 transform_parallel_binary (func, first, second, output, *params, tile_size=16)
 

Detailed Description

Tiled transform algorithms (unary/binary, single-core/parallel) built on IRON.

Function Documentation

◆ _transform_gen()

iron.algorithms.transform._transform_gen (   func,
list  inputs,
  output,
*params,
  tile_size = 16 
)
protected
General tiled transform that applies a function to the inputs and produces a single output.
Assumes all input and output shapes are the same.

Args:
    func: Function to apply, either a lambda/callable or ExternalFunction.
          For ExternalFunction, arg_types should be [*input_tiles, output_tile, *params]
    inputs: List of input tensors (will be tiled automatically)
    output: Output tensor (will be tiled automatically)
    *params: Additional parameters for ExternalFunction only.
             Scalar dtypes (np.int32, etc.) are passed as MLIR constants;
             array types are transferred via ObjectFifos.
    tile_size: Size of each tile processed by a worker (default: 16)
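The argument order described above for an ExternalFunction, ``[*input_tiles, output_tile, *params]``, can be illustrated with a small host-side model. This is only a sketch in plain Python lists, not the IRON implementation: ``tiled_transform_reference`` and ``scale_add`` are hypothetical names, and the requirement that the length be a multiple of ``tile_size`` is an assumption made for the example.

```python
def tiled_transform_reference(kernel, inputs, output, *params, tile_size=16):
    """Host-side model of a tiled transform (hypothetical helper, not part of IRON).

    Calls kernel(*input_tiles, output_tile, *params) once per tile, mirroring
    the argument order documented for ExternalFunction.
    """
    n = len(output)
    if any(len(buf) != n for buf in inputs):
        raise ValueError("all inputs must have the same length as the output")
    if n % tile_size != 0:
        # Assumption for this sketch only: no partial tiles.
        raise ValueError("length must be a multiple of tile_size")

    for start in range(0, n, tile_size):
        stop = start + tile_size
        input_tiles = [buf[start:stop] for buf in inputs]
        output_tile = [0] * tile_size
        # The kernel fills output_tile in place; params are passed last.
        kernel(*input_tiles, output_tile, *params)
        output[start:stop] = output_tile
    return output


def scale_add(a_tile, b_tile, out_tile, factor):
    """Example kernel: out = a * factor + b, written into out_tile."""
    for i in range(len(out_tile)):
        out_tile[i] = a_tile[i] * factor + b_tile[i]


a = list(range(32))
b = [1] * 32
out = [0] * 32
tiled_transform_reference(scale_add, [a, b], out, 3, tile_size=16)
```

Here the scalar ``3`` plays the role of a ``*params`` entry and is handed to the kernel after the output tile on every call.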

◆ _transform_parallel_gen()

iron.algorithms.transform._transform_parallel_gen (   func,
list  inputs,
  output,
*params,
  tile_size = 16 
)
protected
General parallel transform that applies a function to the inputs and produces a single output.
Distributes work across multiple AIE tiles for parallel execution.

Args:
    func: Function to apply, either a lambda/callable or ExternalFunction.
          For ExternalFunction, arg_types should be [*input_tiles, output_tile, *params]
    inputs: List of input tensors (will be tiled automatically)
    output: Output tensor (will be tiled automatically)
    *params: Additional parameters for ExternalFunction only.
             Scalar dtypes (np.int32, etc.) are passed as MLIR constants;
             array types are transferred via ObjectFifos.
    tile_size: Size of each tile processed by a worker (default: 16)

◆ transform()

iron.algorithms.transform.transform (   func,
  input,
  output,
*params,
  tile_size = 16 
)
Apply ``func`` to ``input`` and write results to ``output`` using tiled processing on a single AIE core.

Args:
    func: Function or :class:`~aie.iron.kernel.ExternalFunction` to apply.
    input: Input tensor (NPU-accessible).
    output: Output tensor (NPU-accessible, same shape and dtype as ``input``).
    *params: Additional parameters forwarded to ``func``.
    tile_size (int, optional): Number of elements per tile. Defaults to 16.

Returns:
    mlir.ir.Module: The compiled MLIR module.

◆ transform_binary()

iron.algorithms.transform.transform_binary (   func,
  first,
  second,
  output,
*params,
  tile_size = 16 
)
Apply ``func`` to ``first`` and ``second`` and write results to ``output`` using tiled processing on a single AIE core.

Args:
    func: Function or :class:`~aie.iron.kernel.ExternalFunction` to apply.
    first: First input tensor (NPU-accessible).
    second: Second input tensor (NPU-accessible, same shape and dtype as ``first``).
    output: Output tensor (NPU-accessible, same shape and dtype as inputs).
    *params: Additional parameters forwarded to ``func``.
    tile_size (int, optional): Number of elements per tile. Defaults to 16.

Returns:
    mlir.ir.Module: The compiled MLIR module.
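As a reading aid for the expected result, the following plain-Python model shows what a binary transform computes when ``func`` is a lambda. It is a sketch under assumptions: ``transform_binary_reference`` is a hypothetical name, the callable is assumed to be applied elementwise within each tile, and ordinary lists stand in for NPU-accessible tensors.

```python
def transform_binary_reference(func, first, second, output, tile_size=16):
    """Host-side model of a binary tiled transform (hypothetical, not IRON code).

    Assumes func is applied elementwise: output[i] = func(first[i], second[i]).
    """
    if not (len(first) == len(second) == len(output)):
        raise ValueError("first, second and output must have the same length")

    # Walk the data one tile at a time, as a single core would.
    for start in range(0, len(output), tile_size):
        stop = min(start + tile_size, len(output))
        for i in range(start, stop):
            output[i] = func(first[i], second[i])
    return output


first = list(range(64))
second = [10] * 64
result = [0] * 64
transform_binary_reference(lambda x, y: x + y, first, second, result)
```

A model like this can serve as the expected value when checking the output buffer of a real run.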

◆ transform_parallel()

iron.algorithms.transform.transform_parallel (   func,
  input,
  output,
*params,
  tile_size = 16 
)
Apply ``func`` to ``input`` in parallel across all available NPU columns.

Distributes the input tensor evenly across columns; each column processes
``tile_size`` elements per iteration.

Args:
    func: Function or :class:`~aie.iron.kernel.ExternalFunction` to apply.
    input: Input tensor (NPU-accessible).
    output: Output tensor (NPU-accessible, same shape and dtype as ``input``).
    *params: Additional parameters forwarded to ``func``.
    tile_size (int, optional): Number of elements per tile per column. Defaults to 16.

Returns:
    mlir.ir.Module: The compiled MLIR module.
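The even distribution across columns can be made concrete with some arithmetic. The sketch below is illustrative only: ``partition_across_columns`` is a hypothetical helper, the column count is passed in explicitly, and it assumes each column receives one contiguous slice whose length is a multiple of ``tile_size``. The actual data layout used by IRON may differ.

```python
def partition_across_columns(num_elements, num_columns, tile_size=16):
    """Model an even split of a 1-D tensor across columns (hypothetical helper).

    Returns one (start, stop, iterations) tuple per column, where
    iterations is the number of tile_size-element tiles that column handles.
    """
    chunk = tile_size * num_columns
    if num_elements % chunk != 0:
        # Assumption for this sketch: the split must be exact.
        raise ValueError("num_elements must be a multiple of tile_size * num_columns")

    per_column = num_elements // num_columns
    iterations = per_column // tile_size
    return [
        (col * per_column, (col + 1) * per_column, iterations)
        for col in range(num_columns)
    ]


# 256 elements over 4 columns with 16-element tiles:
# each column gets 64 elements and runs 4 iterations.
plan = partition_across_columns(256, 4, tile_size=16)
```

Under these assumptions the total number of elements processed per iteration across all columns is ``tile_size * num_columns``.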

◆ transform_parallel_binary()

iron.algorithms.transform.transform_parallel_binary (   func,
  first,
  second,
  output,
*params,
  tile_size = 16 
)
Apply ``func`` to ``first`` and ``second`` in parallel across all available NPU columns.

Args:
    func: Function or :class:`~aie.iron.kernel.ExternalFunction` to apply.
    first: First input tensor (NPU-accessible).
    second: Second input tensor (NPU-accessible, same shape and dtype as ``first``).
    output: Output tensor (NPU-accessible, same shape and dtype as inputs).
    *params: Additional parameters forwarded to ``func``.
    tile_size (int, optional): Number of elements per tile per column. Defaults to 16.

Returns:
    mlir.ir.Module: The compiled MLIR module.