IRON dda37d5934b525fc5832d3d5e94037a8931956aa
Functions
_transform_gen (func, list inputs, output, *params, tile_size=16)
_transform_parallel_gen (func, list inputs, output, *params, tile_size=16)
transform (func, input, output, *params, tile_size=16)
transform_binary (func, first, second, output, *params, tile_size=16)
transform_parallel (func, input, output, *params, tile_size=16)
transform_parallel_binary (func, first, second, output, *params, tile_size=16)
Tiled transform algorithms (unary/binary, single-core/parallel) built on IRON.
iron.algorithms.transform._transform_gen (func, list inputs, output, *params, tile_size=16) [protected]
General tiled transform that applies a function to one or more inputs and produces a single output.
Assumes all inputs and the output have the same shape.
Args:
func: Function to apply, either a lambda/callable or an ExternalFunction. For an ExternalFunction, arg_types should be [*input_tiles, output_tile, *params].
inputs: List of input tensors (tiled automatically).
output: Output tensor (tiled automatically).
*params: Additional parameters, for ExternalFunction only. Scalar dtypes (np.int32, etc.) are passed as MLIR constants; array types are transferred via ObjectFifos.
tile_size: Number of elements in each tile processed by a worker (default: 16).
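The tiling contract described above can be modelled on the host with plain NumPy. This is a minimal sketch for reasoning about expected results, not IRON's implementation: `tiled_transform_reference` is a hypothetical name, and the sketch assumes a C-contiguous output whose element count is a multiple of `tile_size`, and a `func` that accepts whole tiles.

```python
import numpy as np


def tiled_transform_reference(func, inputs, output, *params, tile_size=16):
    """Host-side NumPy model of a tiled transform (hypothetical helper, not IRON)."""
    # Documented precondition: every input has the same shape as the output.
    for x in inputs:
        if np.shape(x) != output.shape:
            raise ValueError("all inputs must match the output shape")
    # Assumptions of this sketch only: contiguous output, whole number of tiles.
    if not output.flags.c_contiguous:
        raise ValueError("output must be C-contiguous")
    if output.size % tile_size != 0:
        raise ValueError("element count must be a multiple of tile_size")

    flat_inputs = [np.asarray(x).reshape(-1) for x in inputs]
    flat_output = output.reshape(-1)  # a view, so writes land in `output`
    for start in range(0, flat_output.size, tile_size):
        stop = start + tile_size
        tiles = [x[start:stop] for x in flat_inputs]
        # One "worker step": apply func to the current tile of every input.
        flat_output[start:stop] = func(*tiles, *params)
    return output


a = np.arange(64, dtype=np.int32)
b = np.full(64, 3, dtype=np.int32)
out = np.zeros(64, dtype=np.int32)
tiled_transform_reference(lambda x, y, k: x * y + k, [a, b], out, np.int32(1))
```

For an elementwise `func`, the tiled result equals applying `func` to the full arrays at once, which makes the model a convenient oracle when checking device output.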
iron.algorithms.transform._transform_parallel_gen (func, list inputs, output, *params, tile_size=16) [protected]
General parallel transform that applies a function to one or more inputs and produces a single output.
Distributes work across multiple AIE tiles for parallel execution.
Args:
func: Function to apply, either a lambda/callable or an ExternalFunction. For an ExternalFunction, arg_types should be [*input_tiles, output_tile, *params].
inputs: List of input tensors (tiled automatically).
output: Output tensor (tiled automatically).
*params: Additional parameters, for ExternalFunction only. Scalar dtypes (np.int32, etc.) are passed as MLIR constants; array types are transferred via ObjectFifos.
tile_size: Number of elements in each tile processed by a worker (default: 16).
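The `*params` rule and the `arg_types` ordering can be illustrated with a small host-side sketch. Both helpers below are hypothetical and only mirror the wording of the docstring; how IRON itself distinguishes scalars from arrays is not shown here, so the `isinstance` checks are an assumption.

```python
import numpy as np


def split_params(params):
    """Partition params into scalars and arrays (hypothetical helper, not IRON)."""
    scalars, arrays = [], []
    for p in params:
        if isinstance(p, np.ndarray) and p.ndim > 0:
            # Per the docstring, array parameters travel via ObjectFifos.
            arrays.append(p)
        elif isinstance(p, (np.generic, int, float)):
            # Per the docstring, scalar dtypes become MLIR constants.
            scalars.append(p)
        else:
            raise TypeError(f"unsupported parameter type: {type(p).__name__}")
    return scalars, arrays


def kernel_arg_order(num_inputs, num_params):
    """Label arguments in the documented [*input_tiles, output_tile, *params] order."""
    return (
        [f"input_tile_{i}" for i in range(num_inputs)]
        + ["output_tile"]
        + [f"param_{i}" for i in range(num_params)]
    )


scalars, arrays = split_params([np.int32(3), np.ones(16, dtype=np.float32)])
order = kernel_arg_order(num_inputs=2, num_params=2)
```

Here `order` lists the two input tiles first, then the output tile, then the two parameters, which is the position each entry of `arg_types` would need to occupy for a binary kernel with two extra parameters.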
iron.algorithms.transform.transform (func, input, output, *params, tile_size=16)
Apply ``func`` to ``input`` and write results to ``output`` using tiled processing on a single AIE core.
Args:
func: Function or :class:`~aie.iron.kernel.ExternalFunction` to apply.
input: Input tensor (NPU-accessible).
output: Output tensor (NPU-accessible, same shape and dtype as ``input``).
*params: Additional parameters forwarded to ``func``.
tile_size (int, optional): Number of elements per tile. Defaults to 16.
Returns:
mlir.ir.Module: The compiled MLIR module.
iron.algorithms.transform.transform_binary (func, first, second, output, *params, tile_size=16)
Apply ``func`` to ``first`` and ``second`` and write results to ``output`` using tiled processing on a single AIE core.
Args:
func: Function or :class:`~aie.iron.kernel.ExternalFunction` to apply.
first: First input tensor (NPU-accessible).
second: Second input tensor (NPU-accessible, same shape and dtype as ``first``).
output: Output tensor (NPU-accessible, same shape and dtype as inputs).
*params: Additional parameters forwarded to ``func``.
tile_size (int, optional): Number of elements per tile. Defaults to 16.
Returns:
mlir.ir.Module: The compiled MLIR module.
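The shape and dtype requirements on `first`, `second`, and `output` can be checked on the host before calling into the library. The helper below is a hypothetical sketch that only encodes the documented preconditions; whether IRON performs equivalent checks itself is not stated here. It accepts any objects exposing `shape` and `dtype`, and NumPy arrays stand in for NPU-accessible tensors.

```python
import numpy as np


def check_binary_operands(first, second, output):
    """Return a list of violated preconditions (hypothetical helper, not IRON)."""
    problems = []
    if second.shape != first.shape:
        problems.append(f"second has shape {second.shape}, expected {first.shape}")
    if second.dtype != first.dtype:
        problems.append(f"second has dtype {second.dtype}, expected {first.dtype}")
    if output.shape != first.shape:
        problems.append(f"output has shape {output.shape}, expected {first.shape}")
    if output.dtype != first.dtype:
        problems.append(f"output has dtype {output.dtype}, expected {first.dtype}")
    return problems


first = np.zeros((4, 16), dtype=np.int32)
second = np.zeros((4, 16), dtype=np.int32)
bad_output = np.zeros((4, 16), dtype=np.float32)

good = check_binary_operands(first, second, np.zeros_like(first))
bad = check_binary_operands(first, second, bad_output)
```

An empty list means all three operands agree; otherwise each entry names the operand and the mismatch.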
iron.algorithms.transform.transform_parallel (func, input, output, *params, tile_size=16)
Apply ``func`` to ``input`` in parallel across all available NPU columns.
Distributes the input tensor evenly across columns; each column processes
``tile_size`` elements per iteration.
Args:
func: Function or :class:`~aie.iron.kernel.ExternalFunction` to apply.
input: Input tensor (NPU-accessible).
output: Output tensor (NPU-accessible, same shape and dtype as ``input``).
*params: Additional parameters forwarded to ``func``.
tile_size (int, optional): Number of elements per tile per column. Defaults to 16.
Returns:
mlir.ir.Module: The compiled MLIR module.
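The even split across columns can be pictured as a schedule of index ranges. The sketch below is an illustration only: `column_schedule` and its `num_columns` argument are hypothetical, the column count depends on the device, and the text does not say whether each column receives one contiguous block or interleaved tiles, so contiguous blocks are assumed.

```python
def column_schedule(num_elements, num_columns, tile_size=16):
    """Map each column to its per-iteration (start, stop) ranges (hypothetical helper)."""
    # Assumption of this sketch: the data divides evenly into whole tiles per column.
    if num_elements % (num_columns * tile_size) != 0:
        raise ValueError("num_elements must be a multiple of num_columns * tile_size")
    per_column = num_elements // num_columns
    schedule = {}
    for col in range(num_columns):
        base = col * per_column  # first element owned by this column
        schedule[col] = [
            (base + offset, base + offset + tile_size)
            for offset in range(0, per_column, tile_size)
        ]
    return schedule


# Illustrative values: 128 elements over 4 columns gives 32 elements per column,
# processed as two tiles of 16.
plan = column_schedule(num_elements=128, num_columns=4, tile_size=16)
```

With these values every column runs two iterations, so `tile_size` sets the work per step while the column count sets how many steps run side by side.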
iron.algorithms.transform.transform_parallel_binary (func, first, second, output, *params, tile_size=16)
Apply ``func`` to ``first`` and ``second`` in parallel across all available NPU columns.
Args:
func: Function or :class:`~aie.iron.kernel.ExternalFunction` to apply.
first: First input tensor (NPU-accessible).
second: Second input tensor (NPU-accessible, same shape and dtype as ``first``).
output: Output tensor (NPU-accessible, same shape and dtype as inputs).
*params: Additional parameters forwarded to ``func``.
tile_size (int, optional): Number of elements per tile per column. Defaults to 16.
Returns:
mlir.ir.Module: The compiled MLIR module.
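A host-side NumPy model can help confirm what a parallel binary transform should produce. This is a sketch under stated assumptions, not the library's implementation: `parallel_binary_reference` and `num_columns` are hypothetical names, the arrays are assumed C-contiguous, and each column is assumed to own one contiguous, equally sized block.

```python
import numpy as np


def parallel_binary_reference(func, first, second, output, *params,
                              num_columns=4, tile_size=16):
    """NumPy model of a column-parallel binary transform (hypothetical, not IRON)."""
    # np.split returns views, so writes into out_cols land in `output`.
    # It raises ValueError if the data does not divide evenly across columns.
    a_cols = np.split(first.reshape(-1), num_columns)
    b_cols = np.split(second.reshape(-1), num_columns)
    out_cols = np.split(output.reshape(-1), num_columns)
    # On hardware the columns run concurrently; here they are visited in turn.
    for a_col, b_col, out_col in zip(a_cols, b_cols, out_cols):
        for start in range(0, a_col.size, tile_size):
            tile = slice(start, start + tile_size)
            out_col[tile] = func(a_col[tile], b_col[tile], *params)
    return output


first = np.arange(128, dtype=np.float32)
second = np.full(128, 2.0, dtype=np.float32)
result = np.zeros(128, dtype=np.float32)
parallel_binary_reference(lambda x, y: x + y, first, second, result)
```

Because an elementwise `func` has no dependencies between tiles, the result does not depend on the order in which columns are visited, which is what makes the column split safe.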