``for_each``: apply a function in-place over a tiled tensor on an AIE core.
``iron.algorithms.for_each._for_each_real(func, tensor, *params, tile_size=16)`` (protected)
In-place transform. Internally uses separate input and output ObjectFifos,
but fills from and drains to the same tensor.
Args:
    func: Function to apply, either a lambda/callable or an ExternalFunction.
        For an ExternalFunction, arg_types should be [input_tile, output_tile, *params].
    tensor: The tensor to transform in place.
    *params: Additional parameters, used for ExternalFunction only.
        Scalar dtypes (np.int32, etc.) are passed as MLIR constants;
        array types are transferred via ObjectFifos.
    tile_size: Size of each tile processed by a worker (default: 16).
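
The scalar-versus-array rule for ``*params`` can be pictured with a small host-side sketch. ``classify_param`` is a hypothetical helper written for illustration only; it is not part of ``iron``, and it assumes the decision depends only on whether a value is a NumPy scalar or a NumPy array. The real lowering may inspect the declared ``arg_types`` instead.

```python
import numpy as np


def classify_param(param):
    """Hypothetical helper: report how a parameter would be handled
    under the rule described above (not part of the iron API)."""
    if isinstance(param, np.ndarray):
        # Array types: described above as transferred via ObjectFifos.
        return "objectfifo"
    if isinstance(param, (np.generic, int, float)):
        # Scalar dtypes such as np.int32: described above as MLIR constants.
        return "constant"
    raise TypeError(f"unsupported parameter type: {type(param).__name__}")


factor = np.int32(3)
lookup_table = np.zeros(16, dtype=np.int32)
kinds = [classify_param(p) for p in (factor, lookup_table)]
```

Here ``factor`` falls into the constant category and ``lookup_table`` into the ObjectFifo category.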
Example::

    # The kernel has separate in/out tile buffers, but only one tensor is passed.
    scale = ExternalFunction("scale", arg_types=[tile_ty, tile_ty, scalar_ty, np.int32], ...)
    for_each(scale, tensor, factor, tile_size=16)
Returns:
    mlir.ir.Module: The compiled MLIR module ready for execution.
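
The in-place behaviour described above (separate input and output tile buffers, one tensor) can be emulated on the host with plain NumPy. This is a minimal sketch of the data movement only, not the library's implementation: ``emulate_for_each`` and the Python ``scale`` kernel are hypothetical names, and the sketch assumes a C-contiguous tensor whose size is a multiple of ``tile_size``.

```python
import numpy as np


def emulate_for_each(func, tensor, *params, tile_size=16):
    """Hypothetical host-side emulation of an in-place tiled transform.

    func is called as func(in_tile, out_tile, *params), mirroring the
    [input_tile, output_tile, *params] argument order described above.
    """
    if not tensor.flags.c_contiguous:
        raise ValueError("sketch assumes a C-contiguous tensor")
    flat = tensor.reshape(-1)  # a view, so writes land in `tensor`
    if flat.size % tile_size != 0:
        raise ValueError("sketch assumes size is a multiple of tile_size")

    for start in range(0, flat.size, tile_size):
        stop = start + tile_size
        in_tile = flat[start:stop].copy()   # "fill": copy into an input buffer
        out_tile = np.empty_like(in_tile)   # separate output buffer
        func(in_tile, out_tile, *params)    # kernel writes only to out_tile
        flat[start:stop] = out_tile         # "drain": write back to the same tensor
    return tensor


def scale(in_tile, out_tile, factor):
    """Stand-in for an external kernel: out = in * factor."""
    np.multiply(in_tile, factor, out=out_tile)


data = np.arange(32, dtype=np.int32)
emulate_for_each(scale, data, np.int32(3), tile_size=16)
```

After the call, ``data`` holds the scaled values even though the kernel never wrote to its input buffer, which is the same contract the ``Example`` above relies on.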
``iron.algorithms.for_each.for_each(func, tensor_ty, tile_size=16)``
In-place transform using a tensor type descriptor.
Accepts a numpy ``ndarray`` type descriptor instead of a real tensor.
Intended for use inside ``@iron.jit`` generator bodies where shape and
dtype are expressed as ``CompileTime[T]`` parameters::

    @iron.jit
    def my_design(data: InOut,
                  N: CompileTime[int], dtype: CompileTime[type] = np.int32):
        tensor_ty = np.ndarray[(N,), np.dtype[dtype]]
        return iron.algorithms.for_each(lambda x: x + 1, tensor_ty)
Args:
    func: Function or :class:`~aie.iron.kernel.ExternalFunction` to apply.
    tensor_ty: A numpy ``ndarray`` type
        (e.g. ``np.ndarray[(1024,), np.dtype[np.int32]]``).
        Shape and dtype are inferred from this.
    tile_size (int, optional): Number of elements per tile. Defaults to 16.
Returns:
    mlir.ir.Module: The compiled MLIR module.
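
To illustrate how shape and dtype can be read back from such a type descriptor, the sketch below unpacks the generic alias with the standard ``typing.get_args``. ``describe_tensor_type`` is a hypothetical helper, not an ``iron`` function, and the tile-count arithmetic assumes tiles are counted over the flattened element count; how the library treats a non-zero remainder is not stated here.

```python
import math
from typing import get_args

import numpy as np


def describe_tensor_type(tensor_ty, tile_size=16):
    """Hypothetical helper: unpack np.ndarray[shape, np.dtype[T]]."""
    shape, dtype_alias = get_args(tensor_ty)   # ((1024,), np.dtype[np.int32])
    (scalar_type,) = get_args(dtype_alias)     # np.int32
    num_elements = math.prod(shape)
    # Whole tiles and leftover elements for the given tile size.
    num_tiles, remainder = divmod(num_elements, tile_size)
    return shape, np.dtype(scalar_type), num_tiles, remainder


tensor_ty = np.ndarray[(1024,), np.dtype[np.int32]]
shape, dtype, num_tiles, remainder = describe_tensor_type(tensor_ty)
```

With the default ``tile_size`` of 16, a 1024-element descriptor divides into 64 whole tiles with no remainder.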