IRON 1a5eed49d3c0721a318ac369f725acc96b7c4584
iron.kernels.reduce Namespace Reference

Functions

ExternalFunction _reduce_kernel (str op, int tile_size, dtype, bool vectorized)
 
ExternalFunction reduce_add (int tile_size=1024, dtype=np.int32, bool vectorized=True)
 
ExternalFunction reduce_min (int tile_size=1024, dtype=np.int32, bool vectorized=True)
 
ExternalFunction reduce_max (int tile_size=1024, dtype=np.int32, bool vectorized=True)
 
ExternalFunction compute_max (dtype=np.int32)
 

Variables

str _REDUCE_MAX_OBJ = "reduce_max.cc.o"
 

Detailed Description

Reduction kernel factories: reduce_add, reduce_min, reduce_max, compute_max.

Function Documentation

◆ _reduce_kernel()

ExternalFunction iron.kernels.reduce._reduce_kernel (str op, int tile_size, dtype, bool vectorized)
protected
Shared implementation for :func:`reduce_add` and :func:`reduce_min`.

◆ compute_max()

ExternalFunction iron.kernels.reduce.compute_max (dtype=np.int32)
Pairwise scalar max: a companion to :func:`reduce_max` for multi-core
reductions in which each core produces a partial max and a final tree
reduces the partial results pairwise.

Lives in the same ``reduce_max.cc`` as :func:`reduce_max`. Because both
factories share the output ``.o`` (via ``shared_object_file_name``), a
design that uses both compiles the source exactly once.

Args:
    dtype: Element data type (``np.int32`` or ``bfloat16``).

Returns:
    ExternalFunction configured for the ``compute_max`` kernel; signature
    is ``(out_ty, out_ty, out_ty)`` where ``out_ty`` is a one-element
    (DMA-aligned) tile of ``dtype``.

Raises:
    ValueError: When ``dtype`` is not ``np.int32`` or ``bfloat16``.
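
The multi-core pattern described above can be sketched on the host in plain Python. This is a reference model only, not the IRON API: ``partial_max``, ``pairwise_max``, and ``tree_max`` are hypothetical stand-ins for the per-core :func:`reduce_max` call, the :func:`compute_max` call, and the final tree, and the four-tile input is invented for illustration.

```python
from typing import List, Sequence


def partial_max(tile: Sequence[int]) -> int:
    """Stand-in for reduce_max on one core: the maximum of one tile."""
    return max(tile)


def pairwise_max(a: int, b: int) -> int:
    """Stand-in for compute_max: the maximum of two partial results."""
    return a if a >= b else b


def tree_max(partials: Sequence[int]) -> int:
    """Combine partial maxima pairwise, level by level, until one remains."""
    if not partials:
        raise ValueError("need at least one partial result")
    level: List[int] = list(partials)
    while len(level) > 1:
        nxt = [pairwise_max(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an unpaired element carries over to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]


# Four hypothetical cores, each holding one tile of the input.
tiles = [[3, -7, 12], [40, 2, 5], [-1, -2, -3], [9, 39, 0]]
partials = [partial_max(t) for t in tiles]
result = tree_max(partials)
```

With N partial results the tree needs N - 1 pairwise steps in total, arranged in about log2(N) levels.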

◆ reduce_add()

ExternalFunction iron.kernels.reduce.reduce_add (int tile_size=1024, dtype=np.int32, bool vectorized=True)
Reduction kernel: sums all elements of a tile to a scalar.

Args:
    tile_size: Number of elements in the input tile.
    dtype: Element data type (only ``np.int32`` supported).
    vectorized: If ``True``, use the vectorized path; if ``False``, use the scalar path.

Returns:
    ExternalFunction configured for the reduce_add kernel.

Raises:
    ValueError: When ``dtype`` is not ``np.int32``.
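
When testing a design built on this kernel, a host-side golden model is a convenient reference. The sketch below is an assumption-laden illustration, not part of IRON: ``reference_reduce_add`` is a hypothetical helper, and because it uses Python integers it does not model any ``int32`` overflow behavior of the real kernel.

```python
from typing import Sequence


def reference_reduce_add(tile: Sequence[int], tile_size: int = 1024) -> int:
    """Host-side golden model: sum every element of one tile to a scalar."""
    if len(tile) != tile_size:
        raise ValueError(f"expected {tile_size} elements, got {len(tile)}")
    total = 0
    for value in tile:
        total += value  # Python ints are unbounded; int32 wraparound is not modeled
    return total


tile = list(range(1024))  # 0, 1, ..., 1023
expected = reference_reduce_add(tile)
```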

◆ reduce_max()

ExternalFunction iron.kernels.reduce.reduce_max (int tile_size=1024, dtype=np.int32, bool vectorized=True)
Reduction kernel: finds the maximum element of a tile (int32 or bfloat16).

Args:
    tile_size: Number of elements in the input tile.
    dtype: Element data type (``np.int32`` or ``bfloat16``).
    vectorized: If ``True``, use the vectorized path; if ``False``, use the scalar path.

Returns:
    ExternalFunction configured for the reduce_max kernel.

Raises:
    ValueError: When ``dtype`` is not ``np.int32`` or ``bfloat16``.
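
The dtype rules stated across this page can be collected in one place. The sketch below is a hypothetical summary, not the library's implementation: ``SUPPORTED_DTYPES``, ``check_dtype``, and ``is_supported`` are invented names, and plain strings stand in for ``np.int32`` and ``bfloat16`` so the example has no dependencies.

```python
# Supported element types per factory, as stated in the Args/Raises text.
# Strings stand in for np.int32 and bfloat16 (an assumption for illustration).
SUPPORTED_DTYPES = {
    "reduce_add": {"int32"},
    "reduce_min": {"int32"},
    "reduce_max": {"int32", "bfloat16"},
    "compute_max": {"int32", "bfloat16"},
}


def check_dtype(factory: str, dtype_name: str) -> None:
    """Hypothetical helper: raise ValueError for an unsupported dtype."""
    allowed = SUPPORTED_DTYPES[factory]
    if dtype_name not in allowed:
        raise ValueError(
            f"{factory}: unsupported dtype {dtype_name!r}; "
            f"expected one of {sorted(allowed)}"
        )


def is_supported(factory: str, dtype_name: str) -> bool:
    """Return True when check_dtype accepts the combination."""
    try:
        check_dtype(factory, dtype_name)
    except ValueError:
        return False
    return True
```

For example, ``is_supported("reduce_max", "bfloat16")`` holds, while ``is_supported("reduce_add", "bfloat16")`` does not.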

◆ reduce_min()

ExternalFunction iron.kernels.reduce.reduce_min (int tile_size=1024, dtype=np.int32, bool vectorized=True)
Reduction kernel: finds the minimum element of a tile.

Args:
    tile_size: Number of elements in the input tile.
    dtype: Element data type (only ``np.int32`` supported).
    vectorized: If ``True``, use the vectorized path; if ``False``, use the scalar path.

Returns:
    ExternalFunction configured for the reduce_min kernel.

Raises:
    ValueError: When ``dtype`` is not ``np.int32``.
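
The ``vectorized`` flag selects between two code paths that should produce the same result. The following plain-Python sketch illustrates that idea under stated assumptions: the lane count of 16, the helper names, and the generated tile are all hypothetical, and the code models the concept rather than the actual kernel source.

```python
from typing import List, Sequence

LANES = 16  # assumed vector width, for illustration only


def scalar_min(tile: Sequence[int]) -> int:
    """Scalar path: compare one element at a time."""
    best = tile[0]
    for value in tile[1:]:
        if value < best:
            best = value
    return best


def lanewise_min(tile: Sequence[int], lanes: int = LANES) -> int:
    """Vector-style path: keep one running minimum per lane, then fold the lanes."""
    if len(tile) == 0 or len(tile) % lanes:
        raise ValueError("tile length must be a positive multiple of the lane count")
    acc: List[int] = list(tile[:lanes])
    for start in range(lanes, len(tile), lanes):
        chunk = tile[start:start + lanes]
        acc = [a if a <= c else c for a, c in zip(acc, chunk)]
    return scalar_min(acc)  # final horizontal reduction across the lanes


tile = [(i * 37) % 101 - 50 for i in range(1024)]
```

Both paths visit every element once; the lane-wise version defers the cross-lane comparison to a single final step.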

Variable Documentation

◆ _REDUCE_MAX_OBJ

str iron.kernels.reduce._REDUCE_MAX_OBJ = "reduce_max.cc.o"
protected
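
Why a shared object file name leads to a single compilation can be illustrated with a toy registry. Everything in this sketch is hypothetical (``request_kernel``, ``compile_log``, and the registry itself are invented for illustration); it only models the idea that requests keyed on the same ``.o`` name trigger one compile.

```python
from typing import Dict, List, Tuple

compile_log: List[str] = []    # records each simulated compilation
_objects: Dict[str, str] = {}  # object file name -> source it was built from


def request_kernel(name: str, source: str, object_file: str) -> Tuple[str, str]:
    """Return (kernel name, object file); compile only on the first request
    for a given object file name."""
    if object_file not in _objects:
        compile_log.append(source)  # stands in for invoking the compiler
        _objects[object_file] = source
    return name, object_file


REDUCE_MAX_OBJ = "reduce_max.cc.o"
a = request_kernel("reduce_max", "reduce_max.cc", REDUCE_MAX_OBJ)
b = request_kernel("compute_max", "reduce_max.cc", REDUCE_MAX_OBJ)
```

After both requests, ``compile_log`` holds a single entry even though two kernels were requested.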