IRON 1a5eed49d3c0721a318ac369f725acc96b7c4584
iron.kernels.eltwise Namespace Reference

Functions

ExternalFunction _eltwise_bf16_kernel (str op, int tile_size, dtype, bool vectorized)
 
ExternalFunction passthrough (int tile_size=4096, dtype=np.int32)
 
ExternalFunction scale (int tile_size=1024, dtype=np.int32, bool vectorized=True, bool use_chess=False)
 
ExternalFunction add (int tile_size=1024, dtype=bfloat16, bool vectorized=True)
 
ExternalFunction mul (int tile_size=1024, dtype=bfloat16, bool vectorized=True)
 
ExternalFunction relu (int tile_size=1024)
 

Variables

int _ELTWISE_FIXED_TILE = 1024
 
int _RELU_FIXED_TILE = 1024
 

Detailed Description

Element-wise kernel factories: passthrough, scale, add, mul, relu.

Function Documentation

◆ _eltwise_bf16_kernel()

ExternalFunction iron.kernels.eltwise._eltwise_bf16_kernel ( str  op,
int  tile_size,
  dtype,
bool   vectorized 
)
protected
Shared implementation for ``add`` and ``mul``.

◆ add()

ExternalFunction iron.kernels.eltwise.add ( int   tile_size = 1024,
  dtype = bfloat16,
bool   vectorized = True 
)
Element-wise bf16 addition (tile_size must be 1024, hard-coded in C++).

Args:
    tile_size: Elements per tile (must be 1024).
    dtype: Element data type (only ``bfloat16`` supported).
    vectorized: If ``True`` use vectorized path; ``False`` selects scalar.

Returns:
    ExternalFunction for eltwise_add_bf16.

Raises:
    ValueError: When ``dtype`` is not ``bfloat16``.
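
As a rough illustration of the fixed-tile contract described above, the following pure-Python sketch models what an element-wise add computes over 1024-element tiles. It does not call IRON: ``split_into_tiles`` and ``add_reference`` are hypothetical helper names, plain Python floats stand in for ``bfloat16``, and rejecting buffers whose length is not a multiple of the tile size is an assumption of this sketch rather than documented behavior.

```python
TILE_SIZE = 1024  # mirrors the documented fixed tile size


def split_into_tiles(buffer, tile_size=TILE_SIZE):
    """Split a flat sequence into equal tiles (hypothetical helper)."""
    if len(buffer) % tile_size != 0:
        # Assumption of this sketch: partial tiles are rejected.
        raise ValueError("buffer length must be a multiple of tile_size")
    return [buffer[i:i + tile_size] for i in range(0, len(buffer), tile_size)]


def add_reference(a, b, tile_size=TILE_SIZE):
    """Reference semantics: add two buffers tile by tile, element by element."""
    if len(a) != len(b):
        raise ValueError("input buffers must have the same length")
    out = []
    for tile_a, tile_b in zip(split_into_tiles(a, tile_size),
                              split_into_tiles(b, tile_size)):
        out.extend(x + y for x, y in zip(tile_a, tile_b))
    return out
```

A reference like this can be handy for checking device output on the host, provided the comparison tolerates bf16 rounding.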

◆ mul()

ExternalFunction iron.kernels.eltwise.mul ( int   tile_size = 1024,
  dtype = bfloat16,
bool   vectorized = True 
)
Element-wise bf16 multiplication (tile_size must be 1024, hard-coded in C++).

Args:
    tile_size: Elements per tile (must be 1024).
    dtype: Element data type (only ``bfloat16`` supported).
    vectorized: If ``True`` use vectorized path; ``False`` selects scalar.

Returns:
    ExternalFunction for eltwise_mul_bf16.

Raises:
    ValueError: When ``dtype`` is not ``bfloat16``.
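
To give a feel for what bf16 precision means for a product, here is a small host-side sketch that rounds values to the nearest bfloat16 (ties to even) before and after multiplying. The names ``to_bf16`` and ``mul_reference`` are hypothetical, the sketch handles finite inputs only, and the rounding mode is an assumption: the documentation does not say how the kernel itself rounds.

```python
import struct


def to_bf16(x):
    """Round a finite float to the nearest bfloat16 value, ties to even."""
    # Reinterpret the float32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add a bias so that truncating the low 16 bits rounds to nearest even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def mul_reference(a, b):
    """Multiply element-wise, rounding inputs and results to bfloat16."""
    return [to_bf16(to_bf16(x) * to_bf16(y)) for x, y in zip(a, b)]
```

Since bfloat16 keeps only 7 explicit mantissa bits, a value such as ``1.0 + 2**-9`` rounds back to ``1.0`` under this scheme.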

◆ passthrough()

ExternalFunction iron.kernels.eltwise.passthrough ( int   tile_size = 4096,
  dtype = np.int32 
)
Element-wise passthrough kernel: copies input tile to output tile.

Args:
    tile_size: Number of elements per tile.
    dtype: Element data type (``np.uint8``, ``np.int16``, or ``np.int32``).

Returns:
    ExternalFunction configured for ``passThroughLine``.

Raises:
    ValueError: When ``dtype`` is not ``np.uint8``, ``np.int16``, or ``np.int32``.
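
The dtype restriction and copy behavior described above can be sketched in plain Python as follows. This is a model only, not the factory itself: ``passthrough_reference`` is a hypothetical name, and dtype names are passed as strings to stand in for the NumPy types so that the sketch has no dependencies.

```python
# String stand-ins for np.uint8, np.int16, and np.int32.
SUPPORTED_DTYPE_NAMES = ("uint8", "int16", "int32")


def passthrough_reference(tile, dtype_name="int32"):
    """Reference semantics: validate the dtype, then copy the tile unchanged."""
    if dtype_name not in SUPPORTED_DTYPE_NAMES:
        raise ValueError(f"unsupported dtype: {dtype_name!r}")
    return list(tile)  # a new list, so the output does not alias the input
```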

◆ relu()

ExternalFunction iron.kernels.eltwise.relu ( int   tile_size = 1024)
Element-wise bf16 ReLU (tile_size must be 1024, hard-coded in C++).

Args:
    tile_size: Elements per tile (must be 1024).

Returns:
    ExternalFunction for bf16_relu.

Raises:
    ValueError: When ``tile_size`` is not 1024.
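
A minimal host-side model of the behavior described above, assuming plain floats in place of bf16. ``relu_reference`` is a hypothetical name, and the check that the input length equals the tile size is an assumption of this sketch; only the ``tile_size`` check is documented.

```python
RELU_TILE_SIZE = 1024  # mirrors the documented fixed tile size


def relu_reference(tile, tile_size=RELU_TILE_SIZE):
    """Reference semantics: clamp negative elements of one tile to zero."""
    if tile_size != RELU_TILE_SIZE:
        raise ValueError(f"tile_size must be {RELU_TILE_SIZE}, got {tile_size}")
    if len(tile) != tile_size:
        # Assumption of this sketch: the input is exactly one full tile.
        raise ValueError("input must contain exactly tile_size elements")
    return [x if x > 0.0 else 0.0 for x in tile]
```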

◆ scale()

ExternalFunction iron.kernels.eltwise.scale ( int   tile_size = 1024,
  dtype = np.int32,
bool   vectorized = True,
bool   use_chess = False 
)
Scalar-multiply kernel: multiplies each element of an input tile by a scalar factor.

Args:
    tile_size: Number of elements per tile.
    dtype: Element data type. Must be ``np.int16`` or ``np.int32``.
    vectorized: If ``True`` use the vectorized path; ``False`` selects scalar.
    use_chess: When ``True``, compile the kernel object file (``.o``) with
        ``xchesscc_wrapper`` instead of Peano.

Returns:
    ExternalFunction configured for the scale kernel.

Raises:
    ValueError: When ``dtype`` is not ``np.int16`` or ``np.int32``.
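
The arithmetic this kernel performs can be modeled on the host with a short sketch like the one below. ``scale_reference`` is a hypothetical name, dtype names are strings standing in for the NumPy types, and integer overflow is deliberately not modeled because the documentation does not describe it.

```python
# String stand-ins for np.int16 and np.int32.
SCALE_DTYPE_NAMES = ("int16", "int32")


def scale_reference(tile, factor, dtype_name="int32"):
    """Reference semantics: multiply every element of a tile by one factor."""
    if dtype_name not in SCALE_DTYPE_NAMES:
        raise ValueError(f"unsupported dtype: {dtype_name!r}")
    # Python integers do not wrap; fixed-width overflow is not modeled here.
    return [x * factor for x in tile]
```

Because the ``vectorized`` flag only selects between code paths, a single reference of this kind should apply to both, assuming the two paths are meant to produce identical results.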

Variable Documentation

◆ _ELTWISE_FIXED_TILE

int iron.kernels.eltwise._ELTWISE_FIXED_TILE = 1024
protected

◆ _RELU_FIXED_TILE

int iron.kernels.eltwise._RELU_FIXED_TILE = 1024
protected