IRON 1a5eed49d3c0721a318ac369f725acc96b7c4584
Functions
| ExternalFunction | _eltwise_bf16_kernel (str op, int tile_size, dtype, bool vectorized) |
| ExternalFunction | passthrough (int tile_size=4096, dtype=np.int32) |
| ExternalFunction | scale (int tile_size=1024, dtype=np.int32, bool vectorized=True, bool use_chess=False) |
| ExternalFunction | add (int tile_size=1024, dtype=bfloat16, bool vectorized=True) |
| ExternalFunction | mul (int tile_size=1024, dtype=bfloat16, bool vectorized=True) |
| ExternalFunction | relu (int tile_size=1024) |
Variables
| int | _ELTWISE_FIXED_TILE = 1024 |
| int | _RELU_FIXED_TILE = 1024 |
Element-wise kernel factories: passthrough, scale, add, mul, relu.
ExternalFunction iron.kernels.eltwise._eltwise_bf16_kernel(str op, int tile_size, dtype, bool vectorized) [protected]
Shared implementation for ``add`` and ``mul``.
ExternalFunction iron.kernels.eltwise.add(int tile_size = 1024, dtype = bfloat16, bool vectorized = True)
Element-wise bf16 addition. ``tile_size`` must be 1024 because that size is hard-coded in the C++ kernel.
Args:
    tile_size: Elements per tile (must be 1024).
    dtype: Element data type (only ``bfloat16`` is supported).
    vectorized: If ``True``, use the vectorized path; if ``False``, use the scalar path.
Returns:
    ExternalFunction for ``eltwise_add_bf16``.
Raises:
    ValueError: When ``dtype`` is not ``bfloat16``.
ExternalFunction iron.kernels.eltwise.mul(int tile_size = 1024, dtype = bfloat16, bool vectorized = True)
Element-wise bf16 multiplication. ``tile_size`` must be 1024 because that size is hard-coded in the C++ kernel.
Args:
    tile_size: Elements per tile (must be 1024).
    dtype: Element data type (only ``bfloat16`` is supported).
    vectorized: If ``True``, use the vectorized path; if ``False``, use the scalar path.
Returns:
    ExternalFunction for ``eltwise_mul_bf16``.
Raises:
    ValueError: When ``dtype`` is not ``bfloat16``.
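The constraints documented for ``add`` and ``mul`` can be summarized in a small stand-alone sketch. This is not the IRON implementation: ``describe_eltwise_bf16`` is a hypothetical helper, dtypes are represented by plain strings, and rejecting a ``tile_size`` other than 1024 with ``ValueError`` is an assumption (the reference entries list only the ``dtype`` error).

```python
# Hypothetical model of the documented add/mul constraints; not IRON code.
ELTWISE_FIXED_TILE = 1024  # mirrors _ELTWISE_FIXED_TILE from this page


def describe_eltwise_bf16(op, tile_size=1024, dtype="bfloat16", vectorized=True):
    """Validate the arguments and describe the kernel that would be requested."""
    if op not in ("add", "mul"):
        raise ValueError(f"unsupported op: {op!r}")
    if dtype != "bfloat16":
        # Documented behaviour: only bfloat16 is supported.
        raise ValueError(f"dtype must be bfloat16, got {dtype!r}")
    if tile_size != ELTWISE_FIXED_TILE:
        # Assumption: the docs call the size hard-coded but list no error for it.
        raise ValueError(f"tile_size must be {ELTWISE_FIXED_TILE}, got {tile_size}")
    return {
        "symbol": f"eltwise_{op}_bf16",  # names taken from the Returns sections
        "tile_size": tile_size,
        "vectorized": vectorized,
    }


print(describe_eltwise_bf16("add")["symbol"])  # eltwise_add_bf16
```

Keeping both operations behind one checker mirrors the shared-helper layout listed in the member summary, so a constraint change only has to be made in one place.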
ExternalFunction iron.kernels.eltwise.passthrough(int tile_size = 4096, dtype = np.int32)
Element-wise passthrough kernel: copies the input tile to the output tile.
Args:
    tile_size: Number of elements per tile.
    dtype: Element data type (``np.uint8``, ``np.int16``, or ``np.int32``).
Returns:
    ExternalFunction configured for ``passThroughLine``.
Raises:
    ValueError: When ``dtype`` is not ``np.uint8``, ``np.int16``, or ``np.int32``.
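Because ``tile_size`` counts elements rather than bytes, the memory taken by one tile depends on ``dtype``. The sketch below works that out for the three supported types. ``passthrough_tile_bytes`` is a hypothetical helper written for illustration, and the dtype names are plain strings standing in for the NumPy types.

```python
# Hypothetical helper; string names stand in for np.uint8, np.int16, np.int32.
PASSTHROUGH_ITEMSIZE = {"uint8": 1, "int16": 2, "int32": 4}  # bytes per element


def passthrough_tile_bytes(tile_size=4096, dtype="int32"):
    """Return the size in bytes of one tile, rejecting unsupported dtypes."""
    if dtype not in PASSTHROUGH_ITEMSIZE:
        raise ValueError(f"dtype must be one of {sorted(PASSTHROUGH_ITEMSIZE)}")
    return tile_size * PASSTHROUGH_ITEMSIZE[dtype]


print(passthrough_tile_bytes())  # 16384 (4096 elements * 4 bytes)
```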
ExternalFunction iron.kernels.eltwise.relu(int tile_size = 1024)
Element-wise bf16 ReLU. ``tile_size`` must be 1024 because that size is hard-coded in the C++ kernel.
Args:
    tile_size: Elements per tile (must be 1024).
Returns:
    ExternalFunction for ``bf16_relu``.
Raises:
    ValueError: When ``tile_size`` is not 1024.
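For checking results on the host, a plain-Python reference of what ReLU computes on one tile can be handy. The sketch below is an illustration only: ``relu_tile_reference`` is a hypothetical name, it works on Python floats rather than bf16 values, and it does not model bf16 rounding.

```python
# Host-side reference for one ReLU tile; not the bf16 kernel itself.
RELU_FIXED_TILE = 1024  # mirrors _RELU_FIXED_TILE from this page


def relu_tile_reference(tile):
    """Return max(x, 0) for every element of a 1024-element tile."""
    if len(tile) != RELU_FIXED_TILE:
        raise ValueError(f"tile must hold {RELU_FIXED_TILE} elements, got {len(tile)}")
    return [x if x > 0.0 else 0.0 for x in tile]


tile = [-1.5, 2.0] * 512  # 1024 alternating negative/positive values
out = relu_tile_reference(tile)
print(out[:2])  # [0.0, 2.0]
```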
ExternalFunction iron.kernels.eltwise.scale(int tile_size = 1024, dtype = np.int32, bool vectorized = True, bool use_chess = False)
Scalar-multiply kernel: multiplies each element of an input tile by a scale factor.
Args:
    tile_size: Number of elements per tile.
    dtype: Element data type. Must be ``np.int16`` or ``np.int32``.
    vectorized: If ``True``, use the vectorized path; if ``False``, use the scalar path.
    use_chess: If ``True``, build the object file (``.o``) with ``xchesscc_wrapper`` instead of Peano.
Returns:
    ExternalFunction configured for the scale kernel.
Raises:
    ValueError: When ``dtype`` is not ``np.int16`` or ``np.int32``.
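The scale operation itself is simple enough to model on the host when validating kernel output. The following is a minimal sketch under stated assumptions: ``scale_tile_reference`` is a hypothetical name, dtypes are given as strings standing in for ``np.int16`` and ``np.int32``, and integer overflow is not modeled because this page does not describe how the kernel handles it.

```python
# Host-side reference for the scale operation; overflow is NOT modeled.
SCALE_DTYPES = ("int16", "int32")  # stand-ins for np.int16 and np.int32


def scale_tile_reference(tile, factor, dtype="int32"):
    """Multiply every element of the tile by factor after checking dtype."""
    if dtype not in SCALE_DTYPES:
        raise ValueError(f"dtype must be one of {SCALE_DTYPES}, got {dtype!r}")
    return [x * factor for x in tile]


print(scale_tile_reference([1, -2, 3], 3))  # [3, -6, 9]
```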