IRON 1a5eed49d3c0721a318ac369f725acc96b7c4584
iron.kernels._common Namespace Reference

Functions

str _detect_arch ()
 
Path _kernel_source (str arch, str subdir, str filename)
 
list[str] _include_dirs ()
 
int _dtype_to_bit_width (dtype, *, str factory_name)
 
tuple[str, list[str]] _conv_act_dtype_info (str base_name, act_dtype, *, str factory_name)
 
None _require_fixed_tile_size (str factory_name, int tile_size, int expected=1024)
 
int _min_dma_aligned_elems (dtype, int align=4)
 
Path _default_source_path (str filename, str|None subdir=None)
 
 _arg_type_key (t)
 
ExternalFunction _make_extern (str func_name, "Path | str" source_path, list arg_types, *, list[str]|None compile_flags=None, bool use_chess=False, str|None shared_object_file_name=None)
 

Variables

 _log = logging.getLogger(__name__)
 
dict _DTYPE_BIT_WIDTHS
 
dict _EXTERN_CACHE = {}
 

Detailed Description

Shared helpers for the kernels submodules.

Function Documentation

◆ _arg_type_key()

iron.kernels._common._arg_type_key (   t)
protected
Hashable key for one entry of ``arg_types`` (used by ``_EXTERN_CACHE``).

◆ _conv_act_dtype_info()

tuple[str, list[str]] iron.kernels._common._conv_act_dtype_info ( str  base_name,
  act_dtype,
*, str   factory_name 
)
protected
Map ``act_dtype`` to ``(func_name, compile_flags)`` for conv kernels.

Raises:
    ValueError: When *act_dtype* is not ``np.int8`` or ``np.uint8``.

◆ _default_source_path()

Path iron.kernels._common._default_source_path ( str  filename,
str | None   subdir = None 
)
protected
Return ``_kernel_source(arch, subdir or arch, filename)`` using the active arch.

◆ _detect_arch()

str iron.kernels._common._detect_arch ( )
protected
Return ``'aie2p'`` or ``'aie2'`` based on the active device.

Falls back to ``'aie2'`` if no device is currently set.

◆ _dtype_to_bit_width()

int iron.kernels._common._dtype_to_bit_width (   dtype,
*, str  factory_name 
)
protected
Map ``np.uint8 | np.int16 | np.int32`` to 8/16/32.

Raises:
    ValueError: When *dtype* is not one of the three supported types.
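
The lookup described here can be sketched with the table documented under ``_DTYPE_BIT_WIDTHS``. This is a minimal illustration, not the module's actual source: the name ``dtype_to_bit_width`` and the wording of the error message are assumptions, and the only behavior taken from the documentation is the 8/16/32 mapping and the ``ValueError`` for unsupported types.

```python
import numpy as np

# Same entries as the documented initial value of _DTYPE_BIT_WIDTHS.
DTYPE_BIT_WIDTHS = {
    np.dtype(np.uint8): 8,
    np.dtype(np.int16): 16,
    np.dtype(np.int32): 32,
}


def dtype_to_bit_width(dtype, *, factory_name: str) -> int:
    """Illustrative stand-in for _dtype_to_bit_width (hypothetical body)."""
    try:
        # np.dtype() normalizes np.int16, "int16", and np.dtype("int16") alike.
        return DTYPE_BIT_WIDTHS[np.dtype(dtype)]
    except (KeyError, TypeError):
        # The message format is an assumption; only the exception type is documented.
        raise ValueError(
            f"{factory_name}: unsupported dtype {dtype!r}; "
            "expected np.uint8, np.int16, or np.int32"
        ) from None
```

Passing ``factory_name`` as a keyword-only argument lets the error message name the kernel factory that received the bad dtype.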

◆ _include_dirs()

list[str] iron.kernels._common._include_dirs ( )
protected
Return the standard include directory list for kernel compilation.

◆ _kernel_source()

Path iron.kernels._common._kernel_source ( str  arch,
str  subdir,
str  filename 
)
protected
Return the absolute path to a kernel source file.

Args:
    arch: Target architecture string (``'aie2'`` or ``'aie2p'``).
    subdir: Subdirectory under ``aie_kernels/`` (e.g. ``'aie2'``).
    filename: Source file name (e.g. ``'scale.cc'``).

Returns:
    Path to the source file.

Raises:
    FileNotFoundError: When the source file cannot be found.
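
A rough sketch of the path resolution, under stated assumptions: the ``root`` parameter is hypothetical (the real helper presumably locates ``aie_kernels/`` itself), and the ``arch`` argument is left out because this page does not say how it affects the lookup. Only the ``aie_kernels/<subdir>/<filename>`` layout and the ``FileNotFoundError`` come from the documentation above.

```python
from pathlib import Path


def kernel_source(root, subdir: str, filename: str) -> Path:
    """Illustrative stand-in for _kernel_source (hypothetical body)."""
    # Build <root>/aie_kernels/<subdir>/<filename> and make it absolute.
    path = (Path(root) / "aie_kernels" / subdir / filename).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"kernel source not found: {path}")
    return path
```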

◆ _make_extern()

ExternalFunction iron.kernels._common._make_extern ( str  func_name,
"Path | str"  source_path,
list  arg_types,
*, list[str] | None   compile_flags = None,
bool   use_chess = False,
str | None   shared_object_file_name = None 
)
protected
Construct (or reuse) an ExternalFunction with the standard include_dirs.

Memoized on (func_name, source_path, arg_types, compile_flags,
use_chess) so repeated calls with identical parameters return the
SAME ExternalFunction instance (see ``_EXTERN_CACHE`` for rationale).

Different parameterizations get distinct instances and distinct
``object_file_name`` values. The name is auto-suffixed with a short
digest of the cache key so that per-parameterization .o files do not
overwrite each other on disk.  The default ``<name>.o`` is preserved
when ``compile_flags`` is empty and ``use_chess`` is False, since there
is no parameterization to disambiguate.

``use_chess`` selects the Chess (xchesscc) compiler instead of Peano
for this kernel's .o build.  See
:class:`aie.iron.kernel.ExternalFunction` for the design-level
contract: all EFs in a single ``@iron.jit`` design must share the
same toolchain choice (mixed peano/chess is rejected at compile
time).

``shared_object_file_name`` pins the output ``.o`` filename so
multiple factories targeting the SAME source file (e.g. companion
symbols like ``reduce_max_vector`` + ``compute_max`` both in
``reduce_max.cc``) can share one compile.  The first call builds
the ``.o``; subsequent calls with the same ``shared_object_file_name``
skip the build and link against the existing one.  Without this,
each factory would produce a distinct ``.o`` each carrying ALL
symbols from the ``.cc``, tripping a duplicate-symbol link error.
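
The memoization and object-file naming described above can be sketched as follows. This is an illustration under assumptions, not the real implementation: ``FakeExternalFunction`` is a hypothetical stand-in for ``ExternalFunction``, ``repr()`` stands in for whatever ``_arg_type_key`` computes, and the SHA-256 digest truncated to 8 hex characters is an arbitrary choice for the "short digest".

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path


@dataclass(eq=False)
class FakeExternalFunction:
    """Hypothetical stand-in for ExternalFunction (not the real class)."""
    name: str
    source_path: str
    object_file_name: str
    use_chess: bool


_CACHE: dict = {}  # plays the role of _EXTERN_CACHE in this sketch


def make_extern(func_name, source_path, arg_types, *, compile_flags=None,
                use_chess=False, shared_object_file_name=None):
    flags = tuple(compile_flags or ())
    # Assumption: repr() is a good enough hashable key per arg type here.
    key = (func_name, str(Path(source_path)),
           tuple(repr(t) for t in arg_types), flags, use_chess)
    cached = _CACHE.get(key)
    if cached is not None:
        return cached  # identical parameters: same instance

    if shared_object_file_name is not None:
        obj = shared_object_file_name       # pinned so factories share one .o
    elif not flags and not use_chess:
        obj = f"{func_name}.o"              # default name, nothing to disambiguate
    else:
        digest = hashlib.sha256(repr(key).encode()).hexdigest()[:8]
        obj = f"{func_name}_{digest}.o"     # per-parameterization name

    ef = FakeExternalFunction(func_name, str(source_path), obj, use_chess)
    _CACHE[key] = ef
    return ef
```

For example, two calls with ``compile_flags=["-DDEMO_FLAG=1"]`` and ``compile_flags=["-DDEMO_FLAG=2"]`` (placeholder flags) yield two objects with different ``object_file_name`` values, while repeating either call returns the cached object.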

◆ _min_dma_aligned_elems()

int iron.kernels._common._min_dma_aligned_elems (   dtype,
int   align = 4 
)
protected
Return the minimum element count whose byte size is a multiple of *align*.

The NPU shim DMA requires 4-byte alignment.  A 1-element output tile is
fine for ``int32`` (4 bytes), but a single ``bfloat16`` element is only
2 bytes, so kernels whose C++ side writes a single value still need a
Python tile type with enough elements to satisfy the alignment.
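
The arithmetic can be illustrated with a small sketch. It is one way to compute such a count, not necessarily how the module does it: the hypothetical ``min_aligned_elems`` takes the element size in bytes directly instead of a dtype, so that the example needs only the standard library.

```python
from math import gcd


def min_aligned_elems(itemsize: int, align: int = 4) -> int:
    """Smallest n >= 1 such that n * itemsize is a multiple of align."""
    if itemsize <= 0 or align <= 0:
        raise ValueError("itemsize and align must be positive")
    # n * itemsize is divisible by align exactly when n is a multiple of
    # align / gcd(itemsize, align), so that quotient is the minimum.
    return align // gcd(itemsize, align)
```

With the default 4-byte alignment this gives 1 element for a 4-byte type such as ``int32``, 2 elements for a 2-byte type such as ``bfloat16``, and 4 elements for a 1-byte type.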

◆ _require_fixed_tile_size()

None iron.kernels._common._require_fixed_tile_size ( str  factory_name,
int  tile_size,
int   expected = 1024 
)
protected
Raise ValueError when ``tile_size`` does not match a hard-coded C++ loop bound.
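
A minimal sketch of such a guard, assuming only the documented signature and exception type; the function name without the leading underscore and the message text are illustrative.

```python
def require_fixed_tile_size(factory_name: str, tile_size: int,
                            expected: int = 1024) -> None:
    """Illustrative stand-in for _require_fixed_tile_size (hypothetical body)."""
    if tile_size != expected:
        # The C++ kernel loops over a fixed number of elements, so any other
        # tile size would disagree with the compiled loop bound.
        raise ValueError(
            f"{factory_name}: tile_size must be {expected}, got {tile_size}"
        )
```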

Variable Documentation

◆ _DTYPE_BIT_WIDTHS

dict iron.kernels._common._DTYPE_BIT_WIDTHS
protected
Initial value:
= {
    np.dtype(np.uint8): 8,
    np.dtype(np.int16): 16,
    np.dtype(np.int32): 32,
}

◆ _EXTERN_CACHE

dict iron.kernels._common._EXTERN_CACHE = {}
protected

◆ _log

iron.kernels._common._log = logging.getLogger(__name__)
protected