This guide provides instructions for building MLIR-AIR for GPU targets without AIE dependencies. Tested on MI300X.
If using the OCI MI300X cluster:
salloc -p amd-arad -N 1 --gres=gpu:2 -t 0-1
srun --pty $SHELL
bash
This is the fastest way to build MLIR-AIR for GPU targets.
# Clone the repository
git clone https://github.com/Xilinx/mlir-air.git
cd mlir-air
# Setup Python environment
source utils/setup_python_packages.sh
# Build LLVM
./utils/clone-llvm.sh
./utils/build-llvm-local.sh llvm
# Build MLIR-AIR for GPU (without AIE)
./utils/build-mlir-air-gpu.sh llvm
# Setup environment
source utils/env_setup_gpu.sh install
The build-mlir-air-gpu.sh script builds MLIR-AIR with:
- -DAIR_ENABLE_AIE=OFF: disables the AIE backend dependency
- -DAIR_ENABLE_GPU=ON: enables the GPU/ROCDL passes

For more control over the build process:
# Clone and setup
git clone https://github.com/Xilinx/mlir-air.git
cd mlir-air
source utils/setup_python_packages.sh
# Build LLVM
./utils/clone-llvm.sh
./utils/build-llvm-local.sh llvm
# Configure MLIR-AIR for GPU
mkdir -p build_gpu && cd build_gpu
cmake .. \
-GNinja \
-DMLIR_DIR=$(pwd)/../llvm/install/lib/cmake/mlir \
-DLLVM_DIR=$(pwd)/../llvm/install/lib/cmake/llvm \
-DAIR_ENABLE_AIE=OFF \
-DAIR_ENABLE_GPU=ON \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=$(pwd)/install
# Build and install
ninja install
With a GPU-only build, the following passes are available:
| Pass | Description |
|---|---|
| air-to-rocdl | Lower AIR dialect to GPU/ROCDL dialect |
| air-gpu-outlining | Outline GPU kernels into gpu.module |
| air-to-async | Lower AIR dialect to async dialect |
| convert-to-air | Convert operations to AIR dialect |
AIE-specific passes (e.g., air-to-aie) are not available in this configuration; attempts to invoke them will fail because AIE support is disabled (AIR_ENABLE_AIE=OFF).
The aircc.py compiler supports GPU targets.
# Setup environment first
source utils/env_setup_gpu.sh install
# Compile the 4k x 4k matrix multiplication example for MI300X
aircc.py --target gpu --gpu-arch gfx942 -o output.mlir test/gpu/4k_4k_mul/air_sync.mlir
# With verbose output to see compilation steps
aircc.py --target gpu --gpu-arch gfx942 -v -o output.mlir test/gpu/4k_4k_mul/air_sync.mlir
# Keep intermediate files for debugging
aircc.py --target gpu --gpu-arch gfx942 -v --tmpdir /tmp/mytest -o output.mlir test/gpu/4k_4k_mul/air_sync.mlir
| Option | Default | Description |
|---|---|---|
| --target | aie | Target backend: aie or gpu |
| --gpu-arch | gfx942 | GPU architecture |
| --gpu-runtime | HIP | GPU runtime: HIP or OpenCL |
| -o <file> | stdout | Output file |
| --tmpdir <dir> | auto | Directory for intermediate files |
| -v | off | Show compilation steps |
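The options above can be combined into a small helper that prints a GPU compile command with the documented defaults filled in. This is only a sketch: the function name aircc_gpu_cmd is hypothetical and not part of MLIR-AIR, and it prints the command instead of running it so that it can be inspected first.

```shell
# Print (do not run) an aircc.py command line for a GPU target.
# Usage: aircc_gpu_cmd <input.mlir> [gpu-arch] [output.mlir]
# aircc_gpu_cmd is a hypothetical helper; the defaults follow the option table.
aircc_gpu_cmd() (
  input=${1:?usage: aircc_gpu_cmd <input.mlir> [gpu-arch] [output.mlir]}
  arch=${2:-gfx942}         # documented default architecture
  output=${3:-output.mlir}  # explicit file instead of the stdout default
  printf 'aircc.py --target gpu --gpu-arch %s -o %s %s\n' \
    "$arch" "$output" "$input"
)
```

For example, `aircc_gpu_cmd test/gpu/4k_4k_mul/air_sync.mlir gfx90a` prints the command for an MI200-series build; wrap the call in `eval "$(...)"` to execute it once the environment is set up.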
| Architecture | GPU |
|---|---|
| gfx942 | MI300X, MI300A |
| gfx90a | MI200 series |
| gfx908 | MI100 |
| gfx1100 | RDNA3 (RX 7900) |
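When scripting builds for several machines, the table above can be encoded as a lookup. The sketch below assumes a hypothetical helper named gpu_arch_for; the MI210, MI250 and MI250X entries are included as members of the MI200 series.

```shell
# Map a GPU product name to the --gpu-arch value from the table above.
# gpu_arch_for is a hypothetical helper, not part of MLIR-AIR.
gpu_arch_for() {
  case "$1" in
    MI300X|MI300A)             echo gfx942 ;;
    MI200*|MI210|MI250|MI250X) echo gfx90a ;;   # MI200 series
    MI100)                     echo gfx908 ;;
    "RX 7900"*)                echo gfx1100 ;;  # RDNA3
    *) echo "unknown GPU: $1" >&2; return 1 ;;
  esac
}
```

A typical use would be `aircc.py --target gpu --gpu-arch "$(gpu_arch_for MI300X)" ...`. Unknown names return a non-zero status rather than guessing an architecture.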
The aircc.py GPU pipeline runs the following passes:
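1. AIR to ROCDL lowering (air-opt -air-to-rocdl)
   - air.launch, air.segment, air.herd → gpu.launch
   - air.dma_memcpy_nd → memory operations
2. GPU kernel outlining (air-opt -air-gpu-outlining)
   - Outlines kernels into gpu.module
   - Adds the gpu.container_module and gpu.kernel attributes
3. Kernel outlining with upstream passes (mlir-opt)
   - gpu-kernel-outlining
4. Target attachment and binary generation (mlir-opt)
   - rocdl-attach-target
   - gpu-module-to-binary

The test/gpu/4k_4k_mul/ directory contains a matrix multiplication example.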
# Setup environment
source utils/env_setup_gpu.sh install
# Compile the example
aircc.py --target gpu --gpu-arch gfx942 -v \
--tmpdir /tmp/matmul \
-o /tmp/matmul/output.mlir \
test/gpu/4k_4k_mul/air_sync.mlir
# View the generated output (contains gpu.binary)
head -50 /tmp/matmul/output.mlir
# View intermediate files
ls /tmp/matmul/
After compilation, the output MLIR contains:
- gpu.binary with an embedded AMDGPU ELF binary
- gpu.launch_func calls to invoke the kernel

Use mlir-runner with the ROCm runtime library to execute the compiled MLIR:
# Setup environment
source utils/env_setup_gpu.sh install
# Run the compiled output
mlir-runner --entry-point-result=void \
--shared-libs=$LLVM_INSTALL_DIR/lib/libmlir_rocm_runtime.so \
--shared-libs=$MLIR_AIR_INSTALL_DIR/lib/libairgpu.so \
output.mlir
For debugging with ISA output (the --debug-only flag is only available in LLVM builds with assertions enabled):
mlir-runner --debug-only=serialize-to-isa \
--entry-point-result=void \
--shared-libs=$LLVM_INSTALL_DIR/lib/libmlir_rocm_runtime.so \
--shared-libs=$MLIR_AIR_INSTALL_DIR/lib/libairgpu.so \
output.mlir
Full example (compile and run):
# Setup environment
source utils/env_setup_gpu.sh install
# Compile
aircc.py --target gpu --gpu-arch gfx942 \
-o /tmp/output.mlir \
test/gpu/4k_4k_mul/air_sync.mlir
# Run
mlir-runner --entry-point-result=void \
--shared-libs=$LLVM_INSTALL_DIR/lib/libmlir_rocm_runtime.so \
--shared-libs=$MLIR_AIR_INSTALL_DIR/lib/libairgpu.so \
/tmp/output.mlir
For debugging or customization, you can run the passes manually:
air-opt test/gpu/4k_4k_mul/air_sync.mlir \
-air-to-rocdl \
-o step1_rocdl.mlir
air-opt step1_rocdl.mlir \
-air-gpu-outlining \
-o step2_outlined.mlir
mlir-opt step2_outlined.mlir \
--pass-pipeline="builtin.module(func.func(lower-affine,convert-linalg-to-loops,convert-scf-to-cf),gpu-kernel-outlining)" \
-o step3_gpu.mlir
mlir-opt step3_gpu.mlir \
--pass-pipeline="builtin.module(rocdl-attach-target{chip=gfx942 O=3},gpu.module(convert-gpu-to-rocdl{chipset=gfx942 runtime=HIP},reconcile-unrealized-casts),gpu-module-to-binary,func.func(gpu-async-region),gpu-to-llvm,convert-to-llvm,reconcile-unrealized-casts)" \
-o step4_final.mlir
mlir-runner --entry-point-result=void \
--shared-libs=$LLVM_INSTALL_DIR/lib/libmlir_rocm_runtime.so \
--shared-libs=$MLIR_AIR_INSTALL_DIR/lib/libairgpu.so \
step4_final.mlir
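The lowering steps above can be wrapped in one function that stops at the first failing step and takes the GPU chip as a parameter. This is a sketch under assumptions: the name lower_air_to_gpu and the AIR_OPT and MLIR_OPT override variables are hypothetical, and the pass pipelines are copied unchanged from the commands above.

```shell
# Run the four manual lowering steps in order, stopping at the first failure.
# lower_air_to_gpu, AIR_OPT and MLIR_OPT are hypothetical names.
# Usage: lower_air_to_gpu <input.mlir> [chip] [workdir]
lower_air_to_gpu() (
  input=${1:?usage: lower_air_to_gpu <input.mlir> [chip] [workdir]}
  chip=${2:-gfx942}
  work=${3:-.}
  air_opt=${AIR_OPT:-air-opt}
  mlir_opt=${MLIR_OPT:-mlir-opt}

  # Chaining with && makes a failing step skip all remaining steps.
  "$air_opt" "$input" -air-to-rocdl -o "$work/step1_rocdl.mlir" &&
  "$air_opt" "$work/step1_rocdl.mlir" -air-gpu-outlining \
    -o "$work/step2_outlined.mlir" &&
  "$mlir_opt" "$work/step2_outlined.mlir" \
    --pass-pipeline="builtin.module(func.func(lower-affine,convert-linalg-to-loops,convert-scf-to-cf),gpu-kernel-outlining)" \
    -o "$work/step3_gpu.mlir" &&
  "$mlir_opt" "$work/step3_gpu.mlir" \
    --pass-pipeline="builtin.module(rocdl-attach-target{chip=$chip O=3},gpu.module(convert-gpu-to-rocdl{chipset=$chip runtime=HIP},reconcile-unrealized-casts),gpu-module-to-binary,func.func(gpu-async-region),gpu-to-llvm,convert-to-llvm,reconcile-unrealized-casts)" \
    -o "$work/step4_final.mlir"
)
```

Setting `AIR_OPT=echo MLIR_OPT=echo` turns the function into a dry run that prints each command's arguments, which is a cheap way to check the generated pipelines before running the real tools.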
To reactivate the environment from a new terminal:
cd mlir-air
source utils/env_setup_gpu.sh install
Or manually:
export PATH=/path/to/mlir-air/install/bin:/path/to/llvm/install/bin:$PATH
export PYTHONPATH=/path/to/mlir-air/python:$PYTHONPATH
If you see:
ModuleNotFoundError: No module named 'air'
Make sure to source the environment setup:
source utils/env_setup_gpu.sh install
Or add the Python path manually:
export PYTHONPATH=/path/to/mlir-air/python:$PYTHONPATH
If you see errors like:
error: AIRToAIE pass requires AIE support. Rebuild with -DAIR_ENABLE_AIE=ON
This is expected behavior. The GPU-only build does not include AIE backend support. Use --target gpu with aircc.py for GPU targets.
If commands such as air-opt or mlir-opt fail with "command not found", check that the tools are in your PATH:
which air-opt mlir-opt
If not found, source the environment:
source utils/env_setup_gpu.sh install
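The PATH check can also be scripted so that it reports every missing tool at once. A minimal sketch, assuming a hypothetical helper named check_tools:

```shell
# Report every tool that cannot be found in PATH.
# Exits with status 0 only when all tools are available.
# check_tools is a hypothetical helper, not part of MLIR-AIR.
check_tools() (
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "not found in PATH: $tool" >&2
      missing=1
    fi
  done
  exit "$missing"
)
```

For this guide the call would be `check_tools air-opt mlir-opt mlir-runner aircc.py`, run after sourcing the environment script.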
If the ROCm runtime libraries fail to load at run time, ensure ROCm is installed and the runtime library path is correct:
export LD_LIBRARY_PATH=/opt/rocm/lib:$LD_LIBRARY_PATH
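When LD_LIBRARY_PATH is initially empty, the export above leaves a trailing colon, and the dynamic loader treats an empty entry as the current directory. The sketch below avoids that and also drops duplicates; prepend_path is a hypothetical helper name.

```shell
# Print a colon-separated list with <dir> first, skipping empty entries
# and any existing copy of <dir>.
# Usage: prepend_path <dir> <existing-list>
# prepend_path is a hypothetical helper, not part of MLIR-AIR.
prepend_path() (
  dir=$1
  result=$dir
  IFS=:
  set -f    # disable globbing while splitting the list on ':'
  for entry in $2; do
    [ -n "$entry" ] && [ "$entry" != "$dir" ] && result="$result:$entry"
  done
  printf '%s\n' "$result"
)
```

Usage: `export LD_LIBRARY_PATH=$(prepend_path /opt/rocm/lib "${LD_LIBRARY_PATH:-}")`. The same helper works for PATH and PYTHONPATH.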
If you see:
error: cannot evaluate equated symbol 'air_kernel_0.num_named_barrier'
To fix this, set wave64=true on the rocdl-attach-target pass. aircc.py does not enable this option by default, so you must pass it explicitly in your own GPU pipeline.
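One way to apply the option is to rewrite an existing pass pipeline string so that wave64=true is appended to the rocdl-attach-target options. This is a sketch only: add_wave64 is a hypothetical helper, and it assumes the pipeline already contains a rocdl-attach-target{...} entry with at least one option.

```shell
# Append wave64=true to the rocdl-attach-target options of a pipeline string.
# A pipeline without rocdl-attach-target{...} is printed unchanged.
# add_wave64 is a hypothetical helper, not part of MLIR-AIR.
add_wave64() {
  printf '%s\n' "$1" |
    sed 's/rocdl-attach-target{\([^}]*\)}/rocdl-attach-target{\1 wave64=true}/'
}
```

For example, `mlir-opt step3_gpu.mlir --pass-pipeline="$(add_wave64 "$PIPELINE")" -o step4_final.mlir`, where PIPELINE holds the final-stage pipeline from the manual steps.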
If you need both GPU and AIE backends:
cmake .. \
-GNinja \
-DMLIR_DIR=/path/to/llvm/install/lib/cmake/mlir \
-DLLVM_DIR=/path/to/llvm/install/lib/cmake/llvm \
-DAIE_DIR=/path/to/mlir-aie/install/lib/cmake/aie \
-DAIR_ENABLE_AIE=ON \
-DAIR_ENABLE_GPU=ON \
-DCMAKE_BUILD_TYPE=Release
| Option | Default | Description |
|---|---|---|
| AIR_ENABLE_AIE | ON | Enable AIE backend (requires mlir-aie) |
| AIR_ENABLE_GPU | OFF | Enable GPU backend (ROCDL/HIP) |
Build configurations:
- GPU only: -DAIR_ENABLE_AIE=OFF -DAIR_ENABLE_GPU=ON
- AIE only: -DAIR_ENABLE_AIE=ON -DAIR_ENABLE_GPU=OFF
- GPU and AIE: -DAIR_ENABLE_AIE=ON -DAIR_ENABLE_GPU=ON (requires mlir-aie)
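In a build script, the three configurations can be selected by name. The sketch below assumes a hypothetical helper named air_backend_flags; the flag values themselves come from the list above.

```shell
# Print the CMake flags for one of the three backend configurations.
# Usage: air_backend_flags gpu|aie|both
# air_backend_flags is a hypothetical helper, not part of MLIR-AIR.
air_backend_flags() {
  case "$1" in
    gpu)  echo "-DAIR_ENABLE_AIE=OFF -DAIR_ENABLE_GPU=ON" ;;
    aie)  echo "-DAIR_ENABLE_AIE=ON -DAIR_ENABLE_GPU=OFF" ;;
    both) echo "-DAIR_ENABLE_AIE=ON -DAIR_ENABLE_GPU=ON" ;;  # requires mlir-aie
    *) echo "usage: air_backend_flags gpu|aie|both" >&2; return 1 ;;
  esac
}
```

Leave the command substitution unquoted so that the shell splits it into two arguments, as in `cmake .. -GNinja $(air_backend_flags gpu) -DCMAKE_BUILD_TYPE=Release`.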