L3 Python bindings

Vitis BLAS level 3 provides Python bindings that let users call the Vitis BLAS libraries from Python.

1. Introduction

1.1 Set Python Environment

Please refer to the Python environment setup guide.

1.2 Build shared library

L3 Python bindings use ctypes to wrap the L3 API functions in pure Python. To call these Python functions, users must first build xfblas.so locally using the Makefile in L3/src/sw/python_api.
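As a rough illustration of what wrapping a shared library with ctypes involves, the sketch below loads a shared object if it exists. The helper name load_shared_library is hypothetical and is not part of xfblas_L3; the path ./lib/xfblas.so is the one passed via --lib in the run command of this guide. The actual loading logic inside xfblas_L3.py may differ.

```python
import ctypes
import os


def load_shared_library(lib_path):
    """Return a ctypes handle for lib_path, or None if the file is missing.

    Hypothetical helper for illustration; not part of xfblas_L3.
    """
    if not os.path.isfile(lib_path):
        return None
    # CDLL loads the shared object; its exported C functions then
    # become available as attributes of the returned handle.
    return ctypes.CDLL(lib_path)


# Path as used with --lib in the run command of this guide.
lib = load_shared_library("./lib/xfblas.so")
if lib is None:
    print("xfblas.so not found; build it first")
else:
    print("library loaded")
```

Checking for the file before loading gives a clearer message than the OSError that ctypes raises when the library has not been built yet.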

2. Using the Vitis BLAS L3 Python API

2.1 General description

This section describes how to use the Python bindings for the Vitis BLAS L3 API. To use the library, add the directory containing xfblas_L3.py to PYTHONPATH, then import xfblas_L3 as xfblas at the beginning of the Python file.
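If exporting PYTHONPATH in the shell is inconvenient, the same effect can be approximated from inside a script by extending sys.path before the import. This is a minimal sketch: add_module_dir is a hypothetical helper, and the directory is assumed to be given relative to the repository root.

```python
import os
import sys


def add_module_dir(directory):
    """Prepend directory to sys.path (only once) and return its absolute path.

    Hypothetical helper for illustration; not part of xfblas_L3.
    """
    path = os.path.abspath(directory)
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


# Assumption: the script is started from the repository root, so that
# this relative path points at the directory holding xfblas_L3.py.
module_dir = add_module_dir("L3/src/sw/python_api")

# Uncomment once xfblas.so has been built and the path is correct:
# import xfblas_L3 as xfblas
```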

2.1.1 Vitis BLAS initialization

To initialize the library, call the following two functions.

import xfblas_L3 as xfblas
args, xclbin_opts = xfblas.processCommandLine()
xfblas.createGemm(args,xclbin_opts,1,0)
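The implementation of processCommandLine is not shown in this guide. As a hedged illustration only, the sketch below parses the three flags that appear in the run command of this guide (--xclbin, --cfg, --lib) with argparse. The function build_parser is hypothetical, and the real function may accept other options and additionally returns xclbin_opts, which this sketch does not reproduce.

```python
import argparse


def build_parser():
    """Hypothetical parser for the three flags used in this guide."""
    parser = argparse.ArgumentParser(description="GEMM test options (sketch)")
    parser.add_argument("--xclbin", required=True, help="path to the FPGA bitstream")
    parser.add_argument("--cfg", required=True, help="path to config_info.dat")
    parser.add_argument("--lib", required=True, help="path to xfblas.so")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["--xclbin", "blas.xclbin", "--cfg", "config_info.dat", "--lib", "./lib/xfblas.so"]
)
print(args.xclbin, args.cfg, args.lib)
```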

2.2 Vitis BLAS Helper Function Reference

class xfblas_L3.XFBLASManager(libFile)

createGemm(xclbin, numKernel, idxDevice)

Create a GEMM handle.

Parameters

xclbin: file path of the FPGA bitstream
numKernel: number of CUs in the xclbin
idxDevice: index of local device to be used

destroy(numKernel, idxDevice)

Release the handle used by the XFBLAS library.

Parameters

numKernel: number of CUs in the xclbin
idxDevice: index of local device to be used

execute(idxKernel, idxDevice)

Run the kernel with index idxKernel.

Parameters

idxKernel: int: index of kernel to be used
idxDevice: int: index of local device to be used

executeAsync(numKernel, idxDevice)

Run numKernel kernels asynchronously.

Parameters

numKernel: number of CUs in the xclbin
idxDevice: index of local device to be used

freeInstr(idxKernel, idxDevice)

Free the memory used for instructions.

Parameters

idxKernel: index of kernel to be used
idxDevice: index of local device to be used

freeMat(A, idxKernel, idxDevice)

Free the device memory for matrix A.

Parameters

A: ndarray: matrix in host memory
idxKernel: int: index of kernel to be used
idxDevice: int: index of local device to be used

gemmOp(A, B, C, idxKernel, idxDevice)

Perform the matrix-matrix multiplication C=A*B.

Parameters

A: ndarray: matrix in host memory
B: ndarray: matrix in host memory
C: ndarray: matrix in host memory
idxKernel: int: index of kernel to be used
idxDevice: int: index of local device to be used

getMat(A, idxKernel, idxDevice)

Copy a matrix from the device to the host.

Parameters

A: ndarray: matrix in host memory
idxKernel: int: index of kernel to be used
idxDevice: int: index of local device to be used

sendMat(A, idxKernel, idxDevice)

Copy a matrix from the host to the device.

Parameters

A: ndarray: matrix in host memory
idxKernel: int: index of kernel to be used
idxDevice: int: index of local device to be used
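The methods above suggest a natural call order for a single GEMM: create the handle, send the matrices, run the operation, fetch the result, then release resources. The sketch below records that order against a stand-in object so it can run without an FPGA. DryRunManager and run_gemm are hypothetical names, and the call order is an assumption inferred from the method descriptions. In particular, this section does not state whether execute must be called explicitly after gemmOp, so test_gemm.py remains the authoritative reference.

```python
class DryRunManager:
    """Hypothetical stand-in for XFBLASManager that only records method names."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Invoked only for attributes that do not exist, i.e. the
        # documented method names; each call is appended to self.calls.
        def record(*args):
            self.calls.append(name)

        return record


def run_gemm(manager, xclbin, A, B, C, idxKernel=0, idxDevice=0):
    """Assumed call order for one GEMM on one kernel of one device."""
    manager.createGemm(xclbin, 1, idxDevice)  # numKernel = 1 CU
    try:
        for mat in (A, B, C):
            manager.sendMat(mat, idxKernel, idxDevice)  # host -> device
        manager.gemmOp(A, B, C, idxKernel, idxDevice)  # C = A * B
        manager.getMat(C, idxKernel, idxDevice)  # device -> host
        manager.freeInstr(idxKernel, idxDevice)
        for mat in (A, B, C):
            manager.freeMat(mat, idxKernel, idxDevice)
    finally:
        # Release the handle even if one of the steps above fails.
        manager.destroy(1, idxDevice)


manager = DryRunManager()
# Placeholder strings stand in for the ndarray matrices in this dry run.
run_gemm(manager, "blas.xclbin", "A", "B", "C")
print(manager.calls)
```

With a real XFBLASManager instance, A, B, and C would be ndarray matrices in host memory, as listed in the parameter descriptions.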

2.3 Using Python APIs

Please refer to L3/src/sw/python_api/test_gemm.py for an example that uses the Python APIs to test GEMM. To run that case on hardware, use the following steps:

- Build the shared library.
- Set PYTHONPATH.
- Find the path to the xclbin and run the command.

source /opt/xilinx/xrt/setup.sh
cd L3/src/sw/python_api/
make api
export PYTHONPATH=./:../../../../L1/tests/sw/python/
python test_gemm.py  --xclbin PATH_TO_GEMM_XCLBIN/blas.xclbin --cfg PATH_TO_GEMM_XCLBIN/config_info.dat --lib ./lib/xfblas.so