CSCMV Kernel APIs
Note
The CSCMV implementation uses only one HBM channel on the U280 card. Future releases may use multiple (up to 32) HBM channels to achieve maximum performance.
cscRowPktKernel
#include "cscRowPktKernel.hpp"
void cscRowPktKernel ( const ap_uint <SPARSE_hbmMemBits>* p_aNnzIdx, const unsigned int p_memBlocks, const unsigned int p_nnzBlocks, const unsigned int p_rowBlocks, hls::stream <SPARSE_parDataPktType>& in, hls::stream <SPARSE_parDataPktType>& out )
cscRowPkt Kernel
Parameters:
p_aNnzIdx | the device memory pointer for reading the NNZ values and row indices |
p_memBlocks | the number of device memory accesses to read the NNZ values and row indices |
p_nnzBlocks | the number of parallel NNZ entries |
p_rowBlocks | the number of parallel row vector entries |
in | the input axi stream of column vector entries selected for the NNZs |
out | the output axi stream of result row vector entries |
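The parameter descriptions above imply a multiply-accumulate over the NNZ entries: each NNZ value is multiplied by the column vector entry selected for it, and the product is added to the result entry named by its row index. The following is a minimal software sketch of that arithmetic only, using plain `std::vector` instead of HLS streams and packed memory words. The function name `accumulateRows` and its arguments are hypothetical and are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical software model of the row accumulation; NOT the HLS kernel.
// nnzVal[k] and rowIdx[k] describe the k-th non-zero entry of the matrix.
// colSel[k] is the column vector entry already selected for that entry.
std::vector<float> accumulateRows(const std::vector<float>& nnzVal,
                                  const std::vector<unsigned>& rowIdx,
                                  const std::vector<float>& colSel,
                                  std::size_t numRows) {
    // All three inputs must describe the same number of non-zeros.
    assert(nnzVal.size() == rowIdx.size());
    assert(nnzVal.size() == colSel.size());

    std::vector<float> y(numRows, 0.0f);
    for (std::size_t k = 0; k < nnzVal.size(); ++k) {
        assert(rowIdx[k] < numRows);
        // y[row] += A(row, col) * x[col]
        y[rowIdx[k]] += nnzVal[k] * colSel[k];
    }
    return y;
}
```

For example, three non-zeros with values 1, 2, 3, row indices 0, 1, 0 and selected column entries 10, 10, 20 give a two-entry result vector of 70 and 20.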
loadColPtrValKernel
#include "loadColPtrValKernel.hpp"
void loadColPtrValKernel ( const ap_uint <SPARSE_ddrMemBits>* p_memColVal, const ap_uint <SPARSE_ddrMemBits>* p_memColPtr, const unsigned int p_memBlocks, const unsigned int p_numTrans, hls::stream <SPARSE_parDataPktType>& out1, hls::stream <SPARSE_parIndexPktType>& out2 )
loadColPtrVal Kernel
Parameters:
p_memColVal | device memory pointer for reading the column vector |
p_memColPtr | device memory pointer for reading the column pointers of NNZ entries |
p_memBlocks | number of blocks of vector entries in the memory read operation |
p_numTrans | number of times to trigger this kernel. Currently only 1 is supported |
out1 | the axi stream of output column vector entries |
out2 | the axi stream of output column pointer entries |
storeDatPktKernel
#include "storeDatPktKernel.hpp"
void storeDatPktKernel ( hls::stream <SPARSE_parDataPktType>& in, ap_uint <SPARSE_ddrMemBits>* p_memPtr, unsigned int p_memBlocks )
storeDatPkt Kernel
Parameters:
in | the input axi stream of row vector entries of cscmv operation results |
p_memPtr | the device memory pointer for writing the row vector entries |
p_memBlocks | the number of vector entries in each memory write |
xBarColKernel
#include "xBarColKernel.hpp"
void xBarColKernel ( const unsigned int p_colPtrBlocks, const unsigned int p_nnzBlocks, hls::stream <SPARSE_parDataPktType>& in1, hls::stream <SPARSE_parIndexPktType>& in2, hls::stream <SPARSE_parDataPktType>& out )
xBarCol Kernel
Parameters:
p_colPtrBlocks | number of parallel column pointer entries in the axi stream input in2 |
p_nnzBlocks | number of parallel NNZ entries in the input axi stream in1 and output axi stream out |
in1 | input axi stream of parallel column vector entries |
in2 | input axi stream of parallel column pointer entries |
out | output axi stream of parallel column vector entries selected for the NNZs |
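The selection described above, where column vector entries are picked for the NNZs according to the column pointers, can be modelled in software as follows. This is a minimal sketch under the assumption of the conventional CSC layout, in which `colPtr` has one more entry than the column vector, starts at 0, and `colPtr[j]` to `colPtr[j + 1]` delimits the NNZs of column `j`. The packed format actually consumed by the kernel may differ. The function name `selectColEntries` is hypothetical and is not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical software model of the column selection; NOT the HLS kernel.
// Assumes conventional CSC column pointers: colPtr.size() == x.size() + 1,
// colPtr[0] == 0, and column j owns NNZs colPtr[j] .. colPtr[j + 1] - 1.
std::vector<float> selectColEntries(const std::vector<unsigned>& colPtr,
                                    const std::vector<float>& x) {
    assert(colPtr.size() == x.size() + 1);
    assert(colPtr.front() == 0);

    std::vector<float> sel;
    sel.reserve(colPtr.back());
    for (std::size_t j = 0; j < x.size(); ++j) {
        assert(colPtr[j] <= colPtr[j + 1]);
        // Repeat x[j] once for every non-zero stored in column j.
        for (unsigned k = colPtr[j]; k < colPtr[j + 1]; ++k) {
            sel.push_back(x[j]);
        }
    }
    return sel;
}
```

With column pointers 0, 2, 2, 3 and a column vector of 10, 5, 20, the first column holds two NNZs, the second none and the third one, so the selected entries are 10, 10, 20.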