Kernel APIs Reference¶
Global Functions¶
xilAdler32¶
#include "adler32_mm.hpp"
void xilAdler32 ( const ap_uint <PARALLEL_BYTES*8>* in, ap_uint <32>* adlerData, uint32_t inSize )
Adler32 kernel takes the raw data as input and generates the adler32 result.
Parameters:
in | input raw data |
adlerData | Adler data |
inSize | input size |
xilChecksum32¶
#include "checksum_mm.hpp"
void xilChecksum32 ( const ap_uint <PARALLEL_BYTES*8>* in, ap_uint <32>* initData, uint32_t inSize, bool checksumType )
Checksum kernel takes the raw data as input and generates the checksum result.
Parameters:
in | input raw data |
initData | input Initial data |
inSize | input size |
checksumType | CRC/ADLER |
xilCrc32¶
#include "crc32_mm.hpp"
void xilCrc32 ( const ap_uint <PARALLEL_BYTES*8>* in, ap_uint <32>* crcData, uint32_t inSize )
Crc32 kernel takes the raw data as input and generates the crc32 result.
Parameters:
in | input raw data |
crcData | CRC data |
inSize | input size |
xilGzipCompressFixedStreaming¶
#include "gzip_compress_fixed_stream.hpp"
void xilGzipCompressFixedStreaming ( hls::stream <ap_axiu <GMEM_IN_DWIDTH, 0, 0, 0>>& inStream, hls::stream <ap_axiu <GMEM_OUT_DWIDTH, 0, 0, 0>>& outStream, hls::stream <ap_axiu <32, 0, 0, 0>>& inSizeStream )
GZIP compression kernel takes the raw data as input and compresses the data in block based fashion and writes the output to global memory. This kernel uses fixed huffman encoding for compression.
Parameters:
inStream | input raw data |
outStream | output compressed data |
inSizeStream | input data size |
xilGzipCompBlock¶
#include "gzip_compress_multicore_mm.hpp"
void xilGzipCompBlock ( const ap_uint <GMEM_DWIDTH>* in, ap_uint <GMEM_DWIDTH>* out, uint32_t* compressd_size, uint32_t* checksumData, uint32_t input_size, bool checksumType )
GZIP compression kernel takes the raw data as input from DDR and compresses the data using num cores and writes the output to global memory.
Parameters:
in | input raw data |
out | output compressed data |
compressd_size | compressed output size of each block |
checksumData | checksum data |
input_size | input data size |
checksumTye | checksum type |
xilGzipComp¶
#include "gzip_compress_multicore_stream.hpp"
void xilGzipComp ( hls::stream <ap_axiu <GMEM_DWIDTH, 0, 0, 0>>& inaxistream, hls::stream <ap_axiu <GMEM_DWIDTH, TUSER_DWIDTH, 0, 0>>& outaxistream )
GZIP streaming compression kernel takes the raw data as input from axi interface and compresses the data using num cores and writes the output to an axi interface.
Parameters:
inaxistream | input raw data |
outaxistream | output compressed data |
xilGzipCompressStreaming¶
#include "gzip_compress_stream.hpp"
void xilGzipCompressStreaming ( hls::stream <ap_axiu <GMEM_IN_DWIDTH, 0, 0, 0>>& inStream, hls::stream <ap_axiu <GMEM_OUT_DWIDTH, 0, 0, 0>>& outStream, hls::stream <ap_axiu <32, 0, 0, 0>>& inSizeStream )
GZIP compression kernel takes the raw data as input and compresses the data in block based fashion and writes the output to global memory.
Parameters:
inStream | input raw data |
outStream | output compressed data |
inSizeStream | input data size |
xilLz4Compress¶
#include "lz4_compress_mm.hpp"
void xilLz4Compress ( const xf::compression::uintMemWidth_t* in, xf::compression::uintMemWidth_t* out, uint32_t* compressd_size, uint32_t* in_block_size, uint32_t block_size_in_kb, uint32_t input_size )
LZ4 compression kernel takes the raw data as input and compresses the data in block based fashion and writes the output to global memory.
Parameters:
in | input raw data |
out | output compressed data |
compressd_size | compressed output size of each block |
in_block_size | input block size of each block |
block_size_in_kb | input block size in bytes |
input_size | input data size |
xilLz4CompressStream¶
#include "lz4_compress_stream.hpp"
void xilLz4CompressStream ( hls::stream <ap_axiu <8, 0, 0, 0>>& inaxistream, hls::stream <ap_axiu <8, 0, 0, 0>>& outaxistream, uint32_t inputSize )
LZ4 compression streaming kernel. It takes input from axi kernel stream and writes compressed data back to output axi kernel stream.
Parameters:
inaxistream | Input axi kernel stream |
outaxistream | Output axi kernel stream |
inputSize | Input compressed data size |
xilLz4Decompress¶
#include "lz4_multibyte_decompress_mm.hpp"
void xilLz4Decompress ( const ap_uint <PARALLEL_BYTE*8>* in, ap_uint <PARALLEL_BYTE*8>* out, uint32_t* in_block_size, uint32_t* in_compress_size, uint32_t block_size_in_kb, uint32_t no_blocks )
LZ4 decompression kernel takes compressed data as input and process in block based fashion and writes the raw data to global memory.
Parameters:
in | input compressed data |
out | output raw data |
in_block_size | input block size of each block |
in_compress_size | compress size of each block |
block_size_in_kb | block size in bytes |
no_blocks | number of blocks |
xilLz4DecompressStream¶
#include "lz4_multibyte_decompress_stream.hpp"
void xilLz4DecompressStream ( hls::stream <ap_axiu <MULTIPLE_BYTES*8, 0, 0, 0>>& inaxistream, hls::stream <ap_axiu <MULTIPLE_BYTES*8, 0, 0, 0>>& outaxistream, hls::stream <ap_axiu <32, 0, 0, 0>>& outaxistreamsize, uint32_t inputSize )
Snappy decompression streaming kernel takes compressed data as input from kernel axi stream and process in block based fashion and writes the raw data to global memory.
Parameters:
inaxistream | input kernel axi stream for compressed data |
outaxistream | output kernel axi stream for decompressed data |
inputSize | input data size |
xilLz4P2PDecompress¶
#include "lz4_p2p_decompress_kernel.hpp"
void xilLz4P2PDecompress ( const xf::compression::uintMemWidth_t* in, xf::compression::uintMemWidth_t* out, dt_blockInfo* bObj, dt_chunkInfo* cObj, uint32_t block_size_in_kb, uint32_t compute_unit, uint8_t total_no_cu, uint32_t num_blocks )
LZ4 P2P decompression kernel is responsible for decompressing data which is in LZ4 encoded form.
Parameters:
in | input stream width |
out | output stream width |
in_block_size | input size |
in_compress_size | output size |
block_start_idx | start index of block |
no_blocks | number of blocks for each compute unit |
block_size_in_kb | block input size |
compute_unit | particular compute unit |
total_no_cu | number of compute units |
num_blocks | number of blocks base don host buffersize |
xilLz4Packer¶
#include "lz4_packer_mm.hpp"
void xilLz4Packer ( uint512_t* in, uint512_t* out, uint32_t* compressd_size, uint32_t* in_block_size, uint32_t* encoded_size, uint512_t* orig_input_data, uint32_t block_size_in_kb, uint32_t no_blocks, uint32_t xxhashVal, uint32_t input_size )
LZ4 packer kernel takes the raw data as input and compresses the data in block based fashion and writes the output to global memory.
Parameters:
in | input raw data |
out | output compressed data |
compressd_size | compressed output size of each block |
in_block_size | input block size of each block |
encoded_size | encoded size of each block |
orig_input_data | raw input data |
block_size_in_kb | input block size in bytes |
no_blocks | number of input blocks |
xxhashVal | Hash Value |
input_size | Total Input File Size |
xilLz4Unpacker¶
#include "lz4_unpacker_kernel.hpp"
void xilLz4Unpacker ( const xf::compression::uintMemWidth_t* in, dt_blockInfo* bObj, dt_chunkInfo* cObj, uint32_t block_size_in_kb, uint8_t first_chunk, uint8_t total_no_cu, uint32_t num_blocks )
LZ4 unpacker kernel is responsible in unpacking LZ4 compressed block information.
Parameters:
in | input stream width |
in_block_size | input block size |
in_compress_size | input compress size |
block_start_idx | start index of each input block |
no_blocks_per_cu | number of blocks for each compute unit |
original_size | original file size |
in_start_index | input start index |
no_blocks | number of blocks |
block_size_in_kb | size of each block |
first_chunk | first chunk to determine header |
total_no_cu | number of decompress compute units |
num_blocks | number of blocks based on host buffersize |
xilSnappyCompress¶
#include "snappy_compress_mm.hpp"
void xilSnappyCompress ( const xf::compression::uintMemWidth_t* in, xf::compression::uintMemWidth_t* out, uint32_t* compressd_size, uint32_t* in_block_size, uint32_t block_size_in_kb, uint32_t input_size )
Snappy compression kernel takes the raw data as input and compresses the data in block based fashion and writes the output to global memory.
Parameters:
in | input raw data |
out | output compressed data |
compressd_size | compressed output size of each block |
in_block_size | input block size of each block |
block_size_in_kb | input block size in bytes |
input_size | input data size |
xilSnappyCompressStream¶
#include "snappy_compress_stream.hpp"
void xilSnappyCompressStream ( hls::stream <ap_axiu <8, 0, 0, 0>>& inaxistream, hls::stream <ap_axiu <8, 0, 0, 0>>& outaxistream, uint32_t inputSize )
Snappy compression streaming kernel takes the raw data as input from kernel axi stream and compresses the data in block based fashion and writes the output to kernel axi stream.
Parameters:
inaxistream | input kernel axi stream for raw data |
outaxistream | output kernel axi stream for compressed data |
inputSize | input data size |
xilSnappyDecompress¶
#include "snappy_decompress_mm.hpp"
void xilSnappyDecompress ( const xf::compression::uintMemWidth_t* in, xf::compression::uintMemWidth_t* out, uint32_t* in_block_size, uint32_t* in_compress_size, uint32_t block_size_in_kb, uint32_t no_blocks )
Snappy decompression kernel takes compressed data as input and process in block based fashion and writes the raw data to global memory.
Parameters:
in | input compressed data |
out | output raw data |
in_block_size | input block size of each block |
in_compress_size | compress size of each block |
block_size_in_kb | block size in bytes |
no_blocks | number of blocks |
xilSnappyDecompressStream¶
#include "snappy_decompress_stream.hpp"
void xilSnappyDecompressStream ( hls::stream <ap_axiu <8, 0, 0, 0>>& inaxistream, hls::stream <ap_axiu <8, 0, 0, 0>>& outaxistream, uint32_t inputSize, uint32_t outputSize )
Snappy decompression streaming kernel takes compressed data as input from kernel axi stream and process in block based fashion and writes the raw data to global memory.
Parameters:
inaxistream | input kernel axi stream for compressed data |
outaxistream | output kernel axi stream for decompressed data |
inputSize | input data size |
outputSize | output data size |
xilSnappyDecompress¶
#include "snappy_multibyte_decompress_mm.hpp"
void xilSnappyDecompress ( const ap_uint <PARALLEL_BYTE*8>* in, ap_uint <PARALLEL_BYTE*8>* out, uint32_t* in_block_size, uint32_t* in_compress_size, uint32_t block_size_in_kb, uint32_t no_blocks )
Snappy decompression kernel takes compressed data as input and process in block based fashion and writes the raw data to global memory.
Parameters:
in | input compressed data |
out | output raw data |
in_block_size | input block size of each block |
in_compress_size | compress size of each block |
block_size_in_kb | block size in bytes |
no_blocks | number of blocks |
xilSnappyDecompressStream¶
xilSnappyDecompressStream overload (1)¶
#include "snappy_multibyte_decompress_stream.hpp"
void xilSnappyDecompressStream ( hls::stream <ap_axiu <MULTIPLE_BYTES*8, 0, 0, 0>>& inaxistream, hls::stream <ap_axiu <MULTIPLE_BYTES*8, 0, 0, 0>>& outaxistream, hls::stream <ap_axiu <32, 0, 0, 0>>& outaxistreamsize, uint32_t inputSize )
Snappy decompression streaming kernel takes compressed data as input from kernel axi stream and process in block based fashion and writes the raw data to global memory.
Parameters:
inaxistream | input kernel axi stream for compressed data |
outaxistream | output kernel axi stream for decompressed data |
outaxistreamsize | output stream data size |
inputSize | input data size |
xilSnappyDecompressStream overload (2)¶
#include "snappy_multicore_decompress_stream.hpp"
void xilSnappyDecompressStream ( hls::stream <ap_axiu <MULTIPLE_BYTES*8, 0, 0, 0>>& inaxistream, hls::stream <ap_axiu <32, 0, 0, 0>>& inaxistreamsize, hls::stream <ap_axiu <MULTIPLE_BYTES*8, 0, 0, 0>>& outaxistream, hls::stream <ap_axiu <32, 0, 0, 0>>& outaxistreamsize )
Snappy decompression streaming kernel takes compressed data as input from kernel axi stream and process in block based fashion and writes the raw data to global memory.
Parameters:
inaxistream | input kernel axi stream for compressed data |
inaxistreamsize | input stream data size |
outaxistream | output kernel axi stream for decompressed data |
outaxistreamsize | output stream data size |
xilZlibCompressFull¶
#include "zlib_compress_multi_engine_mm.hpp"
void xilZlibCompressFull ( const ap_uint <GMEM_DWIDTH>* in, ap_uint <GMEM_DWIDTH>* out, uint32_t* compressd_size, uint32_t input_size )
ZLIB compression kernel takes the raw data as input and compresses the data in parallel block based fashion and writes the output to global memory.
Parameters:
in | input raw data |
out | output compressed data |
compressd_size | compressed output size of each block |
input_size | input data size |
xilHuffmanKernel¶
#include "zlib_huffman_enc_mm.hpp"
void xilHuffmanKernel ( xf::compression::uintMemWidth_t* in, uint32_t* lit_freq, uint32_t* dist_freq, xf::compression::uintMemWidth_t* out, uint32_t* in_block_size, uint32_t* compressd_size, uint32_t block_size_in_kb, uint32_t input_size )
Huffman kernel top function. This is an initial version of Huffman Kernel which does block based bit packing process. It uses dynamic huffman codes and bit lengths to encode the LZ77 (Byte Compressed Data) output. This version operates on 1MB block data per engine as this is suitable for use cases where raw data is over >100MB and compression ratio is over 2.5x in order to achieve best throughput. This can be further optimized to achieve better throughput for smaller file usecase.
Parameters:
in | input stream |
out | output stream |
in_block_size | input block size |
compressd_size | output compressed size |
dyn_litmtree_codes | input literal and match length codes |
dyn_distree_codes | input distance codes |
dyn_bitlentree_codes | input bit-length codes |
dyn_litmtree_blen | input literal and match length bit length data |
dyn_dtree_blen | input distance bit length data |
dyn_bitlentree_blen | input bit-length of bit length data |
dyn_max_codes | input maximum codes |
block_size_in_kb | input block size in bytes |
input_size | input data size |
xilLz77Compress¶
#include "zlib_lz77_compress_mm.hpp"
void xilLz77Compress ( const xf::compression::uintMemWidth_t* in, xf::compression::uintMemWidth_t* out, uint32_t* compressd_size, uint32_t* in_block_size, uint32_t* dyn_ltree_freq, uint32_t* dyn_dtree_freq, uint32_t block_size_in_kb, uint32_t input_size )
LZ77 compression kernel takes the raw data as input and compresses the data in block based fashion and writes the output to global memory. LZ77 is a byte based compression scheme. The resulting output from this kernel is represented in packet form of 32bit length <Literal, Match Length, Distance>. It also generates output of literal and distance frequencies for dynamic huffman tree generation. The output generated by this kernel is referred by TreeGen and Huffman Kernels.
Parameters:
in | input stream |
out | output stream |
compressd_size | compressed output size of each block |
in_block_size | input block size of each block |
dyn_ltree_freq | literal frequency data |
dyn_dtree_freq | distance frequency data |
block_size_in_kb | input block size in bytes |
input_size | input data size |
xilTreegenKernel¶
#include "zlib_treegen_mm.hpp"
void xilTreegenKernel ( uint32_t* dyn_ltree_freq, uint32_t* dyn_dtree_freq, uint32_t* dyn_bltree_freq, uint32_t* dyn_ltree_codes, uint32_t* dyn_dtree_codes, uint32_t* dyn_bltree_codes, uint32_t* dyn_ltree_blen, uint32_t* dyn_dtree_blen, uint32_t* dyn_bltree_blen, uint32_t* max_codes, uint32_t block_size_in_kb, uint32_t input_size, uint32_t blocks_per_chunk )
This is a resource optimized version of huffman treegen kernel. It takes literal and distance frequency data as input through single input stream and generates dynamic huffman codes and bit length data which is output through a single output stream. This kernel does not use DDR in any way and is optimised for both speed and low resource usage.
Parameters:
freqStream | 24-bit input stream for getting frequency data |
codeStream | 20-bit output stream sending huffman codes and bit-lengths data |
xilZstdCompress¶
xilZstdCompress overload (1)¶
#include "zstd_compress_multicore_stream.hpp"
void xilZstdCompress ( hls::stream <ap_axiu <STREAM_IN_DWIDTH, 0, 0, 0>>& axiInStream, hls::stream <ap_axiu <STREAM_OUT_DWIDTH, 0, 0, 0>>& axiOutStream )
ZSTD compression kernel takes input data from axi stream and compresses it into multiple frames having 1 block each and writes the compressed data to output axi stream.
Parameters:
inStream | input raw data |
outStream | output compressed data |
xilZstdCompress overload (2)¶
#include "zstd_compress_stream.hpp"
void xilZstdCompress ( hls::stream <ap_axiu <STREAM_IN_DWIDTH, 0, 0, 0>>& axiInStream, hls::stream <ap_axiu <STREAM_OUT_DWIDTH, 0, 0, 0>>& axiOutStream )
ZSTD compression kernel takes input data from axi stream and compresses it into multiple frames having 1 block each and writes the compressed data to output axi stream.
Parameters:
inStream | input raw data |
outStream | output compressed data |
xilZstdDecompressStream¶
#include "zstd_decompress_stream.hpp"
void xilZstdDecompressStream ( hls::stream <ap_axiu <c_instreamDWidth, 0, 0, 0>>& inaxistreamd, hls::stream <ap_axiu <c_outstreamDWidth, 0, 0, 0>>& outaxistreamd )
This is full ZStandard decompression streaming kernel function. It supports all block sizes and supports window size upto 128KB. It takes entire ZStd compressed file as input and produces decompressed file at the kernel output stream. This kernel does not use DDR memory, it uses streams instead. Intermediate data is stored in internal BRAMs and stream FIFOs, which helps to attain better decompression throughput.
Parameters:
inaxistreamd | input kernel axi stream |
outaxistreamd | output kernel axi stream |