Release Note


Following is the 2021.2 release notes.

  • ZSTD Quad-Core Compression
    Created ZSTD Multi-Core architecture to provide high throughput for single file compression. Using Zstd Quad core solution, user can get throughput > 1 GB/s.
  • Zstd Decompress Improvement
    ZSTD Decompress optimized in this release. Overall resource is reduced to 19.6K and achieve 20% higher throughput compare to previous release.
  • GZIP Decompress Improvement
    Re-architected GZIP Decompress cores to reduce resource to 6.9K and better throughput compare to previous release. With this new latency overall IP latency is also reduced to ~1.5K cycle. Provided ZLIB decompression containing ADLR32 Checksum to catch any error in input file. Added functionality to provide uncompressed size in output stream port TUSER (incase end application needs to know uncompressed size).
  • GZIP Compression Improvement
    • Created various ZLIB/GZIP Octa-Core Compression Kernels for different block sizes (8KB, 16KB, 32KB) and achieved > 2GB/s throughput for all variants. Updated IP core to provided compressed size in output axis stream TUSER port (incase any application needs compressed size).
    • Huffman TreeGen latency is reduced significantly < 1K, as a result, for multi-core architectures (Octa-core), a single Treegen is required. This reduce the resource requirement signficantly down for 8KB and 16KB blocksize compression core compare to previous release solution.
    • Compression ratio is improved from 2.67 to 2.7 for Silesia Fileset for 32KB bloksize.
  • Snappy/LZ4 Decompress Improvement
    Optimized Snappy and LZ4 Decompress throughput.


Following is the 2021.1 release notes.

  • GZIP Multi Core Compression
    New GZIP Multi-Core Compress Streaming Accelerator which is purely stream only solution (free running kernel), it comes with many variant of different block size support of 4KB, 8KB, 16KB and 32KB.
  • Facebook ZSTD Compression Core
    New Facebook ZSTD Single Core Compression accelerator with block size 32KB. Multi-cores ZSTD compression is in progress (for higher throughput).
  • GZIP Low Latency Decompression
    A new version of GZIP decompress with improved latency for each block, lesser resources (35% lower LUT, 83% lower BRAM) and improved FMax.
  • ZLIB Whole Application Acceleration using U50
    L3 GZIP solution for U50 Platform, containing 6 Compression core to saturate full PCIe bandwidth. It is provided with Efficient GZIP SW Solution to accelerate CPU library which provide seamless Inflate and deflate API level integration to end customer software without recompiling.
  • Versal Platform Supports


Following is the 2020.2 release notes.

  • LIBZ Library Acceleration using U50
    • Achieved seamless acceleration of libz standard APIs (deflate, compress2 and uncompress)
    • Ready to use library to accelerate any host code without any code change
    • Provided xzlib standalone executable for for both gzip/zlib compress & decompress
  • New ZSTD Decompression
    Facebook ZSTD decompression implemented
  • New Snappy Dual Core Kernel
    Google snappy Dual Core Decompression to get 2x throughput for single file decompress.
  • New GZIP Compress Kernel
    Implemented new GZIP Quad Core Compress Kernel (in build , LZ77 , TreeGen, Huffman encoder). Reduced overall resource >20%, and reduce 50% DDR bandwidth requirement.
  • New GZIP Compress Streaming Kernel
    Implemented fully standard compliance GZIP(include header & footer) streaming free running kernels.
  • GZIP/ZLIB L3 Application on U50
    Provided GZIP/ZLIB Application in L3 , optimized for U50 (HBM) and U250. Single xclbin supports both zlib & gzip format for compress and uncompress
  • Porting Library to U50
    Most of Library functions (LZ4, Snappy, GZIP, ZLIB) ported to U50 platform.
  • Low Latency GZIP/ZLIB Decompress
    Reduced Decompression initial latency from 5K to 2.5K (latency improvement for for small block size 4KB/8KB/16KB)