Benchmark¶
Pictures & Performance¶
- JPEG Decoder: This API supports the ‘Sequential DCT-based mode’ of ISO/IEC 10918-1 standard. It is a high-performance implementation based-on Xilinx HLS design methodolygy. It can process 1 Huffman token and create up to 8 DCT coeffiects within one cycle. It is also an easy-to-use decoder as it can direct parser the JPEG file header without help of software functions.
- Pik Encoder: This API is based on Google’s PIK, which was ‘chosen as the base framework for JPEG XL’. The pikEnc is based on the ‘fast mode’ of PIK which can provide better encoding efficnty than most of other still image encoding methods. The pikEnc is based on Xilinx HLS design methodology and optimized for FPGA arthitecture. It can proved higher throughput and lower latency compared to software-based solutions.
our commonly used pictures are listed in table below.
Pictures | Format | Size | Compress ratio | cosim Freq(MHz) | input speed(MB/s) | QPS | time(ms) |
---|---|---|---|---|---|---|---|
lena_c_512.jpg | 420 | 512*512 | 5.2 | 300 | 174 | 2288 | 0.437 |
t0.jpg | 420 | 616*516 | 9.3 | 300 | 147 | 2890 | 0.346 |
android.jpg | 420 | 960*1280 | 14.2 | 300 | 145 | 1125 | 0.889 |
offset.jpg | 422 | 5184*3456 | 4.6 | 300 | 209 | 27 | 37.25 |
hq.jpg | 444 | 5760*3840 | 2.8 | 300 | 233 | 10 | 101.1 |
iphone.jpg | 420 | 3264*2448 | 5.4 | 300 | 213 | 96 | 10.47 |
Pictures | Format | Size | Compress ratio | Freq(MHz) | input speed(MB/s) | QPS | time(ms) |
---|---|---|---|---|---|---|---|
lena_c_512.jpg | 420 | 512*512 | 5.2 | 243 | 87 | 1148 | 0.871 |
t0.jpg | 420 | 616*516 | 9.3 | 243 | 66 | 1292 | 0.774 |
Pictures | Size | Kernel1(ms) | Kernel2(ms) | Kernel3(ms) | Freq(MHz) | input speed(MPIX/s) | QPS |
---|---|---|---|---|---|---|---|
lena_c_512.png | 512*512 | 16 | 14 | 7 | 200 | 16.4 | 62.5 |
lena_c_1024.png | 1024*1024 | 52 | 48 | 24 | 200 | 20.2 | 19.2 |
lena_c_2048.png | 2048*2048 | 191 | 180 | 86 | 200 | 22.0 | 5.2 |
Resource Utilization¶
For representing the resource utilization in each benchmark, we separate the overall utilization into 2 parts, where P stands for the resource usage in platform, that is those instantiated in static region of the FPGA card, as well as K represents those used in kernels (dynamic region). The input is png, jpg, pik, e.g. format, and the target device is set to Alveo U200.
Architecture | Freq | LUT(P/K) | BRAM(P/K) | URAM(P/K) | DSP(P/K) |
---|---|---|---|---|---|
JPEG Huffman Decoder | 270MHz | 108.1K/7.9K | 178/5 | 0/0 | 4/12 |
JPEG Decoder | 243MHz | 108.1K/23.1K | 178/28 | 0/0 | 4/39 |
PIK Encoder | 300MHz | 150.9K/439.4K | 338/62 | 0/16 | 7/0 |
These are details for benchmark result and usage steps.
Test Overview¶
Here are benchmarks of the Vitis Codec Library using the Vitis environment and comparing with cpu().
Vitis Codec Library¶
- Download code
These graph benchmarks can be downloaded from vitis libraries master
branch.
git clone https://github.com/Xilinx/Vitis_Libraries.git cd Vitis_Libraries git checkout master cd codec
- Setup environment
Specifying the corresponding Vitis, XRT, and path to the platform repository by running following commands.
source <intstall_path>/installs/lin64/Vitis/2021.2/settings64.sh source /opt/xilinx/xrt/setup.sh export PLATFORM_REPO_PATHS=/opt/xilinx/platforms