Benchmark

Performance

Conjugate Gradient Algorithm

The following benchmarks evaluate the Vitis HPC Library in the Vitis environment and compare results across several FPGA and CPU platforms. The library supports software and hardware emulation, as well as running hardware accelerators on the Alveo U250, U280, or U50.
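For orientation, the algorithm being benchmarked can be sketched in a few lines. The following is a minimal, unpreconditioned textbook conjugate gradient solver in plain Python. It is not the library's kernel code, and the function name `conjugate_gradient` and its parameters are assumptions made for illustration. The matrix-vector product inside the loop is the step that a GEMV-based (dense) or SPMV-based (sparse) kernel would perform.

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A (list of rows).

    Illustrative textbook CG without preconditioning; returns the
    solution and the number of iterations performed.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual r = b - A x, with x = 0
    p = list(r)                      # initial search direction
    rs_old = sum(r_i * r_i for r_i in r)
    for iteration in range(max_iter):
        if rs_old ** 0.5 < tol:      # converged: residual norm below tol
            return x, iteration
        # Matrix-vector product: the dominant cost of each iteration.
        Ap = [sum(a_ij * p_j for a_ij, p_j in zip(row, p)) for row in A]
        alpha = rs_old / sum(p_i * ap_i for p_i, ap_i in zip(p, Ap))
        x = [x_i + alpha * p_i for x_i, p_i in zip(x, p)]
        r = [r_i - alpha * ap_i for r_i, ap_i in zip(r, Ap)]
        rs_new = sum(r_i * r_i for r_i in r)
        beta = rs_new / rs_old
        p = [r_i + beta * p_i for r_i, p_i in zip(r, p)]
        rs_old = rs_new
    return x, max_iter
```

For example, `conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])` should return a solution close to `[1/11, 7/11]`.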

GEMV-based CG

The following table lists the resource utilization of the GEMV-based CG kernel, which uses 16 HBM channels to store the matrix.

Resource Utilization on U50

Name            LUT               LUTAsMem          REG                BRAM            URAM           DSP
User Budget     699619 [100.00%]  369603 [100.00%]  1447189 [100.00%]  1112 [100.00%]  640 [100.00%]  5936 [100.00%]
Used Resources  186448 [26.65%]   17334 [4.69%]     325149 [22.47%]    128 [11.51%]    0 [0.00%]      1262 [21.26%]

Benchmark Results on U50

Vector Size  Time per Iteration [ms]  U50 Performance [GFLOPS]  U50 Energy Efficiency [GFLOPS/W]  CPU Performance [GFLOPS]  Acceleration Ratio
1024         0.073                    26.938                    0.723                             12.996                    2.073
2048         0.2557                   30.658                    0.766                             27.469                    1.116
4096         0.9202                   34.018                    0.812                             7.776                     4.375
8192         3.405                    36.742                    0.839                             8.226                     4.467
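The Acceleration Ratio column is consistent with the U50 performance divided by the CPU performance. The sketch below recomputes it from the table values as a sanity check; the function name `acceleration_ratio` and the `U50_RESULTS` list are assumptions introduced for illustration, not part of the library.

```python
# (vector size, U50 GFLOPS, CPU GFLOPS, reported acceleration ratio),
# copied from the U50 benchmark results table.
U50_RESULTS = [
    (1024, 26.938, 12.996, 2.073),
    (2048, 30.658, 27.469, 1.116),
    (4096, 34.018, 7.776, 4.375),
    (8192, 36.742, 8.226, 4.467),
]


def acceleration_ratio(fpga_gflops, cpu_gflops):
    """Speedup of the FPGA over the CPU as a ratio of throughputs."""
    return fpga_gflops / cpu_gflops


if __name__ == "__main__":
    for size, fpga, cpu, reported in U50_RESULTS:
        computed = acceleration_ratio(fpga, cpu)
        print(size, round(computed, 3), reported)
```

Each recomputed ratio should agree with the reported value to within rounding of the third decimal place.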

SPMV-based CG

The following table lists the resource utilization of the SPMV-based CG kernel.

Resource Utilization on U280

Name            LUT                LUTAsMem          REG                BRAM            URAM           DSP
User Budget     1104369 [100.00%]  552814 [100.00%]  2217989 [100.00%]  1693 [100.00%]  896 [100.00%]  9020 [100.00%]
Used Resources  285372 [25.84%]    36605 [6.62%]     442368 [19.94%]    267 [15.77%]    64 [7.14%]     1192 [13.22%]

Benchmark Results on U280

Matrix Name     Rows/Cols  NNZs     Padded Rows/Cols  Padded NNZs  Padding Ratio  No. Iterations  Time per Iter on U280 [ms]  Time per Iter on CPU [ms]  Acceleration Ratio
nasa2910        2910       174296   2912              297952       1.70946        1777            0.0511172                   0.0692836                  1.36
ex9             3363       99471    3364              199328       2.00388        5000            0.0497677                   0.0559332                  1.12
bcsstk24        3562       159910   3564              222656       1.39238        5000            0.0598962                   0.0581827                  0.97
bcsstk15        3948       117816   3948              267488       2.27039        658             0.0927269                   0.125615                   1.35
bcsstk28        4410       219024   4412              319264       1.45767        4878            0.0586356                   6.92198                    118.05
s3rmt3m3        5357       207695   5360              330624       1.59187        5000            0.0744822                   6.55229                    87.97
s2rmq4m1        5489       281111   5492              427648       1.52128        1779            0.084562                    6.75384                    79.87
nd3k            9000       3279690  9000              4277792      1.30433        5000            0.363479                    4.66861                    12.84
ted_B           10605      144579   10608             548416       3.79319        30              0.984467                    6.53108                    6.63
ted_B_unscaled  10605      144579   10608             548416       3.79319        16              1.75354                     8.59891                    4.90
msc10848        10848      1229778  10848             2050720      1.66755        5000            0.230942                    5.43921                    23.55
cbuckle         13681      676515   13684             924832       1.36705        1282            0.16427                     5.48588                    33.40
olafu           16146      1015156  16148             1452320      1.43064        5000            0.169174                    5.05108                    29.86
gyro_k          17361      1021159  17364             1932384      1.89234        5000            0.254172                    4.85938                    19.12
bodyy4          17546      121938   17548             710112       5.82355        230             0.174435                    4.73164                    27.13
nd6k            18000      6897316  18000             9415552      1.3651         5000            0.809868                    4.25772                    5.26
raefsky4        19779      1328611  19780             2268704      1.70758        5000            0.268956                    4.22843                    15.72
bcsstk36        23052      1143140  23052             1833056      1.60353        5000            0.253049                    3.9882                     15.76
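The derived columns in the U280 results are consistent with two simple ratios: the padding ratio is the padded NNZ count divided by the original NNZ count, and the acceleration ratio is the CPU time per iteration divided by the accelerator time per iteration. The sketch below recomputes both for three rows of the table; the function names and the `SAMPLE_ROWS` list are assumptions introduced for illustration.

```python
# (matrix, NNZs, padded NNZs, accelerator ms/iter, CPU ms/iter,
#  reported padding ratio, reported acceleration ratio),
# copied from three rows of the U280 benchmark results table.
SAMPLE_ROWS = [
    ("nasa2910", 174296, 297952, 0.0511172, 0.0692836, 1.70946, 1.36),
    ("bcsstk24", 159910, 222656, 0.0598962, 0.0581827, 1.39238, 0.97),
    ("nd3k", 3279690, 4277792, 0.363479, 4.66861, 1.30433, 12.84),
]


def padding_ratio(nnz, padded_nnz):
    """Storage overhead introduced by padding the sparse matrix."""
    return padded_nnz / nnz


def speedup_from_times(cpu_ms, accel_ms):
    """Speedup as a ratio of per-iteration times (CPU over accelerator)."""
    return cpu_ms / accel_ms


if __name__ == "__main__":
    for name, nnz, padded, accel_ms, cpu_ms, _, _ in SAMPLE_ROWS:
        print(name,
              round(padding_ratio(nnz, padded), 5),
              round(speedup_from_times(cpu_ms, accel_ms), 2))
```

A ratio below 1, as for bcsstk24, means the CPU was faster per iteration for that matrix.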

The following sections give details on the benchmark results and usage steps.

Benchmark Overview

Vitis HPC Library

  • Download code

These HPC benchmarks can be downloaded from the master branch of the Vitis Libraries repository.

git clone https://github.com/Xilinx/Vitis_Libraries.git
cd Vitis_Libraries
git checkout master
cd hpc
  • Set up environment

Specify the corresponding Vitis and XRT installations and the path to the platform repository by running the following commands. Set up the Python environment by following the Python environment setup guide.

source <install_path>/installs/lin64/Vitis/2021.1/settings64.sh
source /opt/xilinx/xrt/setup.sh
export PLATFORM_REPO_PATHS=/opt/xilinx/platforms
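
Before building or running a benchmark, it can help to confirm that the commands above took effect in the current shell. The sketch below checks for three environment variables. `PLATFORM_REPO_PATHS` comes from the export above; `XILINX_VITIS` and `XILINX_XRT` are assumed to be the variables exported by the two sourced scripts and should be verified against your installation. The helper name `missing_variables` is hypothetical.

```python
import os

# PLATFORM_REPO_PATHS is exported explicitly in the setup commands.
# XILINX_VITIS and XILINX_XRT are assumptions about what the sourced
# settings64.sh and setup.sh scripts export; adjust if your setup differs.
REQUIRED_VARS = ("XILINX_VITIS", "XILINX_XRT", "PLATFORM_REPO_PATHS")


def missing_variables(env, required=REQUIRED_VARS):
    """Return the required variables that are unset or empty in env."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables(os.environ)
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Environment looks complete.")
```

Passing the environment as a dictionary keeps the check easy to test without modifying the real shell environment.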