Triangle Counting¶
Triangle Counting example resides in L2/benchmarks/triangle_count
directory. The tutorial provides a step-by-step guide that covers commands for building and running kernel.
Executable Usage¶
- Work Directory(Step 1)
The steps for library download and environment setup can be found in Vitis Graph Library. For getting the design,
cd L2/benchmarks/triangle_count
- Build kernel(Step 2)
Run the following make command to build your XCLBIN and host binary targeting a specific device. Please be noticed that this process will take a long time, maybe couple of hours.
make run TARGET=hw DEVICE=xilinx_u250_xdma_201830_2
- Run kernel(Step 3)
To get the benchmark results, please run the following command.
./build_dir.hw.xilinx_u250_xdma_201830_2/host.exe -xclbin build_dir.hw.xilinx_u250_xdma_201830_2/TC_kernel.xclbin -o data/csr_offsets.txt -i data/csr_columns.txt
Triangle Counting Input Arguments:
Usage: host.exe -[-xclbin -o -i] -xclbin triangle count binary -o offset file of input graph in CSR format -i edge file of input graph in CSR format
Note: Default arguments are set in Makefile, you can use other Datasets listed in the table.
- Example output(Step 4)
---------------------Triangle Count----------------- vertex 2,max_diff=4 Found Platform Platform Name: Xilinx Found Device=xilinx_u250_xdma_201830_2 INFO: Importing build_dir.hw.xilinx_u250_xdma_201830_2/TC_kernel.xclbin Loading: 'build_dir.hw.xilinx_u250_xdma_201830_2/TC_kernel.xclbin' kernel has been created ... kernel start------ ... kernel end------ Execution time 69.292ms Write DDR Execution time 62.7083 ms Kernel Execution time 6.32296 ms Read DDR Execution time 0.098528 ms Total Execution time 69.1972 ms INFO: case pass!
Profiling¶
The hardware resource utilizations are listed in the following table.
Kernel | BRAM | URAM | DSP | LUT | Frequency(MHz) |
TC_Kernel | 62 | 16 | 0 | 21001 | 300 |
The performance is shown in the table below.
Datasets | Vertex | Edges | FPGA time (u250) | Spark (4 threads) | Spark (8 threads) | Spark (16 threads) | Spark (32 threads) | ||||
Spark time | speedup | Spark time | speedup | Spark time | speedup | Spark time | speedup | ||||
as-Skitter | 1694616 | 11094209 | 53.05 | 46.5 | 0.88 | 31.30 | 0.59 | 25.66 | 0.48 | 26.60 | 0.50 |
coPapersDBLP | 540486 | 15245729 | 4.37 | 68.0 | 15.55 | 42.08 | 9.63 | 29.55 | 6.76 | 33.15 | 7.59 |
coPapersCiteseer | 434102 | 16036720 | 6.80 | 74.4 | 10.94 | 38.74 | 5.70 | 37.42 | 5.50 | 33.87 | 4.98 |
cit-Patents | 3774768 | 16518948 | 0.80 | 75.8 | 95.10 | 57.20 | 71.50 | 44.87 | 56.09 | 39.61 | 49.51 |
europe_osm | 50912018 | 54054660 | 1.08 | 577.1 | 534.07 | 295.57 | 273.68 | 221.86 | 205.43 | 144.68 | 133.96 |
hollywood | 1139905 | 57515616 | 113.48 | 395.0 | 3.49 | 246.42 | 2.17 | 220.90 | 1.95 | – | – |
soc-LiveJournal1 | 4847571 | 68993773 | 21.17 | 194.3 | 9.18 | 121.15 | 5.72 | 104.64 | 4.94 | 149.34 | 7.05 |
ljournal-2008 | 5363260 | 79023142 | 19.73 | 223.5 | 11.33 | 146.63 | 7.43 | 171.35 | 8.68 | – | – |
GEOMEAN | 9.47 | 143.2 | 15.1X | 88.54 | 9.4X | 76.05 | 8.0X | 54.27 | 9.8X |
Note