Duplicate Record Match

The Duplicate Record Match demo resides in the L2/demos/text/dup_match directory. It implements duplicate record matching and consists of several modules, including Index, Predicate, Pair, Score, and Cluster.


  • Input file: a randomly generated CSV file of 10,000,000 lines (about 1 GB), similar to L2/demos/text/dup_match/data/test.csv, is used as the test input.
  • Demo execution time: 8,215.56 s.
  • Baseline (Dedupe Python: https://github.com/dedupeio/dedupe) execution time: 35,030.751 s.
  • Acceleration ratio: 4.26X (35,030.751 s / 8,215.56 s)
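
A large test input of this kind could be produced with a short script. The following is a minimal sketch, not the generator used for the benchmark: the Site name, Address, and Zip columns are taken from the predicates listed in this document, while the Id column, the value vocabulary, the duplicate rate, and the function name are hypothetical. The real test.csv may contain more columns.

```python
import csv
import io
import random

# Hypothetical vocabulary; the real data/test.csv contents are not reproduced here.
NAMES = ["Sunrise Center", "Little Stars", "Oak Park Academy", "Lakeside School"]
STREETS = ["Main St", "Oak Ave", "Lake Dr", "Hill Rd"]


def write_random_csv(out, n_rows, dup_rate=0.1, seed=0):
    """Write n_rows records; roughly dup_rate of them repeat an earlier record."""
    rng = random.Random(seed)
    writer = csv.writer(out)
    # Assumed header: only the last three names appear in this document.
    writer.writerow(["Id", "Site name", "Address", "Zip"])
    recent = []  # bounded pool of earlier records to duplicate from
    for i in range(n_rows):
        if recent and rng.random() < dup_rate:
            name, addr, zip_code = rng.choice(recent)
        else:
            name = f"{rng.choice(NAMES)} {rng.randint(1, 9999)}"
            addr = f"{rng.randint(1, 9999)} {rng.choice(STREETS)}"
            zip_code = f"{rng.randint(10000, 99999)}"
            recent.append((name, addr, zip_code))
            if len(recent) > 1000:
                recent.pop(0)
        writer.writerow([i, name, addr, zip_code])


# Small in-memory demonstration; pass an open file and
# n_rows=10_000_000 to produce a full-size input.
buf = io.StringIO()
write_random_csv(buf, n_rows=1000)
lines = buf.getvalue().splitlines()
```

A fixed seed keeps the generated file reproducible, which makes repeated benchmark runs comparable.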


1. The baseline version runs on an Intel(R) Xeon(R) CPU E5-2690 v4 clocked at 2.60 GHz.
2. The training result of Baseline includes self.predicate=((TfidfNGramCanopyPredicate: (0.8, Site name), TfidfTextCanopyPredicate: (0.8, Address)), (SimplePredicate: (alphaNumericPredicate, Site name), TfidfTextCanopyPredicate: (0.8, Site name)), (SimplePredicate: (wholeFieldPredicate, Site name), SimplePredicate: (wholeFieldPredicate, Zip))).
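
To illustrate how a compound predicate from such a training result can be used for blocking, the sketch below groups records so that two records become a candidate pair only if they share a key for every predicate in the compound. It is a simplified pure-Python illustration, not the Dedupe API or the FPGA kernel: only the whole-field predicate from the third compound above is modeled, and the sample records and helper names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def whole_field(value):
    """Whole-field predicate: the entire (stripped) field is the blocking key."""
    value = value.strip()
    return (value,) if value else ()


def compound_block(records, fields):
    """Group record ids whose whole-field keys match on every listed field."""
    blocks = defaultdict(list)
    for rec_id, rec in records.items():
        keys = [whole_field(rec[f]) for f in fields]
        if all(keys):  # skip records where any field is empty
            blocks[tuple(k[0] for k in keys)].append(rec_id)
    return blocks


def candidate_pairs(blocks):
    """Every pair of records sharing a block is a candidate duplicate."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical sample records
records = {
    0: {"Site name": "Sunrise Center", "Address": "12 Main St", "Zip": "60601"},
    1: {"Site name": "Sunrise Center", "Address": "12 Main Street", "Zip": "60601"},
    2: {"Site name": "Sunrise Center", "Address": "9 Oak Ave", "Zip": "60614"},
    3: {"Site name": "Little Stars", "Address": "3 Lake Dr", "Zip": "60601"},
}
blocks = compound_block(records, ["Site name", "Zip"])
pairs = candidate_pairs(blocks)
print(pairs)  # {(0, 1)}
```

Pairs found this way are only candidates; they would still need to be scored and clustered before being reported as duplicates.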

Executable Usage

  • Work Directory (Step 1)

The steps for downloading the library and setting up the environment are described in the Vitis Data Analytics Library documentation. To get the design, change to the demo directory:

cd L2/demos/text/dup_match

  • Build kernel (Step 2)

Run the following make command to build the XCLBIN and host binary for a specific device. Note that this process takes a long time, possibly a couple of hours.

make run TARGET=hw DEVICE=xilinx_u50_gen3x16_xdma_201920_3 HOST_ARCH=x86

  • Run kernel (Step 3)

To get the benchmark results, run the following command:

./build_dir.hw.xilinx_u50_gen3x16_xdma_201920_3/host.exe -xclbin ./build_dir.hw.xilinx_u50_gen3x16_xdma_201920_3/TGP_Kernel.xclbin -in ./data/test.csv -golden ./data/golden.txt

Duplicate Record Match Input Arguments:

Usage: host.exe -xclbin <xclbin_name> -in <input data>  -golden <golden data>
       -xclbin:     the kernel name
       -in    :     input data
       -golden:     golden data
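
For scripted runs, the command line can be assembled programmatically. The sketch below only builds the argument list from the flags documented in the usage text; the helper name and the default paths are assumptions that mirror the example command, and the commented-out call shows how it might be executed on a machine with the built binaries and the card installed.

```python
import subprocess  # only needed if the command is actually executed

DEVICE = "xilinx_u50_gen3x16_xdma_201920_3"


def build_command(device=DEVICE, in_csv="./data/test.csv", golden="./data/golden.txt"):
    """Return the host.exe argument list for the given device build directory."""
    build_dir = f"./build_dir.hw.{device}"
    return [
        f"{build_dir}/host.exe",
        "-xclbin", f"{build_dir}/TGP_Kernel.xclbin",
        "-in", in_csv,
        "-golden", golden,
    ]


cmd = build_command()
print(" ".join(cmd))
# To run it from L2/demos/text/dup_match (requires the built binaries and hardware):
# subprocess.run(cmd, check=True)
```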

  • Example output (Step 4)

---------------------Duplicate Record Matching Flow-----------------
TwoGramPredicate: column map size=14
tf_value_ size is 238, index count=14, term count=122, skip=0
config=15, 316
config=15, 301
Found Platform
Platform Name: Xilinx
Found Device=xilinx_u50_gen3x16_xdma_201920_3
INFO: Importing build_dir.hw.xilinx_u50_gen3x16_xdma_201920_3/TGP_Kernel.xclbin
Loading: 'build_dir.hw.xilinx_u50_gen3x16_xdma_201920_3/TGP_Kernel.xclbin'
kernel has been created
kernel start------
index count=11, term count=65, skip=0
index count=14, term count=36, skip=0
CompoundPredicate: pair size=30
CompoundPredicate: pair size=30
CompoundPredicate: pair size=36
duplicate sets 10
DupMatch::run End
Execution time 8.979s
Pass validation.
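
When the demo is run from a script, the key figures can be extracted from this log. The following sketch parses the summary lines of the example output with regular expressions; the function name is hypothetical, and the patterns assume the log format stays exactly as shown in this example.

```python
import re


def parse_summary(log_text):
    """Extract duplicate-set count, execution time (s) and validation status."""
    sets = re.search(r"^duplicate sets (\d+)\s*$", log_text, re.MULTILINE)
    time_s = re.search(r"^Execution time ([\d.]+)s\s*$", log_text, re.MULTILINE)
    return {
        "duplicate_sets": int(sets.group(1)) if sets else None,
        "execution_time_s": float(time_s.group(1)) if time_s else None,
        "passed": "Pass validation." in log_text,
    }


# Tail of the example output from this document
sample = """duplicate sets 10
DupMatch::run End
Execution time 8.979s
Pass validation.
"""
summary = parse_summary(sample)
print(summary)
```

Missing lines yield None rather than an exception, so a failed or truncated run can be detected by the caller.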



The duplicate record match design is validated on an Alveo U50 board at 270 MHz. The hardware resource utilization is listed in the following table.

Table 1 Hardware resources for duplicate record match

Name             LUT      BRAM   URAM     DSP
Platform         135778   180    0        4
TGP_Kernel       272031   50     260      506
TGP_Kernel_1     135974   25     130      253
TGP_Kernel_2     136057   25     130      253
User Budget      734238   1164   640      5936
Used Resources   272031   50     260      506
Percentage       37.05%   4.30%  40.63%   8.52%
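
The Percentage row is the Used Resources row divided by the User Budget row. As a quick check, the sketch below recomputes it from the values in the table; the resource labels are an assumption about the column order (LUT, BRAM, URAM, DSP), and half-up rounding is chosen because it reproduces the listed figures.

```python
from decimal import Decimal, ROUND_HALF_UP

# Values copied from the table rows; the labels are assumed
# (LUT, BRAM, URAM, DSP in column order).
RESOURCES = ["LUT", "BRAM", "URAM", "DSP"]
USER_BUDGET = [734238, 1164, 640, 5936]
USED = [272031, 50, 260, 506]


def utilization(used, budget):
    """Percentage of the budget in use, rounded half-up to two decimals."""
    pct = Decimal(used) * 100 / Decimal(budget)
    return pct.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


percentages = {
    name: f"{utilization(u, b)}%"
    for name, u, b in zip(RESOURCES, USED, USER_BUDGET)
}
print(percentages)
```

Decimal arithmetic is used instead of floats because 260/640 is exactly 40.625%, which Python's default float formatting would round to 40.62 rather than 40.63.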

The performance is shown below.
The input file is a randomly generated CSV file of 10,000,000 lines (about 1 GB), similar to L2/demos/text/dup_match/data/test.csv. Its execution time is 8,215.56 s, so the throughput is 124.64 KB/s.
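
The throughput follows from dividing the input size by the execution time, and the acceleration ratio from dividing the baseline time by the demo time. The sketch below recomputes both from the timings quoted in this document. The exact byte count of the input is not given, so the size constant is an assumption; it is chosen because it reproduces the quoted 124.64 figure, which then comes out in KB/s.

```python
DEMO_TIME_S = 8215.56        # FPGA demo execution time from this document
BASELINE_TIME_S = 35030.751  # Dedupe Python baseline execution time
INPUT_SIZE_KB = 1024 * 1000  # assumption: "about 1GB" taken as 1,024,000 KB

throughput_kb_s = INPUT_SIZE_KB / DEMO_TIME_S
speedup = BASELINE_TIME_S / DEMO_TIME_S

print(f"throughput: {throughput_kb_s:.2f} KB/s")  # 124.64 KB/s
print(f"speedup:    {speedup:.2f}X")              # 4.26X
```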