Dense Pyramidal LK Optical Flow¶
Dense Pyramidal LK Optical Flow example resides in L2/examples/lkdensepyrof
directory.
This benchmark tests the performance of lkdensepyrof function with a pair of images. Optical flow is the pattern of apparent motion of image objects between two consecutive frames, caused by the movement of object or camera. It is a 2D vector field, where each vector is a displacement vector showing the movement of points from first frame to second.
The tutorial provides a step-by-step guide that covers commands for building and running kernel.
Executable Usage¶
Work Directory(Step 1)
The steps for library download and environment setup can be found in README of L2 folder. For getting the design,
cd L2/examples/lkdensepyrof
Build kernel(Step 2)
Run the following make command to build your XCLBIN and host binary targeting a specific device. Please be noticed that this process will take a long time, maybe couple of hours.
export OPENCV_INCLUDE=< path-to-opencv-include-folder >
export OPENCV_LIB=< path-to-opencv-lib-folder >
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:< path-to-opencv-lib-folder >
export DEVICE=< path-to-platform-directory >/< platform >.xpfm
make host xclbin TARGET=hw
Run kernel(Step 3)
To get the benchmark results, please run the following command.
make run TARGET=hw
Example output(Step 4)
-----------Optical Flow Design---------------
Found Platform
Platform Name: Xilinx
XCLBIN File Name: krnl_pyr_dense_optical_flow
INFO: Importing vision/L2/examples/lkdensepyrof/Xilinx_Lkdensepyrof_L2_Test_vitis_hw_u200/build_dir.hw.xilinx_u200_xdma_201830_2/krnl_pyr_dense_optical_flow.xclbin
Loading: 'vision/L2/examples/lkdensepyrof/Xilinx_Lkdensepyrof_L2_Test_vitis_hw_u200/build_dir.hw.xilinx_u200_xdma_201830_2/krnl_pyr_dense_optical_flow.xclbin'
*********Pyr Down Execution*********
CL buffer created
data copied to host
Kernel args set
opencv
0 image 0 level pyrdown done
CL buffer created
data copied to host
Kernel args set
opencv
0 image 1 level pyrdown done
CL buffer created
data copied to host
Kernel args set
opencv
0 image 2 level pyrdown done
CL buffer created
data copied to host
Kernel args set
opencv
0 image 3 level pyrdown done
One image done
CL buffer created
data copied to host
Kernel args set
opencv
1 image 0 level pyrdown done
CL buffer created
data copied to host
Kernel args set
opencv
1 image 1 level pyrdown done
CL buffer created
data copied to host
Kernel args set
opencv
1 image 2 level pyrdown done
CL buffer created
data copied to host
Kernel args set
opencv
1 image 3 level pyrdown done
One image done
*********Pyr Down Done*********
*********Starting OF Computation*********
Buffers created
*********OF Computation Level = 4*********
*********OF Computation iteration = 0*********
Data copied from host to device
kernel args set
4 level 0 calls done
*********OF Computation iteration = 1*********
Data copied from host to device
kernel args set
4 level 1 calls done
*********OF Computation iteration = 2*********
Data copied from host to device
kernel args set
4 level 2 calls done
*********OF Computation iteration = 3*********
Data copied from host to device
kernel args set
4 level 3 calls done
*********OF Computation iteration = 4*********
Data copied from host to device
kernel args set
4 level 4 calls done
Buffers created
*********OF Computation Level = 3*********
*********OF Computation iteration = 0*********
Data copied from host to device
kernel args set
3 level 0 calls done
*********OF Computation iteration = 1*********
Data copied from host to device
kernel args set
3 level 1 calls done
*********OF Computation iteration = 2*********
Data copied from host to device
kernel args set
3 level 2 calls done
*********OF Computation iteration = 3*********
Data copied from host to device
kernel args set
3 level 3 calls done
*********OF Computation iteration = 4*********
Data copied from host to device
kernel args set
3 level 4 calls done
Buffers created
*********OF Computation Level = 2*********
*********OF Computation iteration = 0*********
Data copied from host to device
kernel args set
2 level 0 calls done
*********OF Computation iteration = 1*********
Data copied from host to device
kernel args set
2 level 1 calls done
*********OF Computation iteration = 2*********
Data copied from host to device
kernel args set
2 level 2 calls done
*********OF Computation iteration = 3*********
Data copied from host to device
kernel args set
2 level 3 calls done
*********OF Computation iteration = 4*********
Data copied from host to device
kernel args set
2 level 4 calls done
Buffers created
*********OF Computation Level = 1*********
*********OF Computation iteration = 0*********
Data copied from host to device
kernel args set
1 level 0 calls done
*********OF Computation iteration = 1*********
Data copied from host to device
kernel args set
1 level 1 calls done
*********OF Computation iteration = 2*********
Data copied from host to device
kernel args set
1 level 2 calls done
*********OF Computation iteration = 3*********
Data copied from host to device
kernel args set
1 level 3 calls done
*********OF Computation iteration = 4*********
Data copied from host to device
kernel args set
1 level 4 calls done
Buffers created
*********OF Computation Level = 0*********
*********OF Computation iteration = 0*********
Data copied from host to device
kernel args set
0 level 0 calls done
*********OF Computation iteration = 1*********
Data copied from host to device
kernel args set
0 level 1 calls done
*********OF Computation iteration = 2*********
Data copied from host to device
kernel args set
0 level 2 calls done
*********OF Computation iteration = 3*********
Data copied from host to device
kernel args set
0 level 3 calls done
*********OF Computation iteration = 4*********
Data copied from host to device
kernel args set
0 level 4 calls done
------------------------------------------------------------
Profiling¶
The lkdensepyrof design is validated on Alveo U200 board at 300 MHz frequency. The hardware resource utilizations are listed in the following table.
Dataset |
LUT |
BRAM |
FF |
DSP |
||
---|---|---|---|---|---|---|
Resolution |
NPPC |
other params |
||||
4K |
1 |
5 iterations, 5 levels |
30781 |
182 |
26169 |
83 |
FHD |
1 |
5 iterations, 5 levels |
30839 |
107 |
25714 |
83 |
The performance is shown below
Dataset |
FPS(CPU) |
FPS(FPGA) |
---|---|---|
4k (3840x2160) |
0.15 |
3 |
Full HD(1920x1080) |
0.63 |
12 |