.. Copyright 2021 Xilinx, Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. .. _l3_cornertracker: ============== Corner Tracker ============== Corner Tracker example resides in ``L3/examples/cornertracker`` directory. This benchmark tests the performance of `cornertracker` function with a sequence of 2 images. This example illustrates how to detect and track the characteristic feature points in a set of successive frames of video. A Harris corner detector is used as the feature detector, and a modified version of Lucas Kanade optical flow is used for tracking. The core part of the algorithm takes in current and next frame as the inputs and outputs the list of tracked corners. The current image is the first frame in the set, then corner detection is performed to detect the features to track. The tutorial provides a step-by-step guide that covers commands for building and running kernel. Executable Usage ================ * **Work Directory(Step 1)** The steps for library download and environment setup can be found in README of L3 folder. For getting the design, .. code-block:: bash cd L3/examples/cornertracker * **Build kernel(Step 2)** Run the following make command to build your XCLBIN and host binary targeting a specific device. Please be noticed that this process will take a long time, maybe couple of hours. .. code-block:: bash export OPENCV_INCLUDE=< path-to-opencv-include-folder > export OPENCV_LIB=< path-to-opencv-lib-folder > export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:< path-to-opencv-lib-folder > export DEVICE=< path-to-platform-directory >/< platform >.xpfm make host xclbin TARGET=hw * **Run kernel(Step 3)** To get the benchmark results, please run the following command. .. code-block:: bash make run TARGET=hw * **Example output(Step 4)** .. code-block:: bash -----------Corner Tracker Design--------------- Found Platform Platform Name: Xilinx XCLBIN File Name: krnl_cornertracker INFO: Importing vision/L3/tests/cornertracker/cornertrack/Xilinx_Cornertrack_L3_Test_vitis_hw_u200/build_dir.hw.xilinx_u200_xdma_201830_2/krnl_cornertracker.xclbin Loading: 'vision/L3/tests/cornertracker/cornertrack/Xilinx_Cornertrack_L3_Test_vitis_hw_u200/build_dir.hw.xilinx_u200_xdma_201830_2/krnl_cornertracker.xclbin' *************************************************** Test Case no: 1 Harris Execution Harris: Buffers created Harris: data copied to device Harris: args set Harris kernel called Harris Done Pyrdown Execution Pyrdown: Buffers created Pyrdown: data copied to device Pyrdown Args set pyr in ht = 144 pyr in wd = 256 pyr out ht = 72 pyr out wd = 128 Pyrdown kernel called Pyrdown data copied to host Pyrdown Execution done Pyrdown: Buffers created Pyrdown: data copied to device Pyrdown Args set pyr in ht = 72 pyr in wd = 128 pyr out ht = 36 pyr out wd = 64 Pyrdown kernel called Pyrdown data copied to host Pyrdown Execution done Pyrdown: Buffers created Pyrdown: data copied to device Pyrdown Args set pyr in ht = 36 pyr in wd = 64 pyr out ht = 18 pyr out wd = 32 Pyrdown kernel called Pyrdown data copied to host Pyrdown Execution done Pyrdown: Buffers created Pyrdown: data copied to device Pyrdown Args set pyr in ht = 18 pyr in wd = 32 pyr out ht = 9 pyr out wd = 16 Pyrdown kernel called Pyrdown data copied to host Pyrdown Execution done **********Optical Flow Computation******************* Buffers created *********OF Computation Level =4********* *********OF Computation iteration =0********* Data copied from host to device kernel args set flow_rows =9flow_cols=16flow_in_rows =9flow_in_cols =16 level4calls done0 *********OF Computation iteration =1********* Data copied from host to device kernel args set flow_rows =9flow_cols=16flow_in_rows =9flow_in_cols =16 level4calls done1 *********OF Computation iteration =2********* Data copied from host to device kernel args set flow_rows =9flow_cols=16flow_in_rows =9flow_in_cols =16 level4calls done2 *********OF Computation iteration =3********* Data copied from host to device kernel args set flow_rows =9flow_cols=16flow_in_rows =9flow_in_cols =16 level4calls done3 *********OF Computation iteration =4********* Data copied from host to device kernel args set flow_rows =9flow_cols=16flow_in_rows =9flow_in_cols =16 level4calls done4 Buffers created *********OF Computation Level =3********* *********OF Computation iteration =0********* Data copied from host to device kernel args set flow_rows =18flow_cols=32flow_in_rows =9flow_in_cols =16 level3calls done0 *********OF Computation iteration =1********* Data copied from host to device kernel args set flow_rows =18flow_cols=32flow_in_rows =18flow_in_cols =32 level3calls done1 *********OF Computation iteration =2********* Data copied from host to device kernel args set flow_rows =18flow_cols=32flow_in_rows =18flow_in_cols =32 level3calls done2 *********OF Computation iteration =3********* Data copied from host to device kernel args set flow_rows =18flow_cols=32flow_in_rows =18flow_in_cols =32 level3calls done3 *********OF Computation iteration =4********* Data copied from host to device kernel args set flow_rows =18flow_cols=32flow_in_rows =18flow_in_cols =32 level3calls done4 Buffers created *********OF Computation Level =2********* *********OF Computation iteration =0********* Data copied from host to device kernel args set flow_rows =36flow_cols=64flow_in_rows =18flow_in_cols =32 level2calls done0 *********OF Computation iteration =1********* Data copied from host to device kernel args set flow_rows =36flow_cols=64flow_in_rows =36flow_in_cols =64 level2calls done1 *********OF Computation iteration =2********* Data copied from host to device kernel args set flow_rows =36flow_cols=64flow_in_rows =36flow_in_cols =64 level2calls done2 *********OF Computation iteration =3********* Data copied from host to device kernel args set flow_rows =36flow_cols=64flow_in_rows =36flow_in_cols =64 level2calls done3 *********OF Computation iteration =4********* Data copied from host to device kernel args set flow_rows =36flow_cols=64flow_in_rows =36flow_in_cols =64 level2calls done4 Buffers created *********OF Computation Level =1********* *********OF Computation iteration =0********* Data copied from host to device kernel args set flow_rows =72flow_cols=128flow_in_rows =36flow_in_cols =64 level1calls done0 *********OF Computation iteration =1********* Data copied from host to device kernel args set flow_rows =72flow_cols=128flow_in_rows =72flow_in_cols =128 level1calls done1 *********OF Computation iteration =2********* Data copied from host to device kernel args set flow_rows =72flow_cols=128flow_in_rows =72flow_in_cols =128 level1calls done2 *********OF Computation iteration =3********* Data copied from host to device kernel args set flow_rows =72flow_cols=128flow_in_rows =72flow_in_cols =128 level1calls done3 *********OF Computation iteration =4********* Data copied from host to device kernel args set flow_rows =72flow_cols=128flow_in_rows =72flow_in_cols =128 level1calls done4 Buffers created *********OF Computation Level =0********* *********OF Computation iteration =0********* Data copied from host to device kernel args set flow_rows =144flow_cols=256flow_in_rows =72flow_in_cols =128 level0calls done0 *********OF Computation iteration =1********* Data copied from host to device kernel args set flow_rows =144flow_cols=256flow_in_rows =144flow_in_cols =256 level0calls done1 *********OF Computation iteration =2********* Data copied from host to device kernel args set flow_rows =144flow_cols=256flow_in_rows =144flow_in_cols =256 level0calls done2 *********OF Computation iteration =3********* Data copied from host to device kernel args set flow_rows =144flow_cols=256flow_in_rows =144flow_in_cols =256 level0calls done3 *********OF Computation iteration =4********* Data copied from host to device kernel args set flow_rows =144flow_cols=256flow_in_rows =144flow_in_cols =256 level0calls done4 OF done **********Corner Update Computation******************* kernel args set flow_rows =144flow_cols=256num of corners=47harris_flag=1 Corner Update called Corner Update done OF done Num of corners = 47 *************************************************** *************************************************** Test Case no: 2 Harris Execution Harris: Buffers created Harris: data copied to device Harris: args set Harris kernel called Harris Done Pyrdown Execution Pyrdown: Buffers created Pyrdown: data copied to device Pyrdown Args set pyr in ht = 144 pyr in wd = 256 pyr out ht = 72 pyr out wd = 128 Pyrdown kernel called Pyrdown data copied to host Pyrdown Execution done Pyrdown: Buffers created Pyrdown: data copied to device Pyrdown Args set pyr in ht = 72 pyr in wd = 128 pyr out ht = 36 pyr out wd = 64 Pyrdown kernel called Pyrdown data copied to host Pyrdown Execution done Pyrdown: Buffers created Pyrdown: data copied to device Pyrdown Args set pyr in ht = 36 pyr in wd = 64 pyr out ht = 18 pyr out wd = 32 Pyrdown kernel called Pyrdown data copied to host Pyrdown Execution done Pyrdown: Buffers created Pyrdown: data copied to device Pyrdown Args set pyr in ht = 18 pyr in wd = 32 pyr out ht = 9 pyr out wd = 16 Pyrdown kernel called Pyrdown data copied to host Pyrdown Execution done **********Optical Flow Computation******************* Buffers created *********OF Computation Level =4********* *********OF Computation iteration =0********* Data copied from host to device kernel args set flow_rows =9flow_cols=16flow_in_rows =9flow_in_cols =16 level4calls done0 *********OF Computation iteration =1********* Data copied from host to device kernel args set flow_rows =9flow_cols=16flow_in_rows =9flow_in_cols =16 level4calls done1 *********OF Computation iteration =2********* Data copied from host to device kernel args set flow_rows =9flow_cols=16flow_in_rows =9flow_in_cols =16 level4calls done2 *********OF Computation iteration =3********* Data copied from host to device kernel args set flow_rows =9flow_cols=16flow_in_rows =9flow_in_cols =16 level4calls done3 *********OF Computation iteration =4********* Data copied from host to device kernel args set flow_rows =9flow_cols=16flow_in_rows =9flow_in_cols =16 level4calls done4 Buffers created *********OF Computation Level =3********* *********OF Computation iteration =0********* Data copied from host to device kernel args set flow_rows =18flow_cols=32flow_in_rows =9flow_in_cols =16 level3calls done0 *********OF Computation iteration =1********* Data copied from host to device kernel args set flow_rows =18flow_cols=32flow_in_rows =18flow_in_cols =32 level3calls done1 *********OF Computation iteration =2********* Data copied from host to device kernel args set flow_rows =18flow_cols=32flow_in_rows =18flow_in_cols =32 level3calls done2 *********OF Computation iteration =3********* Data copied from host to device kernel args set flow_rows =18flow_cols=32flow_in_rows =18flow_in_cols =32 level3calls done3 *********OF Computation iteration =4********* Data copied from host to device kernel args set flow_rows =18flow_cols=32flow_in_rows =18flow_in_cols =32 level3calls done4 Buffers created *********OF Computation Level =2********* *********OF Computation iteration =0********* Data copied from host to device kernel args set flow_rows =36flow_cols=64flow_in_rows =18flow_in_cols =32 level2calls done0 *********OF Computation iteration =1********* Data copied from host to device kernel args set flow_rows =36flow_cols=64flow_in_rows =36flow_in_cols =64 level2calls done1 *********OF Computation iteration =2********* Data copied from host to device kernel args set flow_rows =36flow_cols=64flow_in_rows =36flow_in_cols =64 level2calls done2 *********OF Computation iteration =3********* Data copied from host to device kernel args set flow_rows =36flow_cols=64flow_in_rows =36flow_in_cols =64 level2calls done3 *********OF Computation iteration =4********* Data copied from host to device kernel args set flow_rows =36flow_cols=64flow_in_rows =36flow_in_cols =64 level2calls done4 Buffers created *********OF Computation Level =1********* *********OF Computation iteration =0********* Data copied from host to device kernel args set flow_rows =72flow_cols=128flow_in_rows =36flow_in_cols =64 level1calls done0 *********OF Computation iteration =1********* Data copied from host to device kernel args set flow_rows =72flow_cols=128flow_in_rows =72flow_in_cols =128 level1calls done1 *********OF Computation iteration =2********* Data copied from host to device kernel args set flow_rows =72flow_cols=128flow_in_rows =72flow_in_cols =128 level1calls done2 *********OF Computation iteration =3********* Data copied from host to device kernel args set flow_rows =72flow_cols=128flow_in_rows =72flow_in_cols =128 level1calls done3 *********OF Computation iteration =4********* Data copied from host to device kernel args set flow_rows =72flow_cols=128flow_in_rows =72flow_in_cols =128 level1calls done4 Buffers created *********OF Computation Level =0********* *********OF Computation iteration =0********* Data copied from host to device kernel args set flow_rows =144flow_cols=256flow_in_rows =72flow_in_cols =128 level0calls done0 *********OF Computation iteration =1********* Data copied from host to device kernel args set flow_rows =144flow_cols=256flow_in_rows =144flow_in_cols =256 level0calls done1 *********OF Computation iteration =2********* Data copied from host to device kernel args set flow_rows =144flow_cols=256flow_in_rows =144flow_in_cols =256 level0calls done2 *********OF Computation iteration =3********* Data copied from host to device kernel args set flow_rows =144flow_cols=256flow_in_rows =144flow_in_cols =256 level0calls done3 *********OF Computation iteration =4********* Data copied from host to device kernel args set flow_rows =144flow_cols=256flow_in_rows =144flow_in_cols =256 level0calls done4 OF done **********Corner Update Computation******************* kernel args set flow_rows =144flow_cols=256num of corners=47harris_flag=0 Corner Update called Corner Update done OF done Num of corners = 47 *************************************************** ------------------------------------------------------------ Profiling ========= The corner tracker design is validated on Alveo U200 board at 300 MHz frequency. The hardware resource utilizations are listed in the following table. .. table:: Table 1 Hardware resources for Corner Tracker :align: center +-------------+----------------------------+--------------+-----------+----------+--------+ | Name | Dataset | LUT | BRAM | FF | DSP | +=============+============================+==============+===========+==========+========+ |cornertracker| 2x4K images, 8 NPPC | 37547 | 225 | 35089 | 115 | +-------------+----------------------------+--------------+-----------+----------+--------+ |cornertracker| 2xFull HD images, 8 NPPC | 34624 | 129 | 34198 | 115 | +-------------+----------------------------+--------------+-----------+----------+--------+ The performance is shown below .. table:: Table 2 Performance numbers in terms of FPS (Frames Per Second) for 2 consecutive frames for Corner Tracker :align: center +----------------------+--------------+--------------+ | Dataset | FPS(CPU) | FPS(FPGA) | +======================+==============+==============+ | 4k (3840x2160) | 0.53 | 3 | +----------------------+--------------+--------------+ | Full HD(1920x1080) | 2 | 11 | +----------------------+--------------+--------------+ .. toctree:: :maxdepth: 1