Debug Profile

This is a simple example of vector addition that prints profile data (the wall-clock time taken between start and stop). It also dumps a waveform file that can be reloaded in Vivado to view the waveform. Run the command ‘vivado -source ./scripts/open_waveform.tcl -tclargs <device_name>-<kernel_name>.<target>.<device_name>.wdb’ to launch the waveform viewer. The user can also change batch to gui in the xrt.ini file to see the live waveform while running the application. The example also demonstrates the use of hls::print to print a format string/int/double argument to standard output, and to the simulation log in cosim and HW_EMU.

KEY CONCEPTS: Use of Profile API, Waveform Dumping and loading

KEYWORDS: debug_mode=gui/batch, user_range, user_event, hls::print

The Vitis development environment can generate a waveform view and launch a live waveform viewer when running hardware emulation. It displays in-depth details of the emulation results at the system level, the compute unit level, and the function level. The details include data transfers between the kernel and global memory, data flow via inter-kernel pipes, and data flow via intra-kernel pipes. They provide insight into performance bottlenecks from the system level down to individual function calls, helping developers optimize their applications.

This example uses a simple vector addition kernel to demonstrate the debugging information that can be viewed in the waveform.

The xrt.ini file is used to launch the waveform viewer. The waveform can be viewed at runtime by launching the GUI with the following setting in this file:

[Emulation]
debug_mode=GUI

The waveform can also be viewed from the .wdb file generated during hardware emulation, which can be opened in Vivado with the commands in the provided script scripts/open_waveform.tcl. In this case, add the following setting to the xrt.ini file:

[Emulation]
debug_mode=batch

Waveforms are helpful for viewing data transfers from the host to memory as well as data transfers from each AXI master port. Another feature the waveform viewer provides is CU stalls. The stall bus compiles all of the lowest-level stall signals and reports the percentage that are stalling at any point in time. This indicates how much of the kernel is stalling at any point in the simulation, and the user can optimize the design to improve hardware utilization based on these stall signals.
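As a rough illustration of the percentage described above, the following C++ sketch computes the share of stall signals that are asserted in a single sample. The function name stall_percentage and the representation of the signals as a vector of booleans are assumptions made for illustration only; the waveform viewer derives its value from the simulation signals, and its internal computation is not documented here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Vitis or XRT): given one sample of the
// lowest-level stall signals, return the percentage (0 to 100) that are
// currently asserted.
double stall_percentage(const std::vector<bool>& stall_signals) {
    if (stall_signals.empty()) {
        return 0.0;  // no signals sampled, so nothing is stalling
    }
    std::size_t stalled = 0;
    for (bool asserted : stall_signals) {
        if (asserted) {
            ++stalled;
        }
    }
    const double pct = 100.0 * static_cast<double>(stalled) /
                       static_cast<double>(stall_signals.size());
    assert(pct >= 0.0 && pct <= 100.0);
    return pct;
}
```

For example, a sample in which two of four stall signals are asserted yields 50, meaning half of the monitored stall points are stalling at that instant.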

If the user wants to record profiling information for arbitrary sections of their code, the following two features can be used:

  1. user_range - Profiles and captures the data in the specified range

  2. user_event - Marks the event in the timeline trace
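To make the idea of a profiled range concrete, here is a minimal host-side sketch that records the wall-clock time between a start and a stop around a software vector addition. It deliberately uses only std::chrono so that it runs without XRT installed. The class WallClockRange and the function vadd_reference are hypothetical names introduced for illustration; they are not the XRT user_range API and not the code in src/host.cpp.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a profiled range: it records the wall-clock
// time between start() and end(). This is NOT the XRT user_range class.
class WallClockRange {
public:
    explicit WallClockRange(std::string label) : label_(std::move(label)) {}

    // Marks the beginning of the measured section.
    void start() {
        begin_ = std::chrono::steady_clock::now();
        running_ = true;
    }

    // Marks the end of the section and returns the elapsed microseconds.
    double end() {
        assert(running_ && "end() called without a matching start()");
        const auto stop = std::chrono::steady_clock::now();
        running_ = false;
        elapsed_us_ =
            std::chrono::duration<double, std::micro>(stop - begin_).count();
        return elapsed_us_;
    }

    const std::string& label() const { return label_; }
    double elapsed_us() const { return elapsed_us_; }

private:
    std::string label_;
    std::chrono::steady_clock::time_point begin_{};
    double elapsed_us_ = 0.0;
    bool running_ = false;
};

// Software reference for an element-wise vector addition.
std::vector<int> vadd_reference(const std::vector<int>& a,
                                const std::vector<int>& b) {
    assert(a.size() == b.size());
    std::vector<int> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = a[i] + b[i];
    }
    return out;
}
```

A caller would construct a WallClockRange with a label, call start() before the section of interest, run the work (here, vadd_reference), and call end() to obtain the elapsed time. A real profiled range plays the same role, except that the runtime records the interval in the trace rather than returning it to the caller.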

The user can also use the hls::print function to print a format string/int/double argument to standard output, and to the simulation log in cosim and HW_EMU. It can be used to trace the order in which code blocks are executed across complex control and concurrent execution (e.g. in dataflow) or trace the values of some selected variables.

When used in this simple example:

#include "hls_print.h"
...
    hls::print("Number of elements : %d\n", length_r);
    hls::print("Buffer size : %d\n", BUFFER_SIZE);

...

it prints the “Number of elements” and “Buffer size” values in C simulation, SW emulation, RTL cosimulation and HW emulation. It is ignored in the HW flow and does not affect the kernel functionality.

EXCLUDED PLATFORMS:

  • All NoDMA platforms, e.g. u50 nodma

DESIGN FILES

Application code is located in the src directory. Accelerator binary files will be compiled to the xclbin directory. The xclbin directory is required by the Makefile, and its contents will be filled during compilation. A listing of all the files in this example is shown below:

src/host.cpp
src/host.h
src/vadd.cpp

Access these files in the GitHub repo by clicking here.

COMMAND LINE ARGUMENTS

Once the environment has been configured, the application can be executed with:

./debug_profile <vadd XCLBIN>