Port Width Widening =================== This example shows how HLS introduces the capability of resizing the port width of the kernel interface ports for better resource utilization maintaining the performance. **KEY CONCEPTS:** `Interface port width auto widening <https://docs.xilinx.com/r/en-US/ug1399-vitis-hls/Automatic-Port-Width-Resizing>`__ **KEYWORDS:** m_axi_max_widen_bitwidth This example introduces the capability of how Vitis HLS can configure the size of kernel interface ports. A few rules must be kept in mind by the user - 1. Pragma option value defined by user has higher priority than TCL. 2. The max_widen_bitwidth value should be in range [0, 1024], and it must be either 0 or a power of 2. If not satisfied, this value setting will be ignored. 3. If some ports are bundled together, one bundle name can have only one max_widen_bitwidth value. Therefore, if each port of a bundle has a different width, the maximum width under the bundle will be taken as the width for each of the ports. Vitis kernel can have m_axi interface which will be used by host application to configure the kernel. We have 5 kernels here each having the port width set in a a different way - 1. KERNEL 1 - Default case (no explict settings) - By default, HLS gives single M_AXI interface to access all pointer arguments (i.e. a,b and res here) and default width would be the maximum width datatype (i.e. 64bit here due to uint64_t). .. code:: cpp void dot_product_1(const uint32_t *a, const uint32_t *b, uint64_t *res, const int size, const int reps){ loop_reps: for (int i = 0; i < reps; i++) { dot_product: for (int j = 0; j < size; j++) { res[j] = a[j] * b[j]; } } } 2. KERNEL 2 - Auto port width widening when pipeline loop is fixed bound (i.e. DATA_WIDTH), HLS does auto port width widening when pipeline loop is fixed bound. Here pipeline loop dot_product_inner has fixed iteration of DATA_WIDTH, as a result, HLS is widening M_AXI port width to 512bit (Maximum). .. code:: cpp #define DATA_WIDTH 16 void dot_product_2(const uint32_t *a, const uint32_t *b, uint64_t *res, const int size, const int reps){ dot_product_outer: for (int j = 0; j < size; j += DATA_WIDTH) { dot_product_inner: for (int k = 0; k < DATA_WIDTH; k++) { res[j + k] = a[j + k] * b[j + k]; } } } 3. KERNEL 3 - pragmas specifying multiple bundles to infer multiple M_AXI interfaces. Here we are providing gmem0 to pointer a (Read) and res (write) and gmem1 to pointer b(read). .. code:: cpp #define DATA_WIDTH 16 void dot_product_3(const uint32_t *a, const uint32_t *b, uint64_t *res, const int size, const int reps) { #pragma HLS INTERFACE m_axi port=a bundle=gmem0 #pragma HLS INTERFACE m_axi port=b bundle=gmem1 #pragma HLS INTERFACE m_axi port=res bundle=gmem0 dot_product_outer: for (int j = 0; j < size; j += DATA_WIDTH) { dot_product_inner: for (int k = 0; k < DATA_WIDTH; k++) { res[j + k] = a[j + k] * b[j + k]; } } } 4. KERNEL 4 - Along with pragma in kernel, user can explicitly provide port width in tcl file (hls_config.tcl) as specified below: .. code:: cpp config_interface -m_axi_max_widen_bitwidth 512 The interface size setting need to be specified in hls_config.tcl file. We included this tcl file in our krnl_dot_product_4.cfg file and by using the ``--config`` tag in the kernel compile stage we specify the m_axi interface size. Following is the content of krnl_dot_product_4.cfg file .. code:: cpp [hls] pre_tcl=hls_config.tcl 5. KERNEL 5 - Interface pragma based port width allocation to each bundle. User can directly specifying portwidth to each M_AXI ports. Here user is setting 512 bit width to gmem0 and 256 bitwidth to gmem1. .. code:: cpp void dot_product_5(const uint32_t *a, const uint32_t *b, uint64_t *res, const int size, const int reps) { #pragma HLS INTERFACE m_axi port=a bundle=gmem0 max_widen_bitwidth=512 #pragma HLS INTERFACE m_axi port=b bundle=gmem1 max_widen_bitwidth=256 #pragma HLS INTERFACE m_axi port=res bundle=gmem0 Below are the resource numbers while running the design on U200 platform: ============= =========== =========== ============= ============ ==== ==== === Design port_size_a port_size_b port_size_res Bundle_Count BRAM LUT DSP ============= =========== =========== ============= ============ ==== ==== === dot_product_1 64 64 64 1 2 2237 3 dot_product_2 512 512 512 1 15 3665 48 dot_product_3 512 512 512 2 23 5319 48 dot_product_4 512 512 512 2 23 5316 48 dot_product_5 512 256 512 2 19 4939 48 ============= =========== =========== ============= ============ ==== ==== === Following is the real log reported while running the design on U200 platform: ========================== ===================== Kernel(1000000 iterations) Wall-Clock Time (sec) ========================== ===================== dot_product_1 66.8994 dot_product_2 2.57683 dot_product_3 1.14736 dot_product_4 1.14755 dot_product_5 1.26024 ========================== ===================== **EXCLUDED PLATFORMS:** - All NoDMA Platforms, i.e u50 nodma etc - Samsung U.2 SmartSSD - Versal VCK190 - All ZCU102 Base Platforms DESIGN FILES ------------ Application code is located in the src directory. Accelerator binary files will be compiled to the xclbin directory. The xclbin directory is required by the Makefile and its contents will be filled during compilation. A listing of all the files in this example is shown below :: src/dot_product_1.cpp src/dot_product_2.cpp src/dot_product_3.cpp src/dot_product_4.cpp src/dot_product_5.cpp src/host.cpp Access these files in the github repo by `clicking here <https://github.com/Xilinx/Vitis_Accel_Examples/tree/master/cpp_kernels/port_width_widening>`__. COMMAND LINE ARGUMENTS ---------------------- Once the environment has been configured, the application can be executed by :: ./port_width_widening <krnl_port_widen XCLBIN>