2020.1 Vitis™ - Algorithm Acceleration

See Vitis™ Development Environment on xilinx.com

In this section:
- 1 introductory reference module, in which we run the CPU version of the algorithm in ./cpu_src
- 4 Alveo U50 modules, located under the ./docs directory
- Instructions in local readme files for each module

Introduction — CPU Run: The C++ implementation of the algorithm
- Run a non-accelerated C++ version of the Cholesky algorithm
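
For orientation, a CPU-only Cholesky factorization can be sketched as below. This is a minimal illustration, not the code shipped in ./cpu_src: the function name `cholesky`, the row-major `std::vector` storage, and the positive-definiteness check are all assumptions made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the tutorial sources): factor a symmetric
// positive-definite n x n matrix A, stored row-major, as A = L * L^T.
// L is lower triangular. Returns false if A is not positive definite.
bool cholesky(const std::vector<double>& a, std::vector<double>& l, int n) {
    l.assign(static_cast<std::size_t>(n) * n, 0.0);
    for (int j = 0; j < n; ++j) {
        // Diagonal entry: subtract the squares already placed in row j.
        double diag = a[j * n + j];
        for (int k = 0; k < j; ++k) {
            diag -= l[j * n + k] * l[j * n + k];
        }
        if (diag <= 0.0) {
            return false;
        }
        l[j * n + j] = std::sqrt(diag);

        // Entries below the diagonal in column j.
        for (int i = j + 1; i < n; ++i) {
            double sum = a[i * n + j];
            for (int k = 0; k < j; ++k) {
                sum -= l[i * n + k] * l[j * n + k];
            }
            l[i * n + j] = sum / l[j * n + j];
        }
    }
    return true;
}
```

Calling `cholesky(a, l, 3)` on the 3 x 3 matrix with rows (4, 12, -16), (12, 37, -43), (-16, -43, 98) fills `l` with the rows (2, 0, 0), (6, 1, 0), (-8, 5, 3).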
Module 1: Set up the design and establish a performance baseline
- Understand the host OpenCL APIs that connect to the kernel implemented on the Xilinx device
- Verify results through emulation both at the software level (sw_emu) and the hardware level (hw_emu)
- Evaluate the performance by visualizing the timeline trace with Vitis Analyzer
- Launch Vitis HLS to review the kernel optimizations
Module 2: This version of the code explicitly applies the PIPELINE and INTERFACE directives
- Learn about these pragmas and their impact on designs
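
To give a sense of how these two directives look in source, here is a minimal sketch on a hypothetical kernel. The kernel name `scale_kernel`, its arguments, and the bundle names are assumptions for illustration and are not taken from the tutorial's Cholesky kernel. A regular C++ compiler ignores the `#pragma HLS` lines (possibly with a warning), so the function can still be checked on the CPU.

```cpp
// Hypothetical kernel, for illustration only.
extern "C" void scale_kernel(const double* in, double* out, double factor, int n) {
// INTERFACE: map the pointer arguments to AXI4 master ports for global
// memory access, and expose all arguments through an AXI4-Lite control port.
#pragma HLS INTERFACE m_axi port=in offset=slave bundle=gmem0
#pragma HLS INTERFACE m_axi port=out offset=slave bundle=gmem1
#pragma HLS INTERFACE s_axilite port=in bundle=control
#pragma HLS INTERFACE s_axilite port=out bundle=control
#pragma HLS INTERFACE s_axilite port=factor bundle=control
#pragma HLS INTERFACE s_axilite port=n bundle=control
#pragma HLS INTERFACE s_axilite port=return bundle=control

    for (int i = 0; i < n; ++i) {
// PIPELINE: ask the tool to start a new loop iteration every clock cycle
// (initiation interval of 1), if dependencies and resources allow it.
#pragma HLS PIPELINE II=1
        out[i] = in[i] * factor;
    }
}
```

Whether the requested initiation interval is actually achieved depends on the loop body; the Vitis HLS synthesis report shows the interval the tool obtained.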
Module 3: Change double data types to float
- Run hardware emulation and then Vitis Analyzer and Vitis HLS
- Measure the impact on performance and on the physical resources required to implement the design
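
One common way to make such a data type change easy to apply is to route every declaration through a single type alias. The sketch below assumes a hypothetical alias `data_t` and helper `sum_of_squares`; the tutorial sources may organize this differently.

```cpp
#include <vector>

// Hypothetical alias: change this one line to switch the whole design
// between double and float.
using data_t = float;

// Hypothetical helper written against the template parameter, so the same
// source works for either precision.
template <typename T>
T sum_of_squares(const std::vector<T>& v) {
    T acc = T(0);
    for (const T& x : v) {
        acc += x * x;
    }
    return acc;
}
```

With the alias in place, code such as `std::vector<data_t> v` and `sum_of_squares<data_t>(v)` needs no further edits when the precision changes.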
Module 4: Back to using double, the task parallelism pragma is applied to improve results
- Rearrange the code to enable the DATAFLOW task parallelism pragma
- Evaluate the performance improvement with Vitis Analyzer
- Use Vitis HLS to confirm the new micro-architecture created by dataflow
- Generate the binary (xclbin) to program the card and measure the actual performance
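
The kind of code rearrangement that DATAFLOW relies on can be sketched as follows: the work is split into separate functions that pass data through intermediate buffers, so the tool can overlap their execution. All names here (`N`, `load_stage`, `compute_stage`, `store_stage`, `dataflow_kernel`) are hypothetical, and the squaring step is only a stand-in for the real computation. A regular C++ compiler ignores the pragma and runs the stages one after another.

```cpp
// Hypothetical fixed problem size for this sketch.
constexpr int N = 16;

// Stage 1: copy input from global memory into a local buffer.
static void load_stage(const double* in, double buf[N]) {
    for (int i = 0; i < N; ++i) {
        buf[i] = in[i];
    }
}

// Stage 2: placeholder computation (squares each element).
static void compute_stage(const double in[N], double out[N]) {
    for (int i = 0; i < N; ++i) {
        out[i] = in[i] * in[i];
    }
}

// Stage 3: write results back to global memory.
static void store_stage(const double buf[N], double* out) {
    for (int i = 0; i < N; ++i) {
        out[i] = buf[i];
    }
}

extern "C" void dataflow_kernel(const double* in, double* out) {
// DATAFLOW: let the three stages below run as concurrent tasks, each
// buffer having exactly one producer and one consumer.
#pragma HLS DATAFLOW
    double a[N];
    double b[N];
    load_stage(in, a);
    compute_stage(a, b);
    store_stage(b, out);
}
```

Each intermediate buffer is written by one stage and read by the next, which is the producer-consumer shape the dataflow optimization expects.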