NOTE:
In this section, the traveling salesperson problem (TSP) design is implemented with 4 parallel memory lookups to fetch distances.
2021.1 Vitis™ - The Traveling Salesperson Problem - TutorialSee Vitis™ Development Environment on xilinx.com |
Load the project into Vitis HLS¶
Open a terminal and navigate to the build
directory. Launch the following command which will open the graphical interface of Vitis HLS and configure the project based on the settings included in the hls_opt.tcl Tcl file:
user@server:~$ cd ./build
user@server:~$ vitis_hls -p hls_opt.tcl &
Once the tool comes up, on the left-hand side, locate the Explorer pane, expand proj->Source
and double-click on tsp_opt.cpp
to inspect the source code.
The testbench is found below in proj->TestBench
in the file is tsp_TB.cpp
.
Review Code Changes¶
There are now 4 arrays for the distances:
uint16_t distances_0[N][N];
uint16_t distances_1[N][N];
uint16_t distances_2[N][N];
uint16_t distances_3[N][N];
The incoming distance data points are still read one at a time, but they are copied into all 4 memories:
loop_distances: for (int i = 0; i < N*N; ++i)
{
uint16_t val;
streamDistances >> val;
distances_0[i/N][i%N] = val;
distances_1[i/N][i%N] = val;
distances_2[i/N][i%N] = val;
distances_3[i/N][i%N] = val;
}
The loop_compute
main loop continuously increments by 4 and distributes the 4 values to copies of the compute
function. Each evaluates a route:
loop_compute: for( unsigned long int i_ = 0; i_ < factorialN; i_ += 4 )
{
#pragma HLS pipeline II=1
candidate0 = std::min(candidate0, compute(i_+0, distances_0));
candidate1 = std::min(candidate1, compute(i_+1, distances_1));
candidate2 = std::min(candidate2, compute(i_+2, distances_2));
candidate3 = std::min(candidate3, compute(i_+3, distances_3));
}
Final determination of the shortest distance:
// Determine shortest between the 4 candidates
shortestDistance = std::min({ candidate0, candidate1,
candidate2, candidate3 });
Running C-simulation and C-synthesis¶
First you will run C simulation to confirm the optimized design works as expected. Edit the tsp.h
file to make sure the number of cities is small (N=5
) for faster simulation run times, and Run C Simulation.
After reviewing the results of simulation, increase the number of cities again (N=13
) in the tsp.h
file for C synthesis so that you can compare results with the original design. Run C Synthesis.
The following figure shows the C synthesis report in the Vitis HLS GUI (the Performance and Resource Estimates section):
You will notice that:
The latency for the
tsp
function is now less than a secondThe loop distance trip count is unchanged as expected since the input data is same
The tripcount for the main loop (
loop_compute
) is now a fourth of factorial 12 (12!/4) thanks to the parallel execution of thecompute
functionThe new
Loop 3
originates from the final std::min call that returns the smallest of the 4 results obtained