# Internal Design of Strongly Connected Component¶

## Overview¶

In the mathematical theory of directed graphs, a graph is said to be strongly connected if every vertex is reachable from every other vertex. The strongly connected components of an arbitrary directed graph form a partition into subgraphs that are themselves strongly connected. (from wikipedia) The API will compute the strongly connected component (SCC) of each vertex and return a graph with the vertex value containing the lowest vertex id in the SCC containing that vertex.

## Algorithm¶

The color-based algorithm of strongly connected components is given as following:

procedure StronglyConnectedComponent(graph_CSR)
graph_CSC := csr2csc(graph_CSR)

for each vertex v in graph
result(v) := -1
color(v) := -1

while all vertexs have been labeled
FW-coloring(graph_CSR, FW-Queue, color)
BW-labeling(graph_CSR, graph_CSC, BW-Queue, color, result)
end while

return result

sub procedure FW-coloring(graph_CSR, FW-Queue, color)
rootNode := findNewRoot(FW-Queue, color)
push rootNode into FW-Queue
while FW-Queue is not empty
u := pop FW-Queue
for each edge(u, v) in graph_CSR
if color(v) == -1 then
color(v) = u
push v into FW-Queue
end if
end for
end while

sub procedure BW-labeling(graph_CSR, graph_CSC, FW-Queue, BW-Queue, color, result)
newRootNode := rootNode
result(rootNode) := rootNode
for v in this color region
if indegree(v) == 0 or outdegree(v) == 0 then
result(v) := v
end if
end for

while all vertexs in this color region have been labeled
push newRootNode into BW-Queue
while BW-Queue is not empty
u := pop BW-Queue
for each edge(u, v) in graph_CSC
if color(v) == newRootNode and result(v) == -1 then
result(v) = newRootNode
push v into BW-Queue
end if
end for
end while

if there is one more vertex which hasn't been labeled
Re-color those vertexs which aren't labeled in last BW-BFS into rootNode
newRootNode := FW-BFS(graph_CSR, BW-Queue, color)
FW-BFS(graph_CSR, BW-Queue, color)
end if
end while


Here, for color-based alogrithm, each backward label process must be started from the vertex whose color is equal to its vertex ID. In other words, the starting vertex must own the lowest vertex ID in the following SCC. As a result, we use one single FW-BFS process to find the starting vertex before each BW-label. And another FW-BFS is required to re-color using the true starting vertex if the first FW-BFS is started from one vertex with greater vertex ID.

## Interface¶

The input should be a directed graph in compressed sparse row (CSR) format. The result will return a vertex list with each vertex value containing the lowest vertex id in the SCC.

## Implemention¶

The detail algorithm implemention is shown as the figure below:

In the SCC kernel design, FW-color and BW-Label can be dataflow by sharing some common buffers with multiple independent AXI masters. As a result, these two processes will be run at the same time to obtain a better performance.

## Profiling and Benchmarks¶

The kernel is built by Vivado tools and benchmard in U250 FPGA card at 275MHz. The hardware resource utilization and benchmark performance are listed in the table below.

 Name LUT BRAM URAM DSP Platform 104112 165 0 4 scc_kernel 164311 523.5 110 6 Total 268423 (16%) 688.5 (26%) 110 (9%) 10 (0%)
 Datasets Vertex Edges Number of SCC Iteration number in Spark FPGA Time Spark (4 threads) Spark (8 threads) Spark (16 threads) Spark (32 threads) Spark Time Speed up Spark Time Speed up Spark Time Speed up Spark Time Speed up cit-Patents 3774768 16518948 3774768 6 20711 52137 2.52 60517 2.92 51390 2.48 39939 1.93 hollywood 1139905 57515616 1139905 6 9780 75681 7.74 45935 4.70 39595 4.05 29665 3.03 soc-LiveJournal1 4847571 68993773 971232 6 39952 424444 10.62 304755 7.63 244916 6.13 231465 5.79 ljournal-2008 5363260 79023142 1119171 16 34840 540199 15.51 458633 13.16 378304 10.86 402120 11.54 GEOMEAN 23043 173431 7.53X 140397 6.09X 117178 5.09X 102476 4.45X

Note

1. Spark running on platform with Intel(R) Xeon(R) CPU E5-2690 v4 @2.600GHz, 56 Threads (2 Sockets, 14 Core(s) per socket, 2 Thread(s) per core)
2. Time unit: ms.