Internal Design of PageRank
PageRank (PR) is an algorithm used by Google Search to rank web pages in its search engine results. It measures the importance of website pages by counting the number and quality of links to a page, producing a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites. PageRank is no longer the only algorithm Google uses to order search results, but it was the first algorithm the company used, and it remains the best known.
PageRank algorithm implementation:
\(A, B, C, D, \ldots\) are different vertices. \(PR\) is the PageRank value of a vertex, \(Out\) is the out-degree of a vertex, and \(\alpha\) is the damping factor, normally equal to 0.85.
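Written out from the pseudocode below (an unnormalized reading of it; the placement of \(\alpha\) follows the pseudocode as given), the per-vertex update applied in each iteration is:

\[
PR_{new}(u) = \alpha + \sum_{v \rightarrow u} \frac{1 - \alpha}{Out(v)} \, PR_{old}(v)
\]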
The algorithm's pseudocode is as follows:

```
for each edge (u, v) in graph              // calculate the degree
    degree(v) += 1
for each node u in graph                   // initiate DDR
    const(u) := (1 - alpha) / degree(u)
    PR_old(u) := 1 / total_vertex_number
while norm(PR_old - PR_new) > tolerance    // iterative add
    for each vertex u in graph
        PR_new(u) := alpha
        for each vertex v pointing to u
            PR_new(u) += const(v) * PR_old(v)
return PR_new
```
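As a sanity check, the pseudocode can be sketched as plain host-side C++. This is a reference model only, not the FPGA kernel: the function name and the edge-list input are illustrative, and `cnst` is built from out-degrees to match the definition of \(Out\) above.

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Host-side reference model of the pseudocode above (not the FPGA kernel).
// edges holds directed (src, dst) pairs; names here are illustrative.
std::vector<double> pagerank(int n,
                             const std::vector<std::pair<int, int>>& edges,
                             double alpha = 0.85, double tolerance = 1e-6) {
    // "calculate degree": out-degree of every vertex
    std::vector<int> degree(n, 0);
    for (const auto& e : edges) degree[e.first] += 1;

    // "initiation": constant buffer and initial PR values
    std::vector<double> cnst(n), prOld(n, 1.0 / n), prNew(n, 0.0);
    for (int u = 0; u < n; ++u)
        cnst[u] = degree[u] ? (1.0 - alpha) / degree[u] : 0.0;

    // In-neighbor lists so "for each vertex v pointing to u" is direct.
    std::vector<std::vector<int>> in(n);
    for (const auto& e : edges) in[e.second].push_back(e.first);

    // Iterate until the L1 norm of the update falls below tolerance.
    double norm = tolerance + 1.0;
    while (norm > tolerance) {
        for (int u = 0; u < n; ++u) {
            prNew[u] = alpha;
            for (int v : in[u]) prNew[u] += cnst[v] * prOld[v];
        }
        norm = 0.0;
        for (int u = 0; u < n; ++u) norm += std::fabs(prNew[u] - prOld[u]);
        prOld = prNew;
    }
    return prNew;
}
```

On a symmetric graph such as a directed cycle, every vertex converges to the same value, which makes the model easy to spot-check.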
The input matrix should ensure that the following conditions hold:
- Directed graph
- No self edges
- No duplicate edges
- Compressed sparse column (CSC) format
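For reference, this is what the CSC layout looks like for a small directed graph satisfying the conditions above. The buffer names are illustrative; in CSC, column `j` lists the source vertices of the edges entering vertex `j`.

```cpp
#include <vector>

// Directed graph with 4 vertices and edges (src -> dst):
//   0->1, 0->2, 1->2, 2->3, 3->0   (no self edges, no duplicates)
// CSC: offsets[j]..offsets[j+1] spans column j of the adjacency matrix;
// indices holds the source vertex of each incoming edge.
std::vector<int> offsets = {0, 1, 2, 4, 5};
std::vector<int> indices = {3, 0, 0, 1, 2};
```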
Note that this is not the “normalized” PageRank:
- The results match those of some third-party graph databases, e.g., TigerGraph.
- The results match those of some third-party graph frameworks after normalization of order 1 (L1), e.g., Spark.
- In the current version, the weighted PageRank algorithm is implemented by default.
- For an unweighted input graph, the user still needs to initialize the weight buffer manually for the kernel to work correctly, as shown in the code under ./tests/host.
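For an unweighted graph, that manual initialization amounts to filling the weight buffer with a uniform value. A minimal sketch follows; the helper name and the choice of 1.0 are assumptions, so see ./tests/host for the actual host code.

```cpp
#include <cstddef>
#include <vector>

// One weight per edge; with a uniform weight the weighted kernel
// reduces to the unweighted algorithm. Hypothetical helper.
std::vector<float> makeUniformWeights(std::size_t numEdges) {
    return std::vector<float>(numEdges, 1.0f);
}
```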
The algorithm implementation is shown in the figures below:
Figure 1 : PageRank calculate degree architecture on FPGA
Figure 2 : PageRank initiation module architecture on FPGA
Figure 3 : PageRank Adder architecture on FPGA
Figure 4 : PageRank calConvergence architecture on FPGA
As shown in the figures:
- Module calculate degree: computes each vertex's out-degree and stores the result in a DDR buffer.
- Module initiation: initializes the PR DDR buffers and the constant-value buffer.
- Module Adder: performs the sparse matrix-vector multiplication.
- Module calConvergence: computes the convergence of the PageRank iteration.
The hardware resource utilizations are listed in the following tables. Different tool versions may result in slightly different resource utilization.
Table 1 : Hardware resources for PageRank with small cache
Table 2 : Hardware resources for PageRank with cache
As the cache depth increases, the acceleration ratio improves noticeably, but the heavy use of URAM causes the achievable frequency to drop. The advised cache depth is therefore 32K for one SLR of an Alveo U250.
Table 3 : Comparison between CPU SPARK and FPGA VITIS_GRAPH
|Datasets|Vertices|Edges|FPGA time (cache 1)|FPGA time (cache 32K)|Spark (4 threads)|Spark (8 threads)|Spark (16 threads)|Spark (32 threads)|
||||||Spark time, speedup (cache 1), speedup (cache 32K)|Spark time, speedup (cache 1), speedup (cache 32K)|Spark time, speedup (cache 1), speedup (cache 32K)|Spark time, speedup (cache 1), speedup (cache 32K)|