Singular Value Decomposition for symmetric matrix (GESVDJ)
The hardware resources and performance of GESVDJ for the double and float data types are listed in Table 2 and Table 3, respectively.
Matrix Size | Unroll | URAM | BRAM | DSP | Register | LUT | Kernel time (ms) | Frequency (MHz) |
8x8 | 4 | 20 | 6 | 216 | 46245 | 39365 | 0.0711 | 250 |
512x512 | 8 | 128 | 333 | 408 | 120837 | 115121 | 2100.833 | 208.3 |
Matrix Size | Unroll | URAM | BRAM | DSP | Register | LUT | Kernel time (ms) | Frequency (MHz) |
8x8 | 4 | 20 | 4 | 114 | 23647 | 18529 | 0.0533 | 250 |
512x512 | 8 | 128 | 307 | 210 | 65569 | 65003 | 1687.274 | 208.3 |
Note
Board: Xilinx Alveo U250 Data Center Accelerator Card
The accuracy of the GESVDJ implementation has been verified against the LAPACK dgesvd (QR-based SVD) and dgesvj (Jacobi SVD) functions.
Caution
The unroll factor is limited by two factors: the matrix size and the number of URAM ports. The unroll factor should not exceed half of the matrix size (the 8x8 configurations in the tables above use an unroll factor of 4), and \(2 \times {Unroll}^{2}\) should be less than the number of URAM blocks available on the board. In addition, the unroll factor must be a power of 2.
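
The constraints above can be checked before a configuration is synthesized. The following is a minimal sketch, not part of the library: the function names `is_valid_unroll` and `max_unroll` and the parameter `available_uram` are hypothetical, the matrix-size bound is assumed to be non-strict because the 8x8 rows in the tables use an unroll factor of 4, and the URAM budget of 1280 is only an illustrative value that should be replaced with the figure for the target board.

```python
def is_valid_unroll(unroll: int, matrix_size: int, available_uram: int) -> bool:
    """Check a candidate unroll factor against the constraints in the caution.

    All names here are hypothetical helpers, not library API.
    """
    # The unroll factor must be a power of 2 (1, 2, 4, 8, ...).
    is_power_of_two = unroll > 0 and (unroll & (unroll - 1)) == 0
    # Assumption: "half of the matrix size" is treated as an inclusive bound,
    # since the published 8x8 configuration uses an unroll factor of 4.
    fits_matrix = 2 * unroll <= matrix_size
    # 2 * Unroll^2 must be less than the URAM blocks available on the board.
    fits_uram = 2 * unroll ** 2 < available_uram
    return is_power_of_two and fits_matrix and fits_uram


def max_unroll(matrix_size: int, available_uram: int) -> int:
    """Return the largest valid power-of-2 unroll factor, or 0 if none exists."""
    best = 0
    candidate = 1
    # Both bounds tighten as the unroll factor grows, so the search can stop
    # at the first power of 2 that fails.
    while is_valid_unroll(candidate, matrix_size, available_uram):
        best = candidate
        candidate *= 2
    return best


if __name__ == "__main__":
    uram_budget = 1280  # illustrative value; use the target board's URAM count
    for size in (8, 512):
        print(size, max_unroll(size, uram_budget))
```

Under these assumptions, both configurations reported in the tables (unroll 4 for 8x8 and unroll 8 for 512x512) pass the check. A larger factor may also pass for 512x512, so the tabulated values should be read as the tested configurations, not as upper limits.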