GeoSpatial K-nearest Neighbors¶
Overview¶
K Nearest Neighbors(KNN) query is one of the most useful geospatial operations, which targets to solve problems like “What are the 10 nearest taxi trip pickup points of New York Time Square?”. The underlying algorithm is straightforward, including compute the distances between query location and spatial objects, sorting objects by the distances and return the top-k. One typical input format of geospatial data is CSV, thus a csv parser to extract spatial object is necessary in the design.
Kernel Implemention¶
The overall diagram of KNN kernel is shown in the figure below:
Building Blocks:
- CSV Parser(DataAnalytic Library L1 Primitive) CSV Parser: parse input csv file according to schema configurations, output spatial object coordinate (x, y) and index.
- Distance: compute distance between base location and spatial object location; Euclidean distance is applied.
- Sort Top-K(Graph Library L1 Primitive) : sort distance in ascending order and keep top-k objects.
End2End Performance¶
Result Prerequisite:
- Dataset: TLC Trip Record Data 2016-01 yellow taxi trip record. This dataset has 10,906,858 trip records, and column pickup_longtitude and pickup_latitude are used as spatial object inputs.
- Xilinx Device: U.2.
- Geopandas is run on server with Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz and PyGeos package applied.
- E2E Execution time includes read csv file to memory, compute distance, and sort distance.
- Top K = 5.
- Number of CSV Parser PU = 2 or 4: considering the fact that processing ability of single CSV Parser PU is maximum one byte per cycle, which is the bottleneck of design, 2 or 4 CSV Parser PUs are instantiated to enhance the performance.
E2E Execution Time(s) | |
---|---|
KNN(2 CSV Parser) | 4.301 |
KNN(4 CSV Parser) | 2.448 |
Geopandas(1 thread) | 36.097 |
Resource Utilization¶
Device U.2 | LUT | LUTAsMem | REG | BRAM | URAM | DSP |
---|---|---|---|---|---|---|
KNN(2 CSV Parser) | 25458 | 2350 | 27262 | 25 | 0 | 110 |
6.58% | 1.55% | 3.21% | 3.56% | 0 | 5.62% | |
KNN(4 CSV Parser) | 43458 | 3606 | 44255 | 43 | 0 | 208 |
11.2% | 2.37% | 5.21% | 6.12% | 0 | 10.6% |