Internals of kMeansTrain
This document describes the structure and execution of kMeansTrain, implemented as the kMeansTrain function.
kMeansTrain fits new centers with native k-means, using the existing samples and the initial centers provided by the user. To accelerate training, DV elements of a sample are input at the same time and used both for computing distances to KU centers and for updating centers. Static configuration is set by template parameters and dynamic configuration by arguments of the API; the dynamic values should not be greater than the static ones.
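The update described above can be illustrated in plain C++. This is only a software sketch of one native k-means iteration, not the HLS implementation: the function name `kmeansStep` and its signature are hypothetical, and the DV/KU template parameters merely mimic how the sample dimensions and the centers are walked in blocks. In hardware those inner blocks would be processed in parallel; here they run sequentially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the library): one k-means iteration.
// DV and KU are static block sizes, standing in for the template parameters
// mentioned in the text. dim and k are the dynamic sizes taken from the data.
template <typename T, int DV, int KU>
std::vector<std::vector<T>> kmeansStep(const std::vector<std::vector<T>>& samples,
                                       const std::vector<std::vector<T>>& centers) {
    assert(!centers.empty());
    const std::size_t k = centers.size();
    const std::size_t dim = centers[0].size();

    std::vector<std::vector<T>> sums(k, std::vector<T>(dim, T(0)));
    std::vector<std::size_t> counts(k, 0);

    for (const auto& s : samples) {
        assert(s.size() == dim);
        std::vector<T> dist(k, T(0));
        // Consume the sample DV elements at a time, against KU centers at a time.
        for (std::size_t d0 = 0; d0 < dim; d0 += DV) {
            for (std::size_t c0 = 0; c0 < k; c0 += KU) {
                for (std::size_t c = c0; c < c0 + KU && c < k; ++c) {
                    for (std::size_t d = d0; d < d0 + DV && d < dim; ++d) {
                        const T diff = s[d] - centers[c][d];
                        dist[c] += diff * diff;  // squared Euclidean distance
                    }
                }
            }
        }
        // Assign the sample to its nearest center and accumulate it.
        std::size_t best = 0;
        for (std::size_t c = 1; c < k; ++c) {
            if (dist[c] < dist[best]) best = c;
        }
        ++counts[best];
        for (std::size_t d = 0; d < dim; ++d) sums[best][d] += s[d];
    }

    // New center = mean of assigned samples; empty clusters keep the old center.
    std::vector<std::vector<T>> next = centers;
    for (std::size_t c = 0; c < k; ++c) {
        if (counts[c] == 0) continue;
        for (std::size_t d = 0; d < dim; ++d) {
            next[c][d] = sums[c][d] / static_cast<T>(counts[c]);
        }
    }
    return next;
}
```

Calling `kmeansStep<double, 8, 16>(samples, centers)` repeatedly, feeding each result back in as `centers`, gives the usual iterative training loop. Keeping empty clusters at their old center is a choice made for this sketch; the source does not say how the library handles that case.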
The applicable conditions are:
1. Dim*Kcluster should be less than a fixed value. For example, Dim*Kcluster<=1024*1024 for float centers stored in URAM and Dim*Kcluster<=1024*512 for double centers on U250.
2. KU and DV should be configured properly because of URAM limitations. For example, KU*DV=128 when centers are stored in URAM on U250.
3. The dynamic configuration should be close to the static one in order to avoid useless computation inside.
Caution
Observe the applicable conditions listed above.
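The conditions above can be turned into a small pre-flight check. The sketch below is an assumption-laden illustration: the function names and constants are hypothetical and not part of the library API, and the limits are simply the U250 examples quoted in the list (centers stored in URAM), not general rules for other devices or storage choices.

```cpp
#include <cassert>
#include <cstddef>

// Example limits taken from the U250 figures in the text (centers in URAM).
// The constant names are hypothetical.
const std::size_t kMaxCenterElemsFloat = 1024 * 1024;
const std::size_t kMaxCenterElemsDouble = 1024 * 512;
const std::size_t kRequiredKuTimesDv = 128;

// Conditions 1 and 2: the center storage bound and the KU*DV product.
inline bool meetsU250Limits(std::size_t dim, std::size_t kcluster,
                            std::size_t dv, std::size_t ku, bool isDouble) {
    const std::size_t limit = isDouble ? kMaxCenterElemsDouble : kMaxCenterElemsFloat;
    return dim * kcluster <= limit && ku * dv == kRequiredKuTimesDv;
}

// Dynamic (run-time) sizes must not exceed the static (compile-time) ones.
inline bool dynamicWithinStatic(std::size_t dynDim, std::size_t dynK,
                                std::size_t staticDim, std::size_t staticK) {
    return dynDim <= staticDim && dynK <= staticK;
}
```

For instance, `meetsU250Limits(5811, 80, 8, 16, true)` holds, since 5811*80 = 464880 is below 1024*512 and 16*8 = 128. Condition 3 is a matter of degree rather than a hard limit, so it is not encoded here.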
Benchmark
- The results below are based on:
- all data are processed as double;
- unroll factors DV=8 and KU=16;
- results are compared to Spark 2.4.4, with initial centers taken from Spark to ensure the same input;
- Spark 2.4.4 is deployed on a server with 56 processors (Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz).
Training Resources (Device: U250)
D | K | LUT | LUTAsMem | REG | BRAM | URAM | DSP |
---|---|---|---|---|---|---|---|
5811 | 80 | 295110 | 50378 | 371542 | 339 | 248 | 420 |
561 | 144 | 260716 | 26016 | 371344 | 323 | 152 | 420 |
68 | 2000 | 255119 | 24295 | 372487 | 309 | 168 | 425 |
Training Performance (Device: U250)
D | K | samples | on spark(s) | on spark(s) | 16 threads on spark(s) | 32 threads on spark(s) | 48 threads on spark(s) | execute(s) | freq(MHz) |
---|---|---|---|---|---|---|---|---|---|
5811 | 80 | 11463 | 93.489 (3.17X) | 49.857 (1.69X) | 49.860 (1.63X) | 48.001 (1.89X) | 50.875 (1.72X) | | 202 |
561 | 144 | 7352 | (5.04X) | (3.06X) | (3.06X) | (2.91X) | 6.190 (2.89X) | | 269 |
68 | 2000 | 857765 | 547.001 (3.44X) | 173.116 (1.08X) | 170.217 (1.07X) | 161.169 (1.01X) | 166.214 (1.04X) | | 239 |