Kernel Templates in xf::data_analytics::clustering
kMeansTrain
#include "xf_data_analytics/clustering/kmeansTrain.hpp"
template < typename DT, int Dim, int Kcluster, int KU, int DV = 128 / KU > void kMeansTrain ( ap_uint <512>* data, ap_uint <512>* kcenters )
k-means is a clustering algorithm that partitions n samples into k clusters such that each sample belongs to the cluster with the nearest mean. The implementation is based on "naive k-means" (also referred to as Lloyd's algorithm). It aims to reduce the computational complexity from O(Nsample * Kcluster * Dim * maxIter) to O(Nsample * (Kcluster/KU) * (Dim/DV) * maxIter) by parallelizing the distance calculation. Although in theory the speedup grows with KU * DV, KU and DV must be chosen carefully because both affect how the centers are stored on chip.
The input data contains:
1) dynamic configuration in data[0], including the number of samples, the number of dimensions, the number of clusters, the maximum number of iterations, and the distance threshold used to determine whether the iteration has converged;
2) initial centers, which are provided by the host and packed into 512-bit packages;
3) samples used for training, which are packed in the same way.
kcenters is used only to output the best centers.
Parameters:
DT | data type, supporting float and double |
Dim | the maximum number of dimensions; the dynamic number of dimensions must not exceed this maximum. |
Kcluster | the maximum number of clusters; the dynamic number of clusters must not exceed this maximum. |
KU | unroll factor of Kcluster; KU centers take part in the distance calculation concurrently for one sample. After at most Kcluster/KU+1 passes, the minimum distance between the sample and the Kcluster centers is output. |
DV | unroll factor of Dim; DV elements of a center take part in the distance calculation concurrently for one sample. |
data | input data from host |
kcenters | the output best centers |
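The effect of KU and DV on the distance calculation can be illustrated with a small software model. This is a sketch only, not the kernel's implementation: `Assignment` and `nearestCenter` are hypothetical names, the squared Euclidean distance is an assumption, and the two innermost loops stand in for work that hardware would perform in parallel. The `steps` counter shows that one sample needs ceil(K/KU) * ceil(Dim/DV) blocked steps.

```cpp
#include <limits>
#include <vector>

// Result of assigning one sample to its nearest center (hypothetical type).
struct Assignment {
    int index;        // index of the nearest center
    double distance;  // squared Euclidean distance to that center (assumed metric)
    int steps;        // number of (center group, dimension chunk) steps executed
};

// Software model: centers are visited in groups of KU and dimensions in
// chunks of DV. The two innermost loops model the lanes that an unrolled
// hardware design would evaluate concurrently.
template <typename DT, int KU, int DV>
Assignment nearestCenter(const std::vector<DT>& sample,
                         const std::vector<std::vector<DT> >& centers) {
    const int dim = static_cast<int>(sample.size());
    const int k = static_cast<int>(centers.size());
    Assignment best = {-1, std::numeric_limits<double>::max(), 0};
    for (int kb = 0; kb < k; kb += KU) {        // ceil(k / KU) center groups
        double acc[KU] = {};                    // one partial distance per lane
        for (int db = 0; db < dim; db += DV) {  // ceil(dim / DV) dimension chunks
            ++best.steps;
            for (int u = 0; u < KU && kb + u < k; ++u) {
                for (int v = 0; v < DV && db + v < dim; ++v) {
                    const double diff = sample[db + v] - centers[kb + u][db + v];
                    acc[u] += diff * diff;
                }
            }
        }
        // Reduce the KU partial results of this group into the running minimum.
        for (int u = 0; u < KU && kb + u < k; ++u) {
            if (acc[u] < best.distance) {
                best.distance = acc[u];
                best.index = kb + u;
            }
        }
    }
    return best;
}
```

With 3 centers, 3 dimensions, KU = 2 and DV = 2, the model takes 2 * 2 = 4 steps instead of 3 * 3 = 9 scalar iterations. Larger KU and DV reduce the step count further but require more center elements to be readable at once, which is the on-chip storage trade-off mentioned above.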
Kernel Templates in xf::data_analytics::regression
linearLeastSquareRegressionSGDTrain
#include "xf_data_analytics/regression/linearRegressionTrain.hpp"
template < int WAxi, int D, int Depth, int BurstLen > void linearLeastSquareRegressionSGDTrain ( ap_uint <WAxi>* input, ap_uint <WAxi>* output )
Linear least-squares regression training using the SGD framework.
Parameters:
WAxi | AXI interface width to load training data. |
D | Number of features processed each cycle. |
Depth | Depth * D is the maximum number of features supported. |
BurstLen | Length of burst read. |
input | training configs and training data |
output | training result of weight and intercept |
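As a software reference for what SGD training of a linear least-squares model computes, the following is a minimal sketch. It is not the kernel's implementation: `LinearModel`, `predict`, `sgdStep`, `trainLinearSGD`, the constant step size and the fixed epoch count are all assumptions made for illustration. The result is a weight vector and an intercept, matching the kind of output listed above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model type: one weight per feature plus an intercept.
struct LinearModel {
    std::vector<double> weights;
    double intercept;
};

double predict(const LinearModel& m, const std::vector<double>& x) {
    double y = m.intercept;
    for (std::size_t j = 0; j < x.size(); ++j) y += m.weights[j] * x[j];
    return y;
}

// One SGD step on the per-sample loss 0.5 * (predict(x) - y)^2.
void sgdStep(LinearModel& m, const std::vector<double>& x, double y, double stepSize) {
    const double err = predict(m, x) - y;  // gradient of the loss w.r.t. the prediction
    for (std::size_t j = 0; j < x.size(); ++j) m.weights[j] -= stepSize * err * x[j];
    m.intercept -= stepSize * err;
}

// Visits the samples in order for a fixed number of epochs (assumed schedule).
LinearModel trainLinearSGD(const std::vector<std::vector<double> >& samples,
                           const std::vector<double>& targets,
                           double stepSize, int epochs) {
    LinearModel m;
    m.intercept = 0.0;
    if (samples.empty()) return m;
    m.weights.assign(samples[0].size(), 0.0);
    for (int e = 0; e < epochs; ++e)
        for (std::size_t i = 0; i < samples.size(); ++i)
            sgdStep(m, samples[i], targets[i], stepSize);
    return m;
}
```

For noiseless data generated by y = 2x + 1, a small constant step size drives the weight towards 2 and the intercept towards 1.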
ridgeRegressionSGDTrain
#include "xf_data_analytics/regression/linearRegressionTrain.hpp"
template < int WAxi, int D, int Depth, int BurstLen > void ridgeRegressionSGDTrain ( ap_uint <WAxi>* input, ap_uint <WAxi>* output )
Ridge regression training using the SGD framework.
Parameters:
WAxi | AXI interface width to load training data. |
D | Number of features processed each cycle. |
Depth | Depth * D is the maximum number of features supported. |
BurstLen | Length of burst read. |
input | training configs and training data |
output | training result of weight and intercept |
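Ridge regression adds an L2 penalty on the weights to the least-squares loss. A minimal sketch of a single SGD step under that penalty follows. `ridgeSgdStep` and its parameters are hypothetical names, and leaving the intercept unpenalized is a common convention assumed here, not something the kernel documentation states.

```cpp
#include <cstddef>
#include <vector>

// One SGD step on 0.5 * (w.x + b - y)^2 + 0.5 * lambda * ||w||^2.
// Assumption: the intercept b is not regularized.
void ridgeSgdStep(std::vector<double>& weights, double& intercept,
                  const std::vector<double>& x, double y,
                  double stepSize, double lambda) {
    double pred = intercept;
    for (std::size_t j = 0; j < x.size(); ++j) pred += weights[j] * x[j];
    const double err = pred - y;
    for (std::size_t j = 0; j < x.size(); ++j) {
        // Data-fit gradient plus the L2 shrinkage term lambda * w_j.
        weights[j] -= stepSize * (err * x[j] + lambda * weights[j]);
    }
    intercept -= stepSize * err;
}
```

Compared with the unregularized update, the extra `lambda * weights[j]` term pulls every weight towards zero on each step.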
LASSORegressionSGDTrain
#include "xf_data_analytics/regression/linearRegressionTrain.hpp"
template < int WAxi, int D, int Depth, int BurstLen > void LASSORegressionSGDTrain ( ap_uint <WAxi>* input, ap_uint <WAxi>* output )
LASSO regression training using the SGD framework.
Parameters:
WAxi | AXI interface width to load training data. |
D | Number of features processed each cycle. |
Depth | Depth * D is the maximum number of features supported. |
BurstLen | Length of burst read. |
input | training configs and training data |
output | training result of weight and intercept |
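LASSO replaces the L2 penalty with an L1 penalty, which is not differentiable at zero. One common way to handle this in SGD is a proximal update with soft-thresholding, sketched below. Whether the kernel uses this or another scheme (for example a subgradient step) is not stated in this reference, so treat the choice as an assumption. `softThreshold` and `lassoSgdStep` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Shrinks v towards zero by t and clamps to zero inside [-t, t].
double softThreshold(double v, double t) {
    if (v > t) return v - t;
    if (v < -t) return v + t;
    return 0.0;
}

// One proximal SGD step on 0.5 * (w.x + b - y)^2 + lambda * ||w||_1.
// Assumption: the intercept b is not regularized.
void lassoSgdStep(std::vector<double>& weights, double& intercept,
                  const std::vector<double>& x, double y,
                  double stepSize, double lambda) {
    double pred = intercept;
    for (std::size_t j = 0; j < x.size(); ++j) pred += weights[j] * x[j];
    const double err = pred - y;
    for (std::size_t j = 0; j < x.size(); ++j) {
        // Gradient step on the data-fit term, then the L1 proximal operator.
        const double stepped = weights[j] - stepSize * err * x[j];
        weights[j] = softThreshold(stepped, stepSize * lambda);
    }
    intercept -= stepSize * err;
}
```

The thresholding sets small weights exactly to zero, which is what gives LASSO its sparse solutions.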
Kernel Templates in xf::data_analytics::text
reEngine
#include "xf_data_analytics/text/re_engine.hpp"
template < int PU_NM, int INSTR_DEPTH, int CCLASS_NM, int CPGP_NM, int MSG_LEN, int STACK_SIZE > void reEngine ( ap_uint <64>* cfg_in_buff, ap_uint <64>* msg_in_buff, ap_uint <16>* len_in_buff, ap_uint <32>* out_buff )
The reEngine matches the input messages against a configured regular expression (RE) pattern. The pattern is pre-compiled into a list of instructions and provided by the user through cfg_in_buff. The reEngine, which is based on the hardware regex-VM, is therefore dynamically configurable. Increasing the template parameter PU_NM improves throughput by running more matching processes in parallel, at the cost of more on-board resources.
Parameters:
PU_NM | Number of processing units in parallel. |
INSTR_DEPTH | Depth of the instruction buffer, in 64-bit words. |
CCLASS_NM | Maximum supported number of character classes in the regular expression pattern. |
CPGP_NM | Maximum supported number of capturing groups in the regular expression pattern. |
MSG_LEN | Maximum supported length of each message, in units of 8 bytes. |
STACK_SIZE | Maximum size of the internal stack buffer in the regex-VM. |
cfg_in_buff | Input configuration, which provides the list of instructions, the number of instructions, the number of character classes, the number of capturing groups, and the bit set map. |
msg_in_buff | Input messages to be matched by the regular expression. |
len_in_buff | Input length of each message. |
out_buff | Output match results. |
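Because msg_in_buff is a 64-bit buffer and MSG_LEN is expressed in units of 8 bytes, a host program has to pack each message into 64-bit words. The sketch below shows one way to do that in plain C++. The byte order within a word (least significant byte first), the zero padding, and the helper names `wordsFor`, `packMessage` and `fitsMessageBuffer` are assumptions for illustration. The exact buffer layout expected by the kernel, including whether len_in_buff holds byte counts or word counts, should be taken from the library's host example rather than from this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Number of 64-bit words needed to hold `bytes` bytes.
std::size_t wordsFor(std::size_t bytes) {
    return (bytes + 7) / 8;
}

// Packs a message into 64-bit words, least significant byte first
// (assumed layout); unused bytes of the last word stay zero.
std::vector<std::uint64_t> packMessage(const std::string& msg) {
    std::vector<std::uint64_t> words(wordsFor(msg.size()), 0);
    for (std::size_t i = 0; i < msg.size(); ++i) {
        const std::uint64_t byte = static_cast<unsigned char>(msg[i]);
        words[i / 8] |= byte << (8 * (i % 8));
    }
    return words;
}

// True if the packed message fits a buffer of MSG_LEN 8-byte units.
template <int MSG_LEN>
bool fitsMessageBuffer(const std::string& msg) {
    return wordsFor(msg.size()) <= static_cast<std::size_t>(MSG_LEN);
}
```

A 9-byte message occupies two words, so it fits a kernel built with MSG_LEN = 2 but not one built with MSG_LEN = 1.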