API Functions of xf::data_analytics

class linearLeastSquareRegressionPredict
class LASSORegressionPredict
class ridgeRegressionPredict
class logisticRegressionPredict
class SGDFramework


#include "xf_DataAnalytics/classification/decision_tree_predict.hpp"
template <
    typename MType,
    unsigned int WD,
    unsigned int MAX_FEA_NUM,
    unsigned int MAX_TREE_DEPTH = 20,
    unsigned MAX_CAT_BITS = 8
    >
void decisionTreePredict (
    hls::stream <ap_uint <WD>> dstrm_batch [MAX_FEA_NUM],
    hls::stream <bool>& estrm_batch,
    hls::stream <ap_uint <512>>& treeStrm,
    hls::stream <bool>& treeTag,
    hls::stream <ap_uint <MAX_CAT_BITS>>& predictionsStrm,
    hls::stream <bool>& predictionsTag
    )

decisionTreePredict is the top function of Decision Tree Predict.

This function first loads the decision tree from treeStrm (the corresponding function is getTree). It then reads samples one by one from dstrm_batch and writes the category id of each sample to predictionsStrm.

Note that treeStrm is a 512-bit stream and each 512-bit word holds two nodes. In each word, range(0,71) is node[i].nodeInfo and range(256,327) is node[i+1].nodeInfo; range(192,255) is node[i].threshold and range(448,511) is node[i+1].threshold. For details of the Node struct, refer to “decision_tree.hpp”. Samples in the input sample stream should be converted from MType into ap_uint<WD>.
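
The bit ranges above can be illustrated with a small host-side sketch in plain C++. It models one 512-bit word as eight 64-bit lanes instead of ap_uint<512>, so it compiles without the HLS headers. HostNode, toBits and packTwoNodes are hypothetical names introduced here for illustration only; the real Node struct is defined in “decision_tree.hpp” and its nodeInfo encoding is not reproduced.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// One 512-bit tree word modeled as eight 64-bit lanes; lane k holds bits [64k, 64k+63].
using TreeWord = std::array<std::uint64_t, 8>;

// Hypothetical host-side node: 72 bits of nodeInfo plus a 64-bit threshold.
struct HostNode {
    std::uint64_t infoLow;  // nodeInfo bits 0..63
    std::uint8_t infoHigh;  // nodeInfo bits 64..71
    double threshold;
};

// Reinterpret a double as its raw 64-bit pattern, i.e. the MType -> ap_uint<WD>
// conversion for MType = double and WD = 64.
inline std::uint64_t toBits(double v) {
    std::uint64_t bits;
    std::memcpy(&bits, &v, sizeof(bits));
    return bits;
}

// Pack node[i] into bits 0..255 and node[i+1] into bits 256..511.
inline TreeWord packTwoNodes(const HostNode& a, const HostNode& b) {
    TreeWord w{};                 // all lanes start at zero
    w[0] = a.infoLow;             // range(0,63)
    w[1] = a.infoHigh;            // range(64,71)
    w[3] = toBits(a.threshold);   // range(192,255)
    w[4] = b.infoLow;             // range(256,319)
    w[5] = b.infoHigh;            // range(320,327)
    w[7] = toBits(b.threshold);   // range(448,511)
    return w;
}
```

The same toBits helper shows how a double sample would be reinterpreted bit for bit before being pushed into dstrm_batch; a numeric cast would change the bit pattern and is not what the conversion means here.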


MType The data type of sample
WD The width of data type MType in bits, i.e. 8 * sizeof(MType)
MAX_FEA_NUM The max feature num function can support
MAX_TREE_DEPTH The max tree depth function can support
MAX_CAT_BITS The category max bit number
dstrm_batch Input data streams of ap_uint<WD>
estrm_batch End flag stream for input data
treeStrm Decision tree streams
treeTag End flag stream for decision tree nodes
predictionsStrm Output data streams
predictionsTag End flag stream for output


#include "xf_DataAnalytics/classification/decision_tree_train.hpp"
template <
    int _BurstLen = 32,
    int _WAxi = 512,
    int _WData = 64
    >
void axiVarColToStreams (
    ap_uint <_WAxi>* ddr,
    const ap_uint <32> offset,
    const ap_uint <32> rows,
    ap_uint <32> cols,
    hls::stream <ap_uint <_WData>> dataStrm [_WAxi/_WData],
    hls::stream <bool>& eDataStrm
    )

Loads a table from an AXI master port into streams. The table must use row-based storage with an identical data width for every element.


_BurstLen burst length of AXI buffer, default is 32.
_WAxi width of AXI port, must be a multiple of the data width, default is 512.
_WData data width, default is 64.
ddr input AXI port
offset offset (in _WAxi bits) to load table.
rows Row number of table
cols Column number of table
dataStrm Output streams of _WAxi/_WData channels
eDataStrm end flag of output stream.
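
As a rough sizing aid, the relation between _WAxi, _WData and the number of output channels can be written as compile-time helpers. This is only a sketch: numChannels and axiWordsForTable are hypothetical names, and the word count assumes the table elements are packed densely with no per-row padding, which the text above does not state.

```cpp
#include <cstdint>

// Number of output channels: one per _WData-bit element inside a _WAxi-bit word.
template <int WAxi = 512, int WData = 64>
constexpr int numChannels() {
    static_assert(WAxi % WData == 0, "_WAxi must be a multiple of _WData");
    return WAxi / WData;
}

// AXI words occupied by a rows x cols table.
// Assumption: elements are stored back to back without padding between rows.
template <int WAxi = 512, int WData = 64>
constexpr std::uint32_t axiWordsForTable(std::uint32_t rows, std::uint32_t cols) {
    const std::uint64_t elems = static_cast<std::uint64_t>(rows) * cols;
    const std::uint64_t lanes = numChannels<WAxi, WData>();
    return static_cast<std::uint32_t>((elems + lanes - 1) / lanes);
}
```

With the default template arguments this gives 8 channels, which matches the dataStrm [_WAxi/_WData] array size in the signature.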


#include "xf_DataAnalytics/classification/naive_bayes.hpp"
template <
    int DT_WIDTH = 32,
    int WL = 3,
    typename DT = unsigned int
    >
void naiveBayesTrain (
    const int num_of_class,
    const int num_of_term,
    hls::stream <ap_uint <64>> i_data_strm [1<< WL],
    hls::stream <bool> i_e_strm [1<< WL],
    hls::stream <int>& o_terms_strm,
    hls::stream <ap_uint <64>> o_data0_strm [1<< WL],
    hls::stream <ap_uint <64>> o_data1_strm [1<< WL]
    )

naiveBayesTrain is the top function of multinomial Naive Bayes training.

This function first loads the training dataset from i_data_strm, then counts the frequency of each hit term. After scanning all samples, the likelihood probability matrix and the prior probabilities are output on two independent streams.


DT_WIDTH the width of type DT, in bits
WL the width of bit to enable dispatcher, only 3 is supported so far
DT the data type of internal counter for terms, can be 32/64-bit integer, float or double
num_of_class the number of classes in the sample dataset, should be exactly the same as in the real dataset
num_of_term the number of terms, must be larger than the number of features, and num_of_class * num_of_term <= (1 << (20-WL)) must be satisfied.
i_data_strm input data stream of ap_uint<64> in multiple channels
i_e_strm end flag stream for each input data channel
o_terms_strm the output number of statistic features
o_data0_strm the output likelihood matrix
o_data1_strm the output prior probability vector
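
The capacity constraint on num_of_class and num_of_term can be checked on the host before launching the kernel. The helper below is a minimal sketch; naiveBayesModelFits is a hypothetical name and not part of the library.

```cpp
// Capacity check taken from the parameter description:
// num_of_class * num_of_term <= (1 << (20 - WL)).
template <int WL = 3>
constexpr bool naiveBayesModelFits(int num_of_class, int num_of_term) {
    // Widen before multiplying so large inputs cannot overflow int.
    return static_cast<long long>(num_of_class) * num_of_term <= (1LL << (20 - WL));
}
```

With WL = 3 the bound is 1 << 17 = 131072 entries, so 10 classes with 1000 terms fit while 200 classes with 1000 terms do not.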


#include "xf_DataAnalytics/classification/naive_bayes.hpp"
template <
    int CH_NM,
    int GRP_NM
    >
void naiveBayesPredict (
    const int num_of_class,
    const int num_of_term,
    hls::stream <ap_uint <64>>& i_theta_strm,
    hls::stream <ap_uint <64>>& i_prior_strm,
    hls::stream <ap_uint <32>> i_data_strm [CH_NM],
    hls::stream <bool>& i_e_strm,
    hls::stream <ap_uint <10>>& o_class_strm,
    hls::stream <bool>& o_e_strm
    )

naiveBayesPredict is the top function of multinomial Naive Bayes prediction.

The function first loads the trained model into on-chip memory, then calculates the classification result for each sample using the argmax function.


CH_NM the number of channels for input sample data, should be a power of 2
GRP_NM the unroll factor for handling the classes simultaneously, must be a power of 2 in 1~256
num_of_class the number of classes, should be exactly the same as in the input dataset
num_of_term the number of features, should be exactly the same as in the input dataset
i_theta_strm the input likelihood probability stream, [num_of_class][num_of_term]
i_prior_strm the input prior probability stream, [num_of_class]
i_data_strm the input of test data stream
i_e_strm end flag stream for i_data_strm
o_class_strm the prediction result for each input sample
o_e_strm end flag stream for o_class_strm
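
For checking kernel output on the host, the argmax step can be written as a plain C++ reference model. This sketch assumes that theta and prior hold log-probabilities and that each sample is a vector of term counts; neither the encoding of the 64-bit stream words nor the use of logarithms is confirmed by the text above, and naiveBayesArgmax is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// theta[c][t]: assumed log-likelihood of term t in class c
// prior[c]:    assumed log prior of class c
// sample[t]:   count of term t in the sample being classified
inline int naiveBayesArgmax(const std::vector<std::vector<double>>& theta,
                            const std::vector<double>& prior,
                            const std::vector<unsigned>& sample) {
    int best = 0;
    double bestScore = 0.0;
    for (std::size_t c = 0; c < prior.size(); ++c) {
        // Multinomial score: log prior plus count-weighted log-likelihoods.
        double score = prior[c];
        for (std::size_t t = 0; t < sample.size(); ++t) {
            score += sample[t] * theta[c][t];
        }
        if (c == 0 || score > bestScore) {
            bestScore = score;
            best = static_cast<int>(c);
        }
    }
    return best;
}
```

Ties are resolved in favour of the lowest class index here; how the hardware breaks ties is not specified in this section.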


#include "xf_DataAnalytics/classification/svm_predict.hpp"
template <
    typename MType,
    unsigned WD,
    unsigned StreamN,
    unsigned SampleDepth
    >
void svmPredict (
    const int cols,
    hls::stream <MType> sample_strm [StreamN],
    hls::stream <bool>& e_sample_strm,
    hls::stream <ap_uint <512>>& weight_strm,
    hls::stream <bool>& eTag,
    hls::stream <ap_uint <1>>& predictionsStrm,
    hls::stream <bool>& predictionsTag
    )

svmPredict is the top function of SVM Predict.

This function first loads the weights from weight_strm (the corresponding function is getWeight). It then reads samples from sample_strm and writes the classification id of each sample to predictionsStrm.


MType The data type of sample
WD The width of data type MType in bits, i.e. 8 * sizeof(MType)
StreamN The stream number of input sample stream vector
SampleDepth stream depth number of one input sample
cols column number of input data sample
sample_strm Input data streams of MType
e_sample_strm End flag stream for input data
weight_strm weight streams
eTag End flag stream for weight streams
predictionsStrm Output data streams
predictionsTag End flag stream for output


#include "xf_DataAnalytics/clustering/kmeansPredict.hpp"
template <
    typename DT,
    int Dim,
    int Kcluster,
    int uramDepth,
    int KU,
    int DV
    >
void kMeansPredict (
    hls::stream <ap_uint <sizeof (DT)*8>> sampleStrm [DV],
    hls::stream <bool>& endSampleStrm,
    ap_uint <sizeof (DT)*8*DV> centers [KU][uramDepth],
    const int dims,
    const int kcluster,
    hls::stream <ap_uint <32>>& tagStrm,
    hls::stream <bool>& endTagStrm
    )

kMeansPredict predicts the cluster index for each sample. To achieve acceleration, make sure the first dimension (dim=1) of centers is partitioned.


DT data type, supporting float and double
Dim the maximum number of dimensions; the dynamic number of dimensions should not be greater than the maximum.
Kcluster the maximum number of clusters; the dynamic number of clusters should not be greater than the maximum.
uramDepth the depth of the URAM where centers are stored. uramDepth should not be less than ceiling(Kcluster/KU) * ceiling(Dim/DV).
KU unroll factor of Kcluster. KU centers take part in calculating distances concurrently with one sample. After at most Kcluster/KU+1 iterations, the minimum distance between a sample and the Kcluster centers is output.
DV unroll factor of Dim. DV elements in a center take part in calculating distances concurrently with one sample.
sampleStrm input sample streams; a sample needs ceiling(dims/DV) reads.
endSampleStrm the end flag of sample stream.
centers an array storing the centers; the user should partition dim=1 in its definition.
dims the number of dimensions.
kcluster the number of clusters.
tagStrm tag stream, label a cluster ID for each sample.
endTagStrm end flag of tag stream.
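
The uramDepth bound and the expected tag for a sample can both be computed with a few lines of host code. The sketch below is a software reference only: minUramDepth and nearestCenter are hypothetical names, and the use of squared Euclidean distance is an assumption, since the text above only says that distances are calculated.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

constexpr int ceilDiv(int a, int b) { return (a + b - 1) / b; }

// Lower bound for uramDepth: ceiling(Kcluster/KU) * ceiling(Dim/DV).
constexpr int minUramDepth(int Kcluster, int KU, int Dim, int DV) {
    return ceilDiv(Kcluster, KU) * ceilDiv(Dim, DV);
}

// Reference prediction: index of the center closest to the sample.
// Assumption: squared Euclidean distance is the metric.
inline int nearestCenter(const std::vector<std::vector<double>>& centers,
                         const std::vector<double>& sample) {
    int best = -1;
    double bestDist = std::numeric_limits<double>::max();
    for (std::size_t k = 0; k < centers.size(); ++k) {
        double dist = 0.0;
        for (std::size_t d = 0; d < sample.size(); ++d) {
            const double diff = sample[d] - centers[k][d];
            dist += diff * diff;
        }
        if (dist < bestDist) {
            bestDist = dist;
            best = static_cast<int>(k);
        }
    }
    return best;
}
```

For example, Kcluster = 10, KU = 4, Dim = 7 and DV = 2 need a uramDepth of at least 3 * 4 = 12.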


#include "xf_DataAnalytics/regression/decision_tree_predict.hpp"
template <
    typename MType,
    unsigned int WD,
    unsigned int MAX_FEA_NUM,
    unsigned int MAX_TREE_DEPTH = 10
    >
void decisionTreePredict (
    hls::stream <ap_uint <WD>> dstrm_batch [MAX_FEA_NUM],
    hls::stream <bool>& estrm_batch,
    hls::stream <ap_uint <512>>& treeStrm,
    hls::stream <bool>& treeTag,
    hls::stream <MType>& predictionsStrm,
    hls::stream <bool>& predictionsTag
    )

decisionTreePredict is the top function of Decision Tree Predict.

This function first loads the decision tree from treeStrm (the corresponding function is getTree). It then reads samples one by one from dstrm_batch and writes the regression value of each sample to predictionsStrm.

Note that treeStrm is a 512-bit stream and each 512-bit word holds two nodes. In each word, range(0,71) is node[i].nodeInfo and range(256,327) is node[i+1].nodeInfo; range(72,135) is node[i].regValue and range(328,391) is node[i+1].regValue; range(192,255) is node[i].threshold and range(448,511) is node[i+1].threshold. For details of the NodeR struct, refer to “decision_tree_L1.hpp”. Samples in the input sample stream should be converted from MType into ap_uint<WD>.


MType The data type of sample
WD The width of data type MType in bits, i.e. 8 * sizeof(MType)
MAX_FEA_NUM The max feature num function can support
MAX_TREE_DEPTH The max tree depth function can support
dstrm_batch Input data streams of ap_uint<WD>
estrm_batch End flag stream for input data
treeStrm Decision tree streams
treeTag End flag stream for decision tree nodes
predictionsStrm Output regression value streams
predictionsTag End flag stream for output

template class xf::data_analytics::classification::logisticRegressionPredict

#include "logisticRegression.hpp"


logistic regression predict


MType datatype of regression, support double and float
D Number of features processed each cycle
DDepth DDepth * D is the maximum number of features supported.
K Number of weight vectors processed each cycle
KDepth KDepth * K is the maximum number of weight vectors supported.
RAMWeight Use which kind of RAM to store weight, could be LUTRAM, BRAM or URAM.
RAMIntercept Use which kind of RAM to store intercept, could be LUTRAM, BRAM or URAM.
template <
    typename MType,
    int D,
    int DDepth,
    int K,
    int KDepth,
    RAMType RAMWeight,
    RAMType RAMIntercept
    >
class logisticRegressionPredict

// fields

static const int marginDepth
sl2 <MType, D, DDepth, K, KDepth,&funcMul <MType>,&funcSum <MType>,&funcAssign <MType>, AdditionLatency <MType>::value, RAMWeight, RAMIntercept> marginProcessor
pickMaxProcess <MType, K> pickProcessor



void pickFromK (
    MType margin [K],
    ap_uint <32> counter,
    ap_uint <32> ws,
    MType& maxMargin,
    ap_uint <32>& maxIndex
    )

pick best weight vector for classification from K vectors


margin K margins generated by K weight vectors.
counter start index of these K margins within all margins.
ws number of margins
maxMargin max of K margins.
maxIndex index at which the max margin sits.
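
A software reference for the selection step helps when validating results. The sketch below is a plain C++ model under stated assumptions: it treats maxMargin and maxIndex as pure outputs for one group of K margins, and it ignores entries whose global index counter + k is not below ws, treating them as padding. pickFromKRef is a hypothetical name and this is not the library's implementation.

```cpp
#include <cstdint>

// Pick the largest of up to K margins that belong to global indices
// counter .. counter + K - 1. Entries with index >= ws are padding and skipped.
// Assumes at least the first entry is valid (counter < ws).
template <typename MType, int K>
void pickFromKRef(const MType (&margin)[K], std::uint32_t counter, std::uint32_t ws,
                  MType& maxMargin, std::uint32_t& maxIndex) {
    maxMargin = margin[0];
    maxIndex = counter;
    for (int k = 1; k < K; ++k) {
        const std::uint32_t idx = counter + k;
        if (idx < ws && margin[k] > maxMargin) {
            maxMargin = margin[k];
            maxIndex = idx;
        }
    }
}
```

A caller that processes more than K margins would keep a running maximum across successive groups and advance counter by K each time.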


void pick (
    hls::stream <MType> marginStrm [K],
    hls::stream <bool>& eMarginStrm,
    hls::stream <ap_uint <32>>& retStrm,
    hls::stream <bool>& eRetStrm,
    ap_uint <32> ws
    )

pick best weight vector for classification


marginStrm margin stream. To get a vector of L margins, marginStrm will be read (L + K - 1) / K times. Margins 0 to K-1 will be read from marginStrm[0] to marginStrm[K-1] on the first read, then margins K to 2*K - 1, and so on. The last round will read in fake data if L is not divisible by K. These data won’t be used; they only align the K streams.
eMarginStrm Endflag of marginStrm.
retStrm result stream of classification.
eRetStrm Endflag of retStrm.
ws number of weight vectors used.


void predict (
    hls::stream <MType> opStrm [D],
    hls::stream <bool>& eOpStrm,
    ap_uint <32> cols,
    ap_uint <32> classNum,
    hls::stream <ap_uint <32>>& retStrm,
    hls::stream <bool>& eRetStrm
    )

classification function of logistic regression


opStrm feature input streams. To get a vector of L features, opStrm will be read (L + D - 1) / D times. Features 0 to D-1 will be read from opStrm[0] to opStrm[D-1] on the first read, then features D to 2*D - 1, and so on. The last round will read in fake data if L is not divisible by D. These data won’t be used; they only align the D streams.
eOpStrm End flag of opStrm.
cols Feature numbers
classNum Number of classes.
retStrm result stream of classification.
eRetStrm Endflag of retStrm.
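
The read pattern described for opStrm can be made concrete with a small host-side layout helper. This is a sketch in plain C++ with std::vector standing in for the D streams; readsPerVector and toLanes are hypothetical names, and the padding value 0.0 is arbitrary because the padded entries are never used.

```cpp
#include <cstddef>
#include <vector>

// Number of reads needed for a vector of L features spread over D streams.
inline std::size_t readsPerVector(std::size_t L, std::size_t D) {
    return (L + D - 1) / D;
}

// lanes[d][r] is the value opStrm[d] would deliver on read r.
// Feature i goes to lane i % D at read i / D; unused slots keep the padding value.
inline std::vector<std::vector<double>> toLanes(const std::vector<double>& features,
                                                std::size_t D) {
    const std::size_t reads = readsPerVector(features.size(), D);
    std::vector<std::vector<double>> lanes(D, std::vector<double>(reads, 0.0));
    for (std::size_t i = 0; i < features.size(); ++i) {
        lanes[i % D][i / D] = features[i];
    }
    return lanes;
}
```

With L = 10 and D = 4, three reads are needed and the last read carries two real features plus two padding values.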


void setWeight (
    MType inputW [K][D][KDepth *DDepth],
    ap_uint <32> cols,
    ap_uint <32> classNum
    )

set up weight parameters for prediction


inputW weight
cols Effective weight numbers
classNum number of classes.


void setIntercept (
    MType inputI [K][KDepth],
    ap_uint <32> classNum
    )

set up intercept parameters for prediction


inputI intercept; should be set to zero if not needed.
classNum number of classes.

template class xf::data_analytics::regression::linearLeastSquareRegressionPredict

#include "linearRegression.hpp"


linear least square regression predict


MType datatype of regression, support double and float
D Number of features processed each cycle
DDepth DDepth * D is the maximum number of features supported.
RAMWeight Use which kind of RAM to store weight, could be LUTRAM, BRAM or URAM.
RAMIntercept Use which kind of RAM to store intercept, could be LUTRAM, BRAM or URAM.
template <
    typename MType,
    int D,
    int DDepth,
    RAMType RAMWeight,
    RAMType RAMIntercept
    >
class linearLeastSquareRegressionPredict

// fields

sl2 <MType, D, DDepth, 1, 1,&funcMul <MType>,&funcSum <MType>,&funcAssign <MType>, AdditionLatency <MType>::value, RAMWeight, RAMIntercept> dotMulProcessor



void setWeight (
    MType inputW [D][DDepth],
    ap_uint <32> cols
    )

set up weight parameters for prediction


inputW weight
cols Effective weight numbers


void setIntercept (MType inputI)

set up intercept parameters for prediction


inputI intercept; should be set to zero if not needed.


void predict (
    hls::stream <MType> opStrm [D],
    hls::stream <bool>& eOpStrm,
    hls::stream <MType> retStrm [1],
    hls::stream <bool>& eRetStrm,
    ap_uint <32> cols
    )

predict based on input features and preset weight and intercept


opStrm feature input streams. To get a vector of L features, opStrm will be read (L + D - 1) / D times. Features 0 to D-1 will be read from opStrm[0] to opStrm[D-1] on the first read, then features D to 2*D - 1, and so on. The last round will read in fake data if L is not divisible by D. These data won’t be used; they only align the D streams.
eOpStrm End flag of opStrm.
retStrm Prediction result.
eRetStrm End flag of retStrm.
cols Effective feature numbers.
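
What predict computes from the preset weight and intercept can be expressed as a short software reference: a dot product plus the intercept. The sketch below is an illustration only. linearPredictRef is a hypothetical name, and the mapping of feature i to inputW[i % D][i / D] is an assumption derived from the [D][DDepth] shape, not something this section states.

```cpp
#include <cstddef>

// y = intercept + sum over i < cols of w_i * x_i, with weights held in a
// [D][DDepth] array. Assumed layout: feature i uses inputW[i % D][i / D].
template <typename MType, int D, int DDepth>
MType linearPredictRef(const MType (&inputW)[D][DDepth], MType intercept,
                       const MType* features, std::size_t cols) {
    MType acc = intercept;
    for (std::size_t i = 0; i < cols; ++i) {
        acc += inputW[i % D][i / D] * features[i];
    }
    return acc;
}
```

Only the first cols features contribute, which mirrors the role of the cols argument as the effective feature count.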

template class xf::data_analytics::regression::LASSORegressionPredict

#include "linearRegression.hpp"


LASSO regression predict.


MType datatype of regression, support double and float
D Number of features processed each cycle
DDepth DDepth * D is the maximum number of features supported.
RAMWeight Use which kind of RAM to store weight, could be LUTRAM, BRAM or URAM.
RAMIntercept Use which kind of RAM to store intercept, could be LUTRAM, BRAM or URAM.
template <
    typename MType,
    int D,
    int DDepth,
    RAMType RAMWeight,
    RAMType RAMIntercept
    >
class LASSORegressionPredict

// fields

sl2 <MType, D, DDepth, 1, 1,&funcMul <MType>,&funcSum <MType>,&funcAssign <MType>, AdditionLatency <MType>::value, RAMWeight, RAMIntercept> dotMulProcessor



void setWeight (
    MType inputW [D][DDepth],
    ap_uint <32> cols
    )

set up weight parameters for prediction


inputW weight
cols Effective weight numbers


void setIntercept (MType inputI)

set up intercept parameters for prediction


inputI intercept; should be set to zero if not needed.


void predict (
    hls::stream <MType> opStrm [D],
    hls::stream <bool>& eOpStrm,
    hls::stream <MType> retStrm [1],
    hls::stream <bool>& eRetStrm,
    ap_uint <32> cols
    )

predict based on input features and preset weight and intercept


opStrm feature input streams. To get a vector of L features, opStrm will be read (L + D - 1) / D times. Features 0 to D-1 will be read from opStrm[0] to opStrm[D-1] on the first read, then features D to 2*D - 1, and so on. The last round will read in fake data if L is not divisible by D. These data won’t be used; they only align the D streams.
eOpStrm End flag of opStrm.
retStrm Prediction result.
eRetStrm End flag of retStrm.
cols Effective feature numbers.

template class xf::data_analytics::regression::ridgeRegressionPredict

#include "linearRegression.hpp"


ridge regression predict


MType datatype of regression, support double and float
D Number of features processed each cycle
DDepth DDepth * D is the maximum number of features supported.
RAMWeight Use which kind of RAM to store weight, could be LUTRAM, BRAM or URAM.
RAMIntercept Use which kind of RAM to store intercept, could be LUTRAM, BRAM or URAM.
template <
    typename MType,
    int D,
    int DDepth,
    RAMType RAMWeight,
    RAMType RAMIntercept
    >
class ridgeRegressionPredict

// fields

sl2 <MType, D, DDepth, 1, 1,&funcMul <MType>,&funcSum <MType>,&funcAssign <MType>, AdditionLatency <MType>::value, RAMWeight, RAMIntercept> dotMulProcessor



void setWeight (
    MType inputW [D][DDepth],
    ap_uint <32> cols
    )

set up weight parameters for prediction


inputW weight
cols Effective weight numbers


void setIntercept (MType inputI)

set up intercept parameters for prediction


inputI intercept; should be set to zero if not needed.


void predict (
    hls::stream <MType> opStrm [D],
    hls::stream <bool>& eOpStrm,
    hls::stream <MType> retStrm [1],
    hls::stream <bool>& eRetStrm,
    ap_uint <32> cols
    )

predict based on input features and preset weight and intercept


opStrm feature input streams. To get a vector of L features, opStrm will be read (L + D - 1) / D times. Features 0 to D-1 will be read from opStrm[0] to opStrm[D-1] on the first read, then features D to 2*D - 1, and so on. The last round will read in fake data if L is not divisible by D. These data won’t be used; they only align the D streams.
eOpStrm End flag of opStrm.
retStrm Prediction result.
eRetStrm End flag of retStrm.
cols Effective feature numbers.

template class xf::data_analytics::common::SGDFramework

#include "SGD.hpp"


Stochastic Gradient Descent Framework.


Gradient gradient class that fits into this framework.
template <typename Gradient>
class SGDFramework

    // direct descendants

    template <
        typename MType,
        int WAxi,
        int WData,
        int BurstLen,
        int D,
        int DDepth,
        RAMType RAMWeight,
        RAMType RAMIntercept,
        RAMType RAMAvgWeight,
        RAMType RAMAvgIntercept
        >
    class xf::data_analytics::regression::internal::LASSORegressionSGDTrainer

    template <
        typename MType,
        int WAxi,
        int WData,
        int BurstLen,
        int D,
        int DDepth,
        RAMType RAMWeight,
        RAMType RAMIntercept,
        RAMType RAMAvgWeight,
        RAMType RAMAvgIntercept
        >
    class xf::data_analytics::regression::internal::linearLeastSquareRegressionSGDTrainer

    template <
        typename MType,
        int WAxi,
        int WData,
        int BurstLen,
        int D,
        int DDepth,
        RAMType RAMWeight,
        RAMType RAMIntercept,
        RAMType RAMAvgWeight,
        RAMType RAMAvgIntercept
        >
    class xf::data_analytics::regression::internal::ridgeRegressionSGDTrainer

// typedefs

typedef Gradient::DataType MType

// fields

static const int WAxi
static const int D
static const int Depth
ap_uint <32> offset
ap_uint <32> rows
ap_uint <32> cols
ap_uint <32> bucketSize
float fraction
bool ifJump
MType stepSize
MType tolerance
bool withIntercept
ap_uint <32> maxIter
Gradient gradProcessor



void seedInitialization (ap_uint <32> seed)

Initialize RNG for sampling data.


seed Seed for RNG


void setTrainingConfigs (
    MType inputStepSize,
    MType inputTolerance,
    bool inputWithIntercept,
    ap_uint <32> inputMaxIter
    )

Set configs for SGD iteration.


inputStepSize step size of SGD iteration.
inputTolerance convergence tolerance of SGD.
inputWithIntercept if SGD includes intercept or not.
inputMaxIter max iteration number of SGD.


void setTrainingDataParams (
    ap_uint <32> inputOffset,
    ap_uint <32> inputRows,
    ap_uint <32> inputCols,
    ap_uint <32> inputBucketSize,
    float inputFraction,
    bool inputIfJump
    )

Set configs for loading training data.


inputOffset offset of data in ddr.
inputRows number of rows of training data
inputCols number of features of training data
inputBucketSize bucketSize of jump sampling
inputFraction sample fraction
inputIfJump perform jump sampling or not.


void initGradientParams (ap_uint <32> cols)

Set initial weight to zeros.


cols feature numbers


void calcGradient (ap_uint <WAxi>* ddr)

calculate gradient of current weight


ddr Training data


bool updateParams (ap_uint <32> iterationIndex)

update weight and intercept based on gradient


iterationIndex iteration index.


void train (ap_uint <WAxi>* ddr)

training function


ddr input Data
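
To show how the training settings interact, here is a minimal software sketch of a gradient descent loop for linear least squares. It is not the framework's implementation: it uses the full data set in every iteration (equivalent to a sample fraction of 1), a constant step size, and a stop criterion based on the size of the parameter update, all of which are assumptions. TrainConfig, LinearModel and trainLeastSquares are hypothetical names; only stepSize, tolerance, withIntercept and maxIter correspond to fields listed above.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct TrainConfig {
    double stepSize;     // step size of each iteration
    double tolerance;    // stop when the update norm falls below this value
    bool withIntercept;  // whether the intercept is trained
    unsigned maxIter;    // upper bound on iterations
};

struct LinearModel {
    std::vector<double> weight;
    double intercept = 0.0;
};

inline LinearModel trainLeastSquares(const std::vector<std::vector<double>>& x,
                                     const std::vector<double>& y,
                                     const TrainConfig& cfg) {
    const std::size_t rows = x.size();
    const std::size_t cols = rows ? x[0].size() : 0;
    LinearModel m;
    m.weight.assign(cols, 0.0);  // start from zero weights
    for (unsigned iter = 0; iter < cfg.maxIter && rows > 0; ++iter) {
        // Gradient of the mean squared error over all rows.
        std::vector<double> grad(cols, 0.0);
        double gradIntercept = 0.0;
        for (std::size_t r = 0; r < rows; ++r) {
            double err = m.intercept - y[r];
            for (std::size_t c = 0; c < cols; ++c) err += m.weight[c] * x[r][c];
            for (std::size_t c = 0; c < cols; ++c) grad[c] += err * x[r][c] / rows;
            gradIntercept += err / rows;
        }
        // Apply the update and measure how far the parameters moved.
        double moved = 0.0;
        for (std::size_t c = 0; c < cols; ++c) {
            const double step = cfg.stepSize * grad[c];
            m.weight[c] -= step;
            moved += step * step;
        }
        if (cfg.withIntercept) {
            const double step = cfg.stepSize * gradIntercept;
            m.intercept -= step;
            moved += step * step;
        }
        if (std::sqrt(moved) < cfg.tolerance) break;  // converged
    }
    return m;
}
```

On data generated from y = 2x + 1, a step size of 0.1 with a tolerance of 1e-9 recovers a weight close to 2 and an intercept close to 1 well before the iteration limit.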