GQE L3 APIs

These APIs are implemented as executable classes to provide a cleanly structured, easy-to-use software interface.

class xf::database::gqe::Table

#include "gqe_table.hpp"

Overview

handles column-related work (name, section, pointer, etc.)


Methods

Table

Table (std::string _name)

constructor of Table.

Parameters:

_name name of the table.

addCol

addCol overload (1)
void addCol (
    std::string _name,
    TypeEnum type_size,
    void* _ptr,
    int row_num
    )

add one column into the Table with user-provided buffer pointer.

usage: tab.addCol("o_orderkey", TypeEnum::TypeInt32, tab_o_col0, 10000);

Parameters:

_name column name
type_size size of the column element data type in bytes
_ptr user-provided column buffer pointer
row_num number of rows
addCol overload (2)
void addCol (
    std::string _name,
    TypeEnum type_size,
    int row_num
    )

allocate buffer for one column and add it into the Table.

usage: tab.addCol("o_orderkey", TypeEnum::TypeInt32, 10000);

Parameters:

_name column name
type_size size of the column element data type in bytes
row_num number of rows
addCol overload (3)
void addCol (
    std::string _name,
    TypeEnum type_size,
    std::vector <std::string> dat_list
    )

create one column with several sections by loading rows from data files.

usage: tab.addCol("o_orderkey", TypeEnum::TypeInt32, {"file1.dat", "file2.dat"});

Parameters:

_name column name
type_size size of the column element data type in bytes
dat_list data file list
addCol overload (4)
void addCol (
    std::string _name,
    TypeEnum type_size,
    std::vector <struct ColPtr> ptr_pairs
    )

create one column with several sections from a user-provided pointer list.

usage: tab.addCol("o_orderkey", TypeEnum::TypeInt32, {{ptr1, 10000}, {ptr2, 20000}});

Parameters:

_name column name
type_size size of the column element data type in bytes
ptr_pairs vector of (ptr,row_num) pairs
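To illustrate how a column built from several (ptr, row_num) pairs relates to the table-wide row count, here is a minimal standalone sketch. It does not use the library: SectionRef, totalRows, RowLocation and locateRow are hypothetical names introduced only for this example, and the real ColPtr layout may differ.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a (ptr, row_num) pair; not the library's ColPtr.
struct SectionRef {
    void* ptr;    // user-provided buffer of one section
    int row_num;  // number of rows stored in that buffer
};

// Total number of rows of a column assembled from several sections.
inline std::size_t totalRows(const std::vector<SectionRef>& sections) {
    std::size_t total = 0;
    for (const SectionRef& s : sections) {
        total += static_cast<std::size_t>(s.row_num);
    }
    return total;
}

// Position of a table-wide row index: which section, and which row inside it.
struct RowLocation {
    std::size_t section;
    std::size_t offset;
};

inline RowLocation locateRow(const std::vector<SectionRef>& sections, std::size_t row) {
    for (std::size_t sid = 0; sid < sections.size(); ++sid) {
        const std::size_t n = static_cast<std::size_t>(sections[sid].row_num);
        if (row < n) {
            return RowLocation{sid, row};
        }
        row -= n;  // skip this section and keep searching
    }
    throw std::out_of_range("row index exceeds the total number of rows");
}
```

With sections of 10000 and 20000 rows, as in the usage line above, the column holds 30000 rows and row 10000 is the first row of the second section.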

genRowIDWithValidation

genRowIDWithValidation overload (1)
void genRowIDWithValidation (
    std::string _rowid_name,
    std::string _valid_name,
    bool _rowid_en,
    bool _valid_en,
    std::vector <char*> validationPtrVector
    )

add validation column with user-provided validation pointer list

Caution: this is an experimental-only API and will be deprecated in the next release.

Parameters:

_rowid_name name of row-id column
_valid_name name of validation bits column
_rowid_en enable flag of row-id
_valid_en enable flag of validation bits
validationPtrVector validation bits pointer list
genRowIDWithValidation overload (2)
void genRowIDWithValidation (
    std::string _rowid_name,
    std::string _valid_name,
    bool _rowid_en,
    bool _valid_en,
    void* ptr,
    int row_num
    )

add validation column with user-provided pointer

Caution: this is an experimental-only API and will be deprecated in the next release.

Parameters:

_rowid_name name of row-id column
_valid_name name of validation bits column
_rowid_en enable flag of row-id
_valid_en enable flag of validation bits
ptr validation bits column pointer
row_num number of rows
genRowIDWithValidation overload (3)
void genRowIDWithValidation (
    std::string _rowid_name,
    std::string _valid_name,
    bool _rowid_en,
    bool _valid_en,
    std::vector <std::string> dat_list
    )

add validation column with user-provided data file list

Caution: this is an experimental-only API and will be deprecated in the next release.

Parameters:

_rowid_name name of row-id column
_valid_name name of validation bits column
_rowid_en enable flag of row-id
_valid_en enable flag of validation bits
dat_list data file list

setRowNum

void setRowNum (int _num)

set number of rows for the entire table.

Parameters:

_num number of rows of the entire table

getRowNum

size_t getRowNum () const

get number of rows of the entire table.

Returns:

number of rows of the entire table

getSecRowNum

size_t getSecRowNum (int sid) const

get number of rows for the specified section.

Parameters:

sid section ID

Returns:

number of rows of the specified section

getColNum

size_t getColNum () const

get number of columns.

Returns:

number of columns of the table.

getSecNum

size_t getSecNum () const

get number of sections.

Returns:

number of sections of the table.

checkSecNum

void checkSecNum (int sec_l)

divide the columns evenly if section number is greater than 0.

Parameters:

sec_l number of sections; if 0, nothing is done, since everything is already done by addCol with json input.

getColTypeSize

size_t getColTypeSize (int cid) const

get column data type size.

Parameters:

cid column ID

Returns:

data type size of the input column ID.

getColPointer

char* getColPointer (
    int i,
    int _slice_num,
    int j = 0
    ) const

get buffer pointer.

For example, getColPointer(2,4,1) means that the 2nd column is divided into 4 sections and the pointer of the 2nd section is returned.

Parameters:

i column id
_slice_num divide column i into _slice_num parts
j get the j’th part pointer after dividing

Returns:

column buffer pointer
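The slicing described for getColPointer comes down to pointer arithmetic over one contiguous column buffer. The sketch below is a standalone illustration, not the library implementation: sliceStartRow and slicePointer are hypothetical helpers, and the convention that the first slices absorb the remainder rows is an assumption, since the source does not say how uneven row counts are handled.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: first row of slice j when row_num rows are split into
// slice_num slices whose sizes differ by at most one row.
// Assumption: the first (row_num % slice_num) slices take one extra row each.
inline std::size_t sliceStartRow(std::size_t row_num, int slice_num, int j) {
    if (slice_num <= 0 || j < 0 || j >= slice_num) {
        throw std::out_of_range("invalid slice number or slice index");
    }
    const std::size_t n = static_cast<std::size_t>(slice_num);
    const std::size_t k = static_cast<std::size_t>(j);
    const std::size_t base = row_num / n;   // rows every slice gets
    const std::size_t extra = row_num % n;  // leftover rows to spread
    return k * base + std::min(k, extra);
}

// Hypothetical helper: byte pointer to the start of slice j of a column buffer
// whose elements are type_size bytes wide.
inline char* slicePointer(char* col, std::size_t type_size, std::size_t row_num,
                          int slice_num, int j) {
    return col + sliceStartRow(row_num, slice_num, j) * type_size;
}
```

Under this assumption, a 10-row column of 4-byte elements split into 4 slices has slice sizes 3, 3, 2, 2, so slice index 1 starts 12 bytes into the buffer.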

getValColPointer

char* getValColPointer (
    int _slice_num,
    int j
    ) const

get the validation buffer pointer

Parameters:

_slice_num number of sections of the validation column
j the index of the section

Returns:

the pointer of the specified section

getColPointer

char* getColPointer (int i) const

get column pointer.

Parameters:

i column id

Returns:

column pointer

setColNames

void setColNames (std::vector <std::string> col_names)

set col_names

Parameters:

col_names column name list

getColNames

std::vector <std::string> getColNames ()

get col_names

Returns:

list of column names

getRowIDColName

std::string getRowIDColName ()

get the name of the row-id column

getValidColName

std::string getValidColName ()

get the name of the validation bits column

getRowIDEnableFlag

bool getRowIDEnableFlag ()

get row-id enable flag

getValidEnableFlag

bool getValidEnableFlag ()

get validation bits enable flag

~Table

~Table ()

destructor of Table.

info

void info ()

print information of the table

class xf::database::gqe::Joiner

#include "gqe_join.hpp"

Overview

class Joiner: public xf::database::gqe::Base

Inherited Members


Methods

Joiner

Joiner (FpgaInit& obj)

constructor of Joiner .

Passes the FpgaInit object to the Joiner class. Keeping FpgaInit (OpenCL context, program, command queue, host/device buffer creation/allocation, etc.) separate from Joiner initialization guarantees that the OpenCL resources are not released after each join call, so the joiner may be launched multiple times.

Parameters:

obj the FpgaInit instance.

run

ErrCode run (
    Table& tab_a,
    std::string filter_a,
    Table& tab_b,
    std::string filter_b,
    std::string join_str,
    Table& tab_c,
    std::string output_str,
    int join_type = INNER_JOIN,
    JoinStrategyBase* strategyimp = nullptr
    )

Runs the join using the strategy defined by the input arguments. The strategy includes:

  • solution: the join solution (direct-join or partition-join)
  • sec_o: number of sections of the left table
  • sec_l: number of sections of the right table
  • slice_num: the number of slices used in probe
  • log_part: the partition number of the left/right table
  • coef_exp_partO: the expansion coefficient of the table O result buffer size relative to the input buffer size; this parameter affects the output buffer size, but not the performance
  • coef_exp_partL: the expansion coefficient of the table L result buffer size relative to the input buffer size; this parameter affects the output buffer size, but not the performance
  • coef_exp_join: the expansion coefficient of the result buffer size relative to the input buffer size; this parameter affects the output buffer size, but not the performance

Usage:

  auto smanual = new gqe::JoinStrategyManualSet(solution, sec_o, sec_l, slice_num, log_part, coef_exp_partO,
                                                coef_exp_partL, coef_exp_join);

  ErrCode err = bigjoin.run(
      tab_o, "o_rowid > 0",
      tab_l, "",
      "o_orderkey = l_orderkey",
      tab_c, "c1=l_orderkey, c2=o_rowid, c3=l_rowid",
      gqe::INNER_JOIN,
      smanual);
  delete smanual;

The filter condition of table tab_o looks like "o_rowid > 0", where o_rowid is a column name of tab_o. When there is no filter condition, pass an empty filter condition "".

The join condition looks like "left_join_key_0=right_join_key_0". When dual-key join is enabled, use a comma as the separator in the join condition, e.g. "left_join_key_0=right_join_key_0,left_join_key_1=right_join_key_1".

Output strings look like "output_c0 = tab_a_col/tab_b_col". When several columns are output, use a comma as the separator.
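Both the join condition and the output string are comma-separated lists of "lhs = rhs" items. The following standalone sketch shows one way such a string can be split into pairs. It is only an illustration of the string format: trim and parseAssignments are hypothetical helpers, not the parser used by Joiner.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: strip leading and trailing spaces and tabs.
inline std::string trim(const std::string& s) {
    const char* ws = " \t";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) {
        return "";
    }
    const std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Hypothetical helper: split "a = b, c = d" into {("a","b"), ("c","d")}.
// Items without '=' are skipped, so an empty string yields an empty list.
inline std::vector<std::pair<std::string, std::string>> parseAssignments(const std::string& str) {
    std::vector<std::pair<std::string, std::string>> out;
    std::size_t pos = 0;
    while (pos <= str.size()) {
        std::size_t comma = str.find(',', pos);
        if (comma == std::string::npos) {
            comma = str.size();
        }
        const std::string item = str.substr(pos, comma - pos);
        const std::size_t eq = item.find('=');
        if (eq != std::string::npos) {
            out.emplace_back(trim(item.substr(0, eq)), trim(item.substr(eq + 1)));
        }
        pos = comma + 1;
    }
    return out;
}
```

Applied to "c1=l_orderkey, c2=o_rowid, c3=l_rowid", this yields three pairs, each mapping an output column name to a source column name.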

Parameters:

tab_a left table
filter_a filter condition of left table
tab_b right table
filter_b filter condition of right table
join_str join condition(s)
tab_c result table
output_str output columns
join_type INNER_JOIN(default) | SEMI_JOIN | ANTI_JOIN.
strategyimp pointer to an object of JoinStrategyBase or its derived type.

class xf::database::gqe::BloomFilter

#include "gqe_bloomfilter.hpp"

Overview


Methods

BloomFilter

BloomFilter (
    uint64_t num_keys,
    float fpp = 0.05f
    )

constructor of BloomFilter

Calculates the size of the bloom-filter from the number of unique keys using the equation provided in https://en.wikipedia.org/wiki/Bloom_filter, and allocates the buffer for the internal hash-table.

Parameters:

num_keys number of unique keys to be built into the hash-table of the bloom-filter
fpp false positive probability (5% by default)
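For reference, the standard sizing equations from the cited Wikipedia article are m = -n ln(p) / (ln 2)^2 bits for n keys at false positive probability p, and k = (m / n) ln 2 hash functions. The sketch below evaluates them in plain C++. The function names are hypothetical, and any extra rounding the library may apply (for example to its hash-table word width) is not modelled here.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: optimal number of bits m = -n * ln(p) / (ln 2)^2,
// rounded up to a whole bit.
inline uint64_t bloomFilterBits(uint64_t num_keys, double fpp) {
    const double ln2 = std::log(2.0);
    const double bits = -static_cast<double>(num_keys) * std::log(fpp) / (ln2 * ln2);
    return static_cast<uint64_t>(std::ceil(bits));
}

// Hypothetical helper: optimal number of hash functions k = (m / n) * ln 2,
// rounded to the nearest integer and never less than 1.
inline unsigned bloomFilterHashes(uint64_t num_bits, uint64_t num_keys) {
    if (num_keys == 0) {
        return 1;
    }
    const double k = static_cast<double>(num_bits) / static_cast<double>(num_keys) * std::log(2.0);
    const long rounded = std::lround(k);
    return rounded < 1 ? 1u : static_cast<unsigned>(rounded);
}
```

At the default fpp of 0.05, the equation asks for roughly 6.2 bits per key.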

build

void build (
    Table tab_in,
    std::string col_names
    )

build the hash-table from the given key column(s) of the input table.

key_names_str should be comma-separated, e.g. "key0, key1"

Parameters:

tab_in input table
key_names_str key column names (comma separated) of the input table to be built into hash-table

merge

void merge (BloomFilter& bf_in)

merge the input bloom-filter into the current one

Parameters:

bf_in input bloom-filter
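A common way to merge two bloom filters of identical size and hash functions is a bitwise OR of their bit arrays, after which the result reports every key inserted into either input. The sketch below demonstrates that general technique with a small software filter. TinyBloom is a hypothetical class written for this example, and whether BloomFilter::merge works the same way internally is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical software bloom filter, unrelated to the library's hash-table layout.
class TinyBloom {
public:
    explicit TinyBloom(std::size_t num_bits)
        : num_bits_(num_bits), words_((num_bits + 63) / 64, 0) {
        if (num_bits == 0) {
            throw std::invalid_argument("num_bits must be greater than 0");
        }
    }

    void insert(uint64_t key) {
        for (int h = 0; h < kHashes; ++h) {
            setBit(mix(key, static_cast<uint64_t>(h)) % num_bits_);
        }
    }

    // True if the key may have been inserted; false means it definitely was not.
    bool mightContain(uint64_t key) const {
        for (int h = 0; h < kHashes; ++h) {
            if (!testBit(mix(key, static_cast<uint64_t>(h)) % num_bits_)) {
                return false;
            }
        }
        return true;
    }

    // Merge by OR-ing the bit arrays; both filters must have the same size.
    void merge(const TinyBloom& other) {
        if (other.num_bits_ != num_bits_) {
            throw std::invalid_argument("bloom filters differ in size");
        }
        for (std::size_t i = 0; i < words_.size(); ++i) {
            words_[i] |= other.words_[i];
        }
    }

private:
    static constexpr int kHashes = 2;

    // splitmix64-style mixer, offset by a per-hash seed.
    static uint64_t mix(uint64_t x, uint64_t seed) {
        x += (seed + 1) * 0x9E3779B97F4A7C15ULL;
        x ^= x >> 30;
        x *= 0xBF58476D1CE4E5B9ULL;
        x ^= x >> 27;
        x *= 0x94D049BB133111EBULL;
        x ^= x >> 31;
        return x;
    }

    void setBit(uint64_t b) { words_[b / 64] |= (uint64_t{1} << (b % 64)); }
    bool testBit(uint64_t b) const { return ((words_[b / 64] >> (b % 64)) & 1u) != 0; }

    std::size_t num_bits_;
    std::vector<uint64_t> words_;
};
```

Merging this way lets several filters be built independently, for instance one per table section, and combined afterwards.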

getHashTable

ap_uint <256>** getHashTable () const

get the bloom-filter hash-table

Returns:

hash-table of the bloom-filter

getBloomFilterSize

uint64_t getBloomFilterSize () const

get the bloom-filter size

Returns:

size of the bloom-filter

class xf::database::gqe::Filter

#include "gqe_filter.hpp"

Overview

class Filter: public xf::database::gqe::Base

Inherited Members


Methods

Filter

Filter (FpgaInit& obj)

constructor of Filter .

Initializes hardware as well as loads binary to FPGA by class Base & FpgaInit

Parameters:

obj FpgaInit class object

~Filter

~Filter ()

destructor of Filter.

clProgram, commandQueue, and Context will be released by class Base

run

ErrCode run (
    Table& tab_in,
    std::string input_str,
    BloomFilter& bf_in,
    std::string filter_condition,
    Table& tab_out,
    std::string output_str,
    StrategySet params
    )

gqeFilter run function.

Usage:

err_code = Filter.run(
    tab_in,
    "l_orderkey",
    bf_in,
    "19940101<=l_orderdate && l_orderdate<19950101",
    tab_c1,
    "c1=l_extendedprice, c2=l_discount, c3=o_orderdate, c4=l_orderkey",
    params);

The input filter_condition looks like "19940101<=l_orderdate && l_orderdate<19950101", where l_orderdate must exist among the column names of the input table. When there is no filter condition, input "".

The input key name(s) string looks like "l_orderkey_0". When dual-key join is enabled, use a comma as the separator, e.g. "l_orderkey_0, l_orderkey_1".

The output mapping looks like "output_c0 = tab_in_col". When it contains several columns, use a comma as the separator.

Parameters:

tab_in input table
input_str key column name(s) of the input table to be bloom-filtered
bf_in input bloom-filter whose hash-table is used
filter_condition filter condition used in dynamic filter
tab_out result table
output_str output column mapping
params StrategySet struct that contains the number of sections of the input table. params.sec_l = 0: uses section info from input table; params.sec_l >= 1: separates input table into params.sec_l sections evenly

Returns:

error code

class xf::database::gqe::Aggregator

#include "xf_database/gqe_aggr.hpp"

Overview


Methods

Aggregator

Aggregator (std::string xclbin)

constructor of Aggregator.

Parameters:

xclbin xclbin path

aggregate

ErrCode aggregate (
    Table& tab_in,
    std::vector <EvaluationInfo> evals_info,
    std::string filter_str,
    std::string group_keys_str,
    std::string output_str,
    Table& tab_out,
    AggrStrategyBase* strategyImp = nullptr
    )

aggregate function.

Usage:

err_code = bigaggr.aggregate(tab_l, //input table
                             {{"l_extendedprice * (-l_discount+c2) / 100", {0, 100}},
                              {"l_extendedprice * (-l_discount+c2) * (l_tax+c3) / 10000", {0, 100, 100}}
                             }, // evaluation
                             "l_shipdate<=19980902", //filter
                             "l_returnflag,l_linestatus", // group keys
                             "c0=l_returnflag, c1=l_linestatus,c2=sum(eval0),c3=sum(eval1)", // mapping
                             tab_c, //output table
                             sptr); //strategy

The input filter_str looks like "19940101<=o_orderdate && o_orderdate<19950101", where o_orderdate must be an existing column name in the input table. When there is no filter condition, input "".

Input the evaluation information as a struct EvaluationInfo. Create a valid struct using an initializer list, e.g. {"l_extendedprice * (-l_discount+c2) / 100", {0, 100}}. EvaluationInfo has two members: the evaluation string and the evaluation constants. In the evaluation string, you can input a final division calculation; the divisor only supports 10, 100, 1000 and 10000. In the evaluation constants, input a constant for each column; if a column has no constant, like l_extendedprice above, input zero.

Input the group keys in a string, like "group_key0, group_key1", using a comma as the separator.

Output strings look like "c0=tab_in_col1, c1=tab_in_col2". When the output contains several columns, use a comma as the separator.

strategyImp is a pointer to an object of a class derived from AggrStrategyBase.
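To make the evaluation string and its constants concrete, here is a CPU-side reference sketch of what the first evaluation in the usage above expresses: filter on l_shipdate, compute l_extendedprice * (-l_discount + c2) / 100 with c2 = 100, then sum per (l_returnflag, l_linestatus) group. The LineItem layout, the fixed-point integer columns and the function names are assumptions made for this example. It does not call the Aggregator, and the exact arithmetic order used by the hardware is not modelled.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical row layout; values are fixed-point integers (e.g. discount in percent).
struct LineItem {
    char returnflag;
    char linestatus;
    int64_t extendedprice;
    int64_t discount;
    int32_t shipdate;  // encoded as yyyymmdd
};

// eval0 = l_extendedprice * (-l_discount + c2) / 100, with constants {0, 100}:
// no constant for l_extendedprice, c2 = 100 paired with l_discount.
inline int64_t eval0(const LineItem& r) {
    const int64_t c2 = 100;
    return r.extendedprice * (-r.discount + c2) / 100;
}

using GroupKey = std::pair<char, char>;

// Reference for: filter "l_shipdate<=max_shipdate", group keys
// "l_returnflag,l_linestatus", output "sum(eval0)".
inline std::map<GroupKey, int64_t> sumEval0ByGroup(const std::vector<LineItem>& rows,
                                                   int32_t max_shipdate) {
    std::map<GroupKey, int64_t> sums;
    for (const LineItem& r : rows) {
        if (r.shipdate <= max_shipdate) {
            sums[{r.returnflag, r.linestatus}] += eval0(r);
        }
    }
    return sums;
}
```

A host-side reference like this can be handy for checking the contents of the result table on small inputs.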

Parameters:

tab_in input table
evals_info evaluation information
filter_str filter condition
group_keys_str group keys
output_str output list, output1 = tab_a_col1
tab_out result table
strategyImp pointer to an object of AggrStrategyBase or its derived type.