Vitis Quantitative Finance Library

The Vitis Quantitative Finance Library is a Vitis Library that provides comprehensive FPGA acceleration for quantitative finance. It is an open-source library that can be used in a variety of financial applications, such as modeling, trading, evaluation, and risk management.

The Vitis Quantitative Finance Library provides extensive APIs at three levels of abstraction:

L1, the basic functions heavily used in higher-level implementations. It includes statistical functions such as Random Number Generation (RNG), numerical methods such as Monte Carlo simulation, and linear algebra functions such as Singular Value Decomposition (SVD) and tridiagonal and pentadiagonal matrix solvers.
L2, the APIs provided at the level of pricing engines. Various pricing engines are provided to evaluate different financial derivatives, including equity products, interest-rate products, foreign exchange (FX) products, and credit products. At this level, each pricing engine API can be seen as a kernel. Users can write their own host (CPU) code to call different pricing engines within the OpenCL framework.
L3, the software-level APIs. APIs at this level hide the details of data transfer, kernel-related resource configuration, and task scheduling in OpenCL. Software application programmers can use the L3 high-level APIs to run various pricing options without managing OpenCL task dependencies or hardware configurations.

Library Contents

Library Class	Description	Layer
MT19937	Random number generator	L1
MT2203	Random number generator	L1
MT19937IcnRng	Random number generator	L1
MT2203IcnRng	Random number generator	L1
MT19937BoxMullerNormalRng	Produces a normal distribution from a uniform one	L1
MultiVariateNormalRng	Random number generator	L1
SobolRsg	Quasi-random number generator	L1
SobolRsg1D	Quasi-random number generator	L1
BrownianBridge	Brownian bridge transformation using inverse simulation	L1
TrinomialTree	Lattice-based trinomial tree structure	L1
TreeLattice	Generalized structure compatible with different models and instruments	L1
Fdm1dMesher	Discretization for finite difference method	L1
OrnsteinUhlenbeckProcess	A simple stochastic process	L1
StochasticProcess1D	1-dimensional stochastic process derived by RNG	L1
HWModel	Hull-White model for tree engine	L1
G2Model	Two-additive-factor Gaussian model for tree engine	L1
ECIRModel	Extended Cox-Ingersoll-Ross model	L1
CIRModel	Cox-Ingersoll-Ross model for tree engine	L1
VModel	Vasicek model for tree engine	L1
HestonModel	Heston process	L1
BKModel	Black-Karasinski model for tree engine	L1
BSModel	Black-Scholes process	L1
XoShiRo128PlusPlus	XoShiRo128++ pseudo-random number generator	L1
XoShiRo128Plus	XoShiRo128+ pseudo-random number generator	L1
XoShiRo128StarStar	XoShiRo128** pseudo-random number generator	L1
BicubicSplineInterpolation	Bicubic Spline Interpolation	L1
CubicInterpolation	Cubic Interpolation	L1
BinomialDistribution	Binomial Distribution	L1
CPICapFloorEngine	Pricing Consumer price index (CPI) using cap/floor methods	L2
DiscountingBondEngine	Engine used to price discounting bond	L2
InflationCapFloorEngine	Pricing inflation using cap/floor methods	L2
FdHullWhiteEngine	Bermudan swaption pricing engine using finite-difference methods based on Hull-White model	L2
FdG2SwaptionEngine	Bermudan swaption pricing engine using finite-difference methods based on two-additive-factor Gaussian model	L2
DeviceManager	Used to enumerate available Xilinx devices	L3
Device	A class representing an individual accelerator card	L3
Trace	Used to control debug trace output	L3

Library Function	Description	Layer
svd	Singular Value Decomposition using the Jacobi method	L1
mcSimulation	Monte-Carlo Framework implementation	L1
pentadiagCr	Solver for pentadiagonal systems of equations using PCR	L1
boxMullerTransform	Box-Muller transform from uniform random number to normal random number	L1
inverseCumulativeNormalPPND7	Inverse Cumulative transform from random number to normal random number	L1
inverseCumulativeNormalAcklam	Inverse CumulativeNormal using Acklam’s approximation to transform uniform random number to normal random number	L1
trsvCore	Solver for tridiagonal systems of equations using PCR	L1
PCA	Principal Component Analysis library implementation	L1
bernoulliPMF	Probability mass function for Bernoulli distribution	L1
bernoulliCDF	Cumulative distribution function for Bernoulli distribution	L1
covCoreMatrix	Calculate the covariance of the input matrix	L1
covCoreStrm	Calculate the covariance of the input matrix	L1
covReHardThreshold	Hard-thresholding Covariance Regularization	L1
covReSoftThreshold	Soft-thresholding Covariance Regularization	L1
covReBand	Banding Covariance Regularization	L1
covReTaper	Tapering Covariance Regularization	L1
gammaCDF	Cumulative distribution function for gamma distribution	L1
linearImpl	1D linear interpolation	L1
normalPDF	Probability density function for normal distribution	L1
normalCDF	Cumulative distribution function for normal distribution	L1
normalICDF	Inverse cumulative distribution function for normal distribution	L1
logNormalPDF	Probability density function for log-normal distribution	L1
logNormalCDF	Cumulative distribution function for log-normal distribution	L1
logNormalICDF	Inverse cumulative distribution function for log-normal distribution	L1
poissonPMF	Probability mass function for Poisson distribution	L1
poissonCDF	Cumulative distribution function for Poisson distribution	L1
poissonICDF	Inverse cumulative distribution function for Poisson distribution	L1
binomialTreeEngine	Binomial tree engine using CRR	L2
cfBSMEngine	Single option price plus associated Greeks	L2
FdDouglas	Top level callable function to perform the Douglas ADI method	L2
hcfEngine	Engine for Heston Closed Form Solution	L2
M76Engine	Engine for the Merton Jump Diffusion Model	L2
MCEuropeanEngine	Monte-Carlo simulation of European-style options	L2
MCEuropeanPriBypassEngine	Path pricer bypass variant	L2
MCEuropeanHestonEngine	Monte-Carlo simulation of European-style options using Heston model	L2
MCmultiAssetEuropeanHestonEngine	Monte-Carlo simulation of European-style options for multiple underlying assets	L2
MCAmericanEnginePreSamples	PreSample kernel: this kernel samples a number of paths and stores them to external memory	L2
MCAmericanEngineCalibrate	Calibrate kernel: this kernel reads the sample price data from external memory and uses it to calculate the coefficients	L2
MCAmericanEnginePricing	Pricing kernel	L2
MCAmericanEngine	Calibration process and pricing process all in one kernel	L2
MCAsianGeometricAPEngine	Asian Average Price Engine using Monte Carlo Method Based on Black-Scholes Model: geometric average version	L2
MCAsianArithmeticAPEngine	Asian Average Price Engine using Monte Carlo Method Based on Black-Scholes Model: arithmetic average version	L2
MCAsianArithmeticASEngine	Asian Arithmetic Average Strike Engine using Monte Carlo Method Based on Black-Scholes Model : arithmetic average version	L2
MCBarrierNoBiasEngine	Barrier Option Pricing Engine using Monte Carlo Simulation	L2
MCBarrierEngine	Barrier Option Pricing Engine using Monte Carlo Simulation	L2
MCCliquetEngine	Cliquet Option Pricing Engine using Monte Carlo Simulation	L2
MCDigitalEngine	Digital Option Pricing Engine using Monte Carlo Simulation	L2
MCEuropeanHestonGreeksEngine	European Option Greeks Calculating Engine using Monte Carlo Method based on Heston valuation model	L2
MCHullWhiteCapFloorEngine	Cap/Floor Pricing Engine using Monte Carlo Simulation	L2
McmcCore	Uses multiple Markov Chains to allow drawing samples from multi mode target distribution functions	L2
treeSwaptionEngine	Tree swaption pricing engine using trinomial tree based on 1D lattice method	L2
treeSwapEngine	Tree swap pricing engine using trinomial tree based on 1D lattice method	L2
treeCapFloorEngine	Tree cap/floor engine using trinomial tree based on 1D lattice method	L2
treeCallableEngine	Tree callable fixed rate bond pricing engine using trinomial tree based on 1D lattice method	L2
hjmEngine	Full implementation of Heath-Jarrow-Morton framework Pricing Engine with Monte Carlo	L2
hjmMcEngine	Monte Carlo only implementation of Heath-Jarrow-Morton framework Pricing Engine	L2
hjmPcaEngine	PCA only implementation of Heath-Jarrow-Morton framework	L2
lmmEngine	LIBOR Market Model (BGM) framework implementation.	L2

Shell Environment

Set up the build environment using the Vitis and XRT scripts, and set PLATFORM_REPO_PATHS to the installation folder of the platform files.

source <install path>/Vitis/2019.2/settings64.sh
source /opt/xilinx/xrt/setup.sh
export PLATFORM_REPO_PATHS=/opt/xilinx/platforms
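Before starting a build, it can help to confirm that the three setup lines above took effect. The following is a minimal sketch, not part of the library: the function name check_env is hypothetical, and it assumes that the Vitis settings script puts v++ on the PATH and that the XRT setup script puts xbutil on the PATH.

```shell
# Hypothetical helper: report what is missing from the build environment.
check_env() {
    missing=0
    # Assumption: v++ is provided by the Vitis settings script,
    # xbutil by the XRT setup script.
    for tool in v++ xbutil; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool"
            missing=1
        fi
    done
    # The platform repository must be set and must point at a directory.
    if [ -z "${PLATFORM_REPO_PATHS:-}" ]; then
        echo "PLATFORM_REPO_PATHS is not set"
        missing=1
    elif [ ! -d "$PLATFORM_REPO_PATHS" ]; then
        echo "PLATFORM_REPO_PATHS is not a directory: $PLATFORM_REPO_PATHS"
        missing=1
    fi
    return "$missing"
}

check_env || echo "environment incomplete"
```

The function only reports problems and returns a non-zero status; it does not modify the environment.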

Design Flows

Recommended design flows are categorized by the target level:

L1
L2
L3

The common tool and library prerequisites that apply across all design flows are documented in the requirements section above.

L1

L1 provides the low-level primitives used to build kernels.

The recommended flow to evaluate and test L1 components uses the Vivado HLS tool. A top-level C/C++ testbench (typically main.cpp or tb.cpp) prepares the input data, passes it to the design under test (typically dut.cpp, which makes the L1 library calls), and then performs any output data post-processing and validation checks.

A Makefile is used to drive this flow with available steps including CSIM (high level simulation), CSYNTH (high level synthesis to RTL), COSIM (cosimulation between software testbench and generated RTL), VIVADO_SYN (synthesis by Vivado), and VIVADO_IMPL (implementation by Vivado). The flow is launched from the shell by calling make with variables set as in the example below:

# entering specific unit test project
cd L1/tests/specific_algorithm/
# Only run C++ simulation on U250
make run CSIM=1 CSYNTH=0 COSIM=0 VIVADO_SYN=0 VIVADO_IMPL=0 DEVICE=u250_xdma_201830_1
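Because each step is switched on by its own variable, the command line can be assembled from a stage name. The sketch below is illustrative only: the function name l1_make_cmd is hypothetical, and enabling the earlier steps cumulatively (so that cosimulation also runs C simulation and synthesis) is an assumption about how the steps are usually combined, not something the Makefile requires. The function prints the command rather than running it.

```shell
# Hypothetical helper: print the make command that runs the L1 flow up to a stage.
l1_make_cmd() {
    stage=$1
    device=${2:-u250_xdma_201830_1}
    csim=0 csynth=0 cosim=0
    # Assumption: earlier steps are enabled cumulatively for later stages.
    case "$stage" in
        csim)   csim=1 ;;
        csynth) csim=1; csynth=1 ;;
        cosim)  csim=1; csynth=1; cosim=1 ;;
        *) echo "unknown stage: $stage" >&2; return 1 ;;
    esac
    echo "make run CSIM=$csim CSYNTH=$csynth COSIM=$cosim VIVADO_SYN=0 VIVADO_IMPL=0 DEVICE=$device"
}

# Print the command for C simulation plus high-level synthesis.
l1_make_cmd csynth
```

The printed line can be reviewed and then pasted into the shell from within the unit test directory.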

In addition to verifying functional correctness, the reports generated by this flow indicate logic utilization, timing performance, latency, and throughput. The output files of interest are found at locations such as those below, where the file names correspond to the source code, i.e., the callable functions within the design under test:

Simulation Log: <library_root>/L1/tests/bk_model/prj/solution1/csim/report/dut_csim.log
Synthesis Report: <library_root>/L1/tests/bk_model/prj/solution1/syn/report/dut_csynth.rpt

L2

L2 provides the pricing engine APIs presented as kernels.

The L2 flow is based on the Vitis tool. It generates and packages pricing engine kernels along with the host application required for configuration and control. In addition to FPGA platform targets, emulation options are available for preliminary investigation or when dedicated access to a hardware platform is not available. Two emulation options are provided: software emulation performs a high-level simulation of the pricing engine, while hardware emulation performs a cycle-accurate simulation of the generated RTL for the kernel. This flow is makefile driven from the console, with the target selected as a command-line parameter, as in the examples below:

cd L2/tests/GarmanKohlhagenEngine

# build and run one of the following using U250 platform

#  * software emulation
make run TARGET=sw_emu DEVICE=u250_xdma_201830_1
#  * hardware emulation
make run TARGET=hw_emu DEVICE=u250_xdma_201830_1
#  * actual deployment on physical platform
make run TARGET=hw DEVICE=u250_xdma_201830_1

# delete all xclbin and host binary
make cleanall
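Since make accepts any VAR=value assignment on the command line, a misspelled variable name or target value is not reported as an error. A small wrapper can catch this before a long build starts. The sketch below is an illustration, not part of the library: the function name l2_make_cmd is hypothetical, and it prints the command instead of running it.

```shell
# Hypothetical wrapper: reject unknown targets before invoking make.
l2_make_cmd() {
    target=$1
    device=${2:-u250_xdma_201830_1}
    case "$target" in
        sw_emu|hw_emu|hw) ;;   # the three targets used in this flow
        *) echo "unknown TARGET: $target" >&2; return 1 ;;
    esac
    echo "make run TARGET=$target DEVICE=$device"
}

# Print the commands for both emulation modes.
for t in sw_emu hw_emu; do
    l2_make_cmd "$t"
done
```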

The outputs of this flow are packaged kernel binaries (xclbin files) that can be downloaded to the FPGA platform, and host executables that configure the kernels and coordinate data transfers. The output files of interest are found at locations such as those below, where the file names correspond to the source code:

Host Executable: L2/tests/GarmanKohlhagenEngine/bin_#DEVICE/gk_test.exe
Kernel Packaged Binary: L2/tests/GarmanKohlhagenEngine/xclbin_#DEVICE_#TARGET/gk_kernel.xclbin #ARGS
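The #DEVICE and #TARGET placeholders in the paths above can be expanded with a few lines of shell. This sketch assumes that the placeholders are replaced literally by the DEVICE and TARGET values passed to make; the function name l2_outputs is hypothetical.

```shell
# Hypothetical helper: print the expected L2 output paths for a device and target.
# Assumption: #DEVICE and #TARGET are replaced literally by the make variables.
l2_outputs() {
    device=$1
    target=$2
    base=L2/tests/GarmanKohlhagenEngine
    echo "$base/bin_${device}/gk_test.exe"
    echo "$base/xclbin_${device}_${target}/gk_kernel.xclbin"
}

l2_outputs u250_xdma_201830_1 hw_emu
```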

This flow can be used to verify functional correctness in hardware and enable real world performance to be measured.

L3

L3 provides the high-level software APIs to deploy and run pricing engine kernels while abstracting the low-level details of data transfer, kernel-related resource configuration, and task scheduling.

The flow for L3 is the only one where access to an FPGA platform is required.

A prerequisite of this flow is that the packaged pricing engine kernel binaries (xclbin files) for the target FPGA platform have been made available for download or have been custom built using the L2 flow described above.

This flow is makefile driven from the console to initially generate a shared object (L3/src/output/libxilinxfintech.so).

cd L3/src
source env.sh
make

The shared object file is written to the following location:

Library: L3/src/output/libxilinxfintech.so

User applications can subsequently be built against this library, as in the provided example:

cd L3/examples/MonteCarlo
make all
cd output

# manual step to copy or create symlinks to xclbin files in current directory

./mc_example
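The manual step of making the xclbin files available in the current directory can be scripted. The following is a minimal sketch, assuming the kernel binaries sit together in one directory: the function name link_xclbins and the file name mc_kernel.xclbin are hypothetical, and the demonstration uses a temporary directory with an empty placeholder file rather than a real build output.

```shell
# Hypothetical helper: symlink every .xclbin in a source directory into a destination.
link_xclbins() {
    # Resolve to an absolute path so the links stay valid from any directory.
    src=$(cd "$1" && pwd) || return 1
    dest=${2:-.}
    found=0
    for f in "$src"/*.xclbin; do
        [ -e "$f" ] || continue   # the glob matched nothing
        ln -sf "$f" "$dest/$(basename "$f")"
        found=$((found + 1))
    done
    echo "linked $found xclbin file(s)"
}

# Demonstration with placeholder files in a temporary directory.
demo=$(mktemp -d)
mkdir "$demo/build" "$demo/output"
touch "$demo/build/mc_kernel.xclbin"   # hypothetical file name
link_xclbins "$demo/build" "$demo/output"
```

In a real run, the first argument would be the directory holding the xclbin files built by the L2 flow, and the second would be the example's output directory.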
