Vitis Quantitative Finance Library

The Vitis Quantitative Finance Library is a Vitis library that provides comprehensive FPGA acceleration for quantitative finance. It is open source and can be used in a variety of financial applications, such as modeling, trading, evaluation, and risk management.

The Vitis Quantitative Finance Library provides extensive APIs at three levels of abstraction:

  • L1, the basic functions heavily used in higher-level implementations. It includes statistical functions such as Random Number Generation (RNG), numerical methods such as Monte Carlo simulation, and linear algebra functions such as Singular Value Decomposition (SVD) and tridiagonal and pentadiagonal matrix solvers.
  • L2, the APIs provided at the level of pricing engines. Various pricing engines are provided to evaluate different financial derivatives, including equity products, interest-rate products, foreign exchange (FX) products, and credit products. At this level, each pricing engine API can be seen as a kernel. Users can write their own CPU host code to call different pricing engines within the OpenCL framework.
  • L3, the software-level APIs. APIs at this level hide the details of data transfer, kernel-related resource configuration, and task scheduling in OpenCL. Software application programmers can use the L3 high-level APIs to price various options without dealing with OpenCL task dependencies or hardware configuration.

Library Contents

Library Class | Description | Layer
MT19937 | Random number generator | L1
MT2203 | Random number generator | L1
MT19937IcnRng | Random number generator | L1
MT2203IcnRng | Random number generator | L1
MT19937BoxMullerNormalRng | Produces a normal distribution from a uniform one | L1
MultiVariateNormalRng | Random number generator | L1
SobolRsg | Quasi-random number generator | L1
SobolRsg1D | Quasi-random number generator | L1
BrownianBridge | Brownian bridge transformation using inverse simulation | L1
TrinomialTree | Lattice-based trinomial tree structure | L1
TreeLattice | Generalized structure compatible with different models and instruments | L1
Fdm1dMesher | Discretization for finite difference method | L1
OrnsteinUhlenbeckProcess | A simple stochastic process | L1
StochasticProcess1D | 1-dimensional stochastic process derived by RNG | L1
HWModel | Hull-White model for tree engine | L1
G2Model | Two-additive-factor Gaussian model for tree engine | L1
ECIRModel | Extended Cox-Ingersoll-Ross model | L1
CIRModel | Cox-Ingersoll-Ross model for tree engine | L1
VModel | Vasicek model for tree engine | L1
HestonModel | Heston process | L1
BKModel | Black-Karasinski model for tree engine | L1
BSModel | Black-Scholes process | L1
XoShiRo128PlusPlus | XoShiRo128PlusPlus | L1
XoShiRo128Plus | XoShiRo128Plus | L1
XoShiRo128StarStar | XoShiRo128StarStar | L1
BicubicSplineInterpolation | Bicubic spline interpolation | L1
CubicInterpolation | Cubic interpolation | L1
BinomialDistribution | Binomial distribution | L1
CPICapFloorEngine | Pricing Consumer Price Index (CPI) using cap/floor methods | L2
DiscountingBondEngine | Engine used to price discounting bond | L2
InflationCapFloorEngine | Pricing inflation using cap/floor methods | L2
FdHullWhiteEngine | Bermudan swaption pricing engine using finite-difference methods based on Hull-White model | L2
FdG2SwaptionEngine | Bermudan swaption pricing engine using finite-difference methods based on two-additive-factor Gaussian model | L2
DeviceManager | Used to enumerate available Xilinx devices | L3
Device | A class representing an individual accelerator card | L3
Trace | Used to control debug trace output | L3

Library Function | Description | Layer
svd | Singular Value Decomposition using the Jacobi method | L1
mcSimulation | Monte-Carlo framework implementation | L1
pentadiagCr | Solver for pentadiagonal systems of equations using PCR | L1
boxMullerTransform | Box-Muller transform from uniform random number to normal random number | L1
inverseCumulativeNormalPPND7 | Inverse cumulative transform from random number to normal random number | L1
inverseCumulativeNormalAcklam | Inverse cumulative normal using Acklam's approximation to transform uniform random number to normal random number | L1
trsvCore | Solver for tridiagonal systems of equations using PCR | L1
PCA | Principal Component Analysis library implementation | L1
bernoulliPMF | Probability mass function for Bernoulli distribution | L1
bernoulliCDF | Cumulative distribution function for Bernoulli distribution | L1
covCoreMatrix | Calculate the covariance of the input matrix | L1
covCoreStrm | Calculate the covariance of the input matrix | L1
covReHardThreshold | Hard-thresholding covariance regularization | L1
covReSoftThreshold | Soft-thresholding covariance regularization | L1
covReBand | Banding covariance regularization | L1
covReTaper | Tapering covariance regularization | L1
gammaCDF | Cumulative distribution function for gamma distribution | L1
linearImpl | 1D linear interpolation | L1
normalPDF | Probability density function for normal distribution | L1
normalCDF | Cumulative distribution function for normal distribution | L1
normalICDF | Inverse cumulative distribution function for normal distribution | L1
logNormalPDF | Probability density function for log-normal distribution | L1
logNormalCDF | Cumulative distribution function for log-normal distribution | L1
logNormalICDF | Inverse cumulative distribution function for log-normal distribution | L1
poissonPMF | Probability mass function for Poisson distribution | L1
poissonCDF | Cumulative distribution function for Poisson distribution | L1
poissonICDF | Inverse cumulative distribution function for Poisson distribution | L1
binomialTreeEngine | Binomial tree engine using CRR | L2
cfBSMEngine | Single option price plus associated Greeks | L2
FdDouglas | Top level callable function to perform the Douglas ADI method | L2
hcfEngine | Engine for Heston closed-form solution | L2
M76Engine | Engine for the Merton Jump Diffusion Model | L2
MCEuropeanEngine | Monte-Carlo simulation of European-style options | L2
MCEuropeanPriBypassEngine | Path pricer bypass variant | L2
MCEuropeanHestonEngine | Monte-Carlo simulation of European-style options using Heston model | L2
MCmultiAssetEuropeanHestonEngine | Monte-Carlo simulation of European-style options for multiple underlying assets | L2
MCAmericanEnginePreSamples | PreSample kernel: this kernel samples a number of paths and stores them to external memory | L2
MCAmericanEngineCalibrate | Calibrate kernel: this kernel reads the sample price data from external memory and uses it to calculate the coefficient | L2
MCAmericanEnginePricing | Pricing kernel | L2
MCAmericanEngine | Calibration process and pricing process all in one kernel | L2
MCAsianGeometricAPEngine | Asian Average Price Engine using Monte Carlo method based on Black-Scholes model: geometric average version | L2
MCAsianArithmeticAPEngine | Asian Average Price Engine using Monte Carlo method based on Black-Scholes model: arithmetic average version | L2
MCAsianArithmeticASEngine | Asian Average Strike Engine using Monte Carlo method based on Black-Scholes model: arithmetic average version | L2
MCBarrierNoBiasEngine | Barrier Option Pricing Engine using Monte Carlo simulation | L2
MCBarrierEngine | Barrier Option Pricing Engine using Monte Carlo simulation | L2
MCCliquetEngine | Cliquet Option Pricing Engine using Monte Carlo simulation | L2
MCDigitalEngine | Digital Option Pricing Engine using Monte Carlo simulation | L2
MCEuropeanHestonGreeksEngine | European Option Greeks Calculating Engine using Monte Carlo method based on Heston valuation model | L2
MCHullWhiteCapFloorEngine | Cap/Floor Pricing Engine using Monte Carlo simulation | L2
McmcCore | Uses multiple Markov chains to allow drawing samples from multi-mode target distribution functions | L2
treeSwaptionEngine | Tree swaption pricing engine using trinomial tree based on 1D lattice method | L2
treeSwapEngine | Tree swap pricing engine using trinomial tree based on 1D lattice method | L2
treeCapFloorEngine | Tree cap/floor engine using trinomial tree based on 1D lattice method | L2
treeCallableEngine | Tree callable fixed rate bond pricing engine using trinomial tree based on 1D lattice method | L2
hjmEngine | Full implementation of Heath-Jarrow-Morton framework pricing engine with Monte Carlo | L2
hjmMcEngine | Monte Carlo only implementation of Heath-Jarrow-Morton framework pricing engine | L2
hjmPcaEngine | PCA only implementation of Heath-Jarrow-Morton framework | L2
lmmEngine | LIBOR Market Model (BGM) framework implementation | L2

Shell Environment

Set up the build environment using the Vitis and XRT scripts, and set PLATFORM_REPO_PATHS to the installation folder of the platform files.

source <install path>/Vitis/2019.2/settings64.sh
source /opt/xilinx/xrt/setup.sh
export PLATFORM_REPO_PATHS=/opt/xilinx/platforms
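
Before launching any of the flows, it can help to confirm that the environment was actually set up. The following is a minimal sketch and not part of the library: check_build_env is a hypothetical helper name, it treats PLATFORM_REPO_PATHS as a single directory as in the example above, and the choice to look for v++ (the Vitis compiler) and xbutil (the XRT utility) on PATH is an assumption about what the two setup scripts provide.

```shell
# Hypothetical helper (not part of the library): report anything that looks
# missing from the build environment. Prints one line per problem and returns
# the number of problems found (0 means nothing was flagged).
check_build_env() {
    local problems=0
    local tool

    # PLATFORM_REPO_PATHS must be set and, in this sketch, name one directory.
    if [ -z "${PLATFORM_REPO_PATHS:-}" ]; then
        echo "PLATFORM_REPO_PATHS is not set"
        problems=$((problems + 1))
    elif [ ! -d "$PLATFORM_REPO_PATHS" ]; then
        echo "PLATFORM_REPO_PATHS is not a directory: $PLATFORM_REPO_PATHS"
        problems=$((problems + 1))
    fi

    # Assumption: settings64.sh puts v++ on PATH and the XRT setup.sh
    # puts xbutil on PATH.
    for tool in v++ xbutil; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found on PATH"
            problems=$((problems + 1))
        fi
    done

    return "$problems"
}
```

After sourcing the two scripts and exporting the variable, calling check_build_env prints nothing and returns 0 if none of these checks fail.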

Design Flows

Recommended design flows are categorized by the target level:

  • L1
  • L2
  • L3

The common tool and library prerequisites that apply across all design flows are documented in the requirements section above.

L1

L1 provides the low-level primitives used to build kernels.

The recommended flow to evaluate and test L1 components uses the Vivado HLS tool, as follows. A top-level C/C++ testbench (typically main.cpp or tb.cpp) prepares the input data, passes it to the design under test (typically dut.cpp, which makes the L1 library calls), then performs any post-processing and validation checks on the output data.

A Makefile is used to drive this flow with available steps including CSIM (high level simulation), CSYNTH (high level synthesis to RTL), COSIM (cosimulation between software testbench and generated RTL), VIVADO_SYN (synthesis by Vivado), and VIVADO_IMPL (implementation by Vivado). The flow is launched from the shell by calling make with variables set as in the example below:

# enter the specific unit test project
cd L1/tests/specific_algorithm/
# Only run C++ simulation on U250
make run CSIM=1 CSYNTH=0 COSIM=0 VIVADO_SYN=0 VIVADO_IMPL=0 DEVICE=u250_xdma_201830_1
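
Typing all five step variables for each run is error-prone, so a small shell function can generate them. This is a sketch only: l1_step_flags is a hypothetical helper, not something the library provides. It sets each named step to 1 and every other step to 0, using the variable names listed above. Whether a later step can run without the earlier ones is decided by the Makefile and is not assumed here.

```shell
# Hypothetical helper: print the make variables for the L1 flow, enabling
# only the steps named as arguments and disabling all the others.
l1_step_flags() {
    local flags=""
    local step value requested

    for step in CSIM CSYNTH COSIM VIVADO_SYN VIVADO_IMPL; do
        value=0
        for requested in "$@"; do
            if [ "$requested" = "$step" ]; then
                value=1
            fi
        done
        flags="$flags $step=$value"
    done

    # Strip the leading space before printing.
    echo "${flags# }"
}
```

For example, `make run $(l1_step_flags CSIM CSYNTH) DEVICE=u250_xdma_201830_1` would request C simulation and high-level synthesis only.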

In addition to verifying functional correctness, the reports generated by this flow indicate logic utilization, timing performance, latency, and throughput. The output files of interest are found at locations like the examples below; the file names correspond to the source code, i.e. to the callable functions within the design under test:

Simulation Log: <library_root>/L1/tests/bk_model/prj/solution1/csim/report/dut_csim.log
Synthesis Report: <library_root>/L1/tests/bk_model/prj/solution1/syn/report/dut_csynth.rpt

L2

L2 provides the pricing engine APIs presented as kernels.

The L2 flow is built around the Vitis tool. It generates and packages pricing engine kernels along with the host application required for configuration and control. In addition to FPGA platform targets, emulation options are available for preliminary investigation or when dedicated access to a hardware platform is not available. There are two emulation options: software emulation performs a high-level simulation of the pricing engine, while hardware emulation performs a cycle-accurate simulation of the RTL generated for the kernel. The flow is makefile driven from the console, with the target selected by a command-line parameter as in the examples below:

cd L2/tests/GarmanKohlhagenEngine

# build and run one of the following using U250 platform

#  * software emulation
make run TARGET=sw_emu DEVICE=u250_xdma_201830_1
#  * hardware emulation
make run TARGET=hw_emu DEVICE=u250_xdma_201830_1
#  * actual deployment on physical platform
make run TARGET=hw DEVICE=u250_xdma_201830_1

# delete all xclbin and host binary
make cleanall
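
Because software emulation is a high-level simulation and hardware emulation simulates the generated RTL, one common pattern is to run the cheaper target first and stop as soon as one fails. The wrapper below is a sketch of that pattern, not part of the library: run_emulation_targets and the DRY_RUN variable are hypothetical names introduced for illustration, and the make invocation is the one shown above.

```shell
# Hypothetical wrapper: run the two emulation targets in order (sw_emu, then
# hw_emu) for the given device and stop at the first failure.
# Set DRY_RUN=1 to print the commands instead of executing them.
run_emulation_targets() {
    local device="$1"
    local target

    for target in sw_emu hw_emu; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "make run TARGET=$target DEVICE=$device"
        else
            make run TARGET="$target" DEVICE="$device" || return 1
        fi
    done
}
```

It would be called from inside the test directory, for example `run_emulation_targets u250_xdma_201830_1`.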

The outputs of this flow are packaged kernel binaries (xclbin files) that can be downloaded to the FPGA platform, and host executables that configure the kernels and coordinate data transfers. The output files of interest are found at locations like the examples below; the file names correspond to the source code:

Host Executable: L2/tests/GarmanKohlhagenEngine/bin_#DEVICE/gk_test.exe
Kernel Packaged Binary: L2/tests/GarmanKohlhagenEngine/xclbin_#DEVICE_#TARGET/gk_kernel.xclbin #ARGS

This flow can be used to verify functional correctness in hardware and enable real world performance to be measured.

L3

L3 provides the high level software APIs to deploy and run pricing engine kernels whilst abstracting the low level details of data transfer, kernel related resources configuration, and task scheduling.

The flow for L3 is the only one where access to an FPGA platform is required.

A prerequisite of this flow is that the packaged pricing engine kernel binaries (xclbin files) for the target FPGA platform have been made available for download or have been custom built using the L2 flow described above.

This flow is makefile driven from the console and first generates a shared object (L3/src/output/libxilinxfintech.so):

cd L3/src
source env.sh
make

The shared object file is written to the location shown below:

Library: L3/src/output/libxilinxfintech.so

User applications can subsequently be built against this library, as in the example provided:

cd L3/examples/MonteCarlo
make all
cd output

# manual step to copy or create symlinks to xclbin files in current directory

./mc_example
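
The manual step of making the xclbin files visible in the current directory can be scripted. The function below is a minimal sketch under stated assumptions: link_xclbins is a hypothetical helper, and the directory passed to it is wherever the xclbin files were downloaded or built, which this document does not fix.

```shell
# Hypothetical helper: symlink every .xclbin file found in the given source
# directory into the current directory. Prints the number of links created.
link_xclbins() {
    local src_dir="$1"
    local count=0
    local f

    for f in "$src_dir"/*.xclbin; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -e "$f" ] || continue
        # -f replaces an existing link of the same name.
        ln -sf "$f" "./$(basename "$f")"
        count=$((count + 1))
    done

    echo "$count"
}
```

Run from the output directory before starting the example, for instance `link_xclbins /path/to/xclbins` (a placeholder path). Passing an absolute path keeps the links valid if the output directory is later moved.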

Benchmark Result