Internal Design of Asian Option Pricing Engine¶
Overview¶
The Asian option pricing engine uses Monte Carlo Simulation to estimate the value of the Asian option. Here, we assume the process of asset pricing applies to BlackScholes process.
Asian Option is kind of exotic option. The payoff is path dependent and it is dependent on the average price of underlying asset over the settled period of time \(T\).
The payoff of Asian options is determined by the arithmetic or geometric average underlying price over some preset period of time. This is different from the case of usual European Option and American Option, where the payoff of the option depends on the price of the underlying at exercise. One advantage of Asian option is the relative cost of Asian option compared to American options. Because of the averaging feature, Asian options are typically cheaper than American options.
The average price of underlying asset could be used as strike price or the underlying settlement price at the expiry time. When the average price of underlying asset is used as the underlying settlement price at the expiry time, the payoff is calculated as follows:
payoff of put option = \(\max(0, K  A(T))\)
payoff of call option = \(\max(0, A(T)  K)\)
When the average price of underlying asset is used as the strike price at the expiry time, the payoff is calculated as follows:
payoff of put option = \(\max(0, A(T)  S_T)\)
payoff of call option = \(\max(0, S_T  A(T))\)
Where \(T\) is the time of maturity, \(A(T)\) is the average price of asset during time \(T\), \(K\) is the fixed strike, \(S_T\) is the price of underlying asset at maturity.
The average could be arithmetic or geometric, which is configurable. \(N\) is the number of discrete steps from \(0\) to \(T\).
Arithmetic average: \(A(T) = \frac{\sum_{i=0}^N S_i}{N}\)
Geometric average: \(A(T) = \sqrt[n]{\prod_{i=0}^N S_i}\)
MCAsianAPEngine¶
The pricing process of Asian Arithmetic Pricing engine is as follows:
 Generate independent stock paths by using Mersenne Twister Uniform MT19937 Random Number Generator (RNG) followed by Inverse Cumulative Normal Uniform Random Numbers.
 Start at \(t = 0\) and calculate the stock price of each path firstly in order to achieve initiation interval (II) = 1.
 Calculate arithmetic average and geometric average value of each path.
 Calculate the payoff difference of arithmetic average price and geometric average price.
 Calculate the payoff off of geometric average price \(Payoff_{ref}\) based on analytical method.
 Payoff of average pricing is \(Payoff = Payoff_{ref} + Payoff_{gap}\)
The pricing architecture on FPGA can be shown as the following figure:
MCAsianASEngine¶
The pricing process of Asian Arithmetic Strike engine is as follows:
 Generate independent stock paths by using Mersenne Twister Uniform MT19937 Random Number Generator (RNG) followed by Inverse Cumulative Normal Uniform Random Numbers.
 Start at \(t = 0\) and calculate the stock price of each path firstly in order to achieve II = 1.
 Accumulate the sum of lognormal stock price using the following analytical solution.
where \(M\) is the total timesteps, \(dt\) is the time interval 4. Calculate the final payoff by taking the strike price \(Strike_t\) as the previous arithmetic average price \(\frac{\sum_{j=0}^i S_j}{M}\).
The pricing architecture on FPGA can be shown as the following figure:
MCAsianGPEngine¶
The pricing process of Asian Geometric Pricing engine is as follows:
 Generate independent stock paths by using Mersenne Twister Uniform MT19937 Random Number Generator (RNG) followed by Inverse Cumulative Normal Uniform Random Numbers.
 Start at \(t = 0\) and calculate the stock price of each path firstly in order to achieve II = 1.
 Transfer the geometric average of stock price to sum of lognormal stock price.
 Accumulate the sum of lognormal stock price using the following analytical solution.
 Calculate the final payoff by using a fixed strike price.
The pricing architecture on FPGA can be shown as the following figure:
Note
The 3 figures above shows the pricing part of McAsianAPEngine, McAsianASEngine and McAsianGPEngine respectively; the other parts, for example, PathGenerator, MCSimulation and other modules, are the same as in MCEuropeanEngine.
Profiling¶
The hardware resources are listed in Table 4. The Arithmetic and Geometric Asian Engines demand similar amount of resources.
Engines  BRAM  DSP  Register  LUT  Latency  clock period(ns) 
McAsianArithmeticAPEngine  12  59  24664  26826  53276  3.423 
McAsianArithmeticASEngine  12  65  26683  29362  53196  3.423 
McAsianGeometricAPEngine  10  61  24626  26657  53222  3.423 
Table 5 shows the performance improvement in comparison with CPUbased Quantlib result (Tolerance = 0.02)
Engines  McAsianAPEngine  McAsianASEngine  McAsianGPEngine 
SampNum  25951  33642  46805 
CPU result  1.98441  3.10669  3.28924 
CPU Execution time (us)  224911  310856  601068 
FPGA result  1.89144  3.0866  3.26228 
FPGA Kernel Execution time (us)  563.05  734.42  830.130 
FPGA SampNum  28672  34816  49152 
FPGA E2E Execution time (ms)  1  1  1 
Number of MCM  2  2  4 
Table 6 shows the max performance of McAsianEngines in one SLR of Xilinx xcu250figd21042Le (Vivado report).
Engines  BRAM  DSP  Register  LUT  Latency 

Max Unroll Num 
McAsianArithmeticAPEngine  192  749  246325  205626  54056  300  16 
McAsianArithmeticASEngine  192  755  247524  207476  54109  300  16 
McAsianGeometricAPEngine  160  781  242927  200106  54002  300  16 