# Sobol Sequence Generator¶

## Overview¶

Sobol sequence generator (SSG) is one of the critical utilities needed by Monte-Carlo Simulation. The SSG utility can generate the sequence with a quasi-random distribution. The Sobol sequence is one of the most popular quasi-random sequence for its simplicity and efficiency in implementation. Specifically, sobol sequence in base b=2, it can be implemented efficiently using bit-vector operations to compute the radical inverse and there is only one point in the field [0,1) .

## Algorithm¶

Low-Discrepancy (LD) Sequence is an indicator for how evenly a random sequence is distributed in sample space. The definition of discrepancy,

$D_{N}(P) = \underset{B \in J} {sup} \lvert \frac {A(B)}{N} - \lambda _{s}(B) \rvert$

$\Phi _{b,C}(i) = (b^{-1} \cdots b^{-M}) \begin{bmatrix} C( a_{0}(i) \cdots a_{M-1}(i))^{T} \end{bmatrix}$

where $$i =\sum_{l=0}^{M-1} a_{l}(i)b^{l}$$, $$b$$ is a non-negative integer. For each dimension of sobol sequence, it is comprised by a $$b=2$$ radical inversion, which have a individual matrix C inside.

## Gray Code Implementation¶

Here, we give a brief outline of the implementation. With a pre-calculated matrix C, where the matrix-vector multiplications are performed in the finite field $$Z_{b}$$ see sobol reference. To generate the j-th component of the points in a Sobol sequence, we need to choose a primitive polynomial of some degree $$S_{j}$$ in the field $$Z$$, where $$Z_{2}={0,1}$$, the equation is shown as follow:

$x^{S_{j}} + a_{1,j} x^{S_{j}-1} + a_{2,j} x^{S_{j}-2} + \cdots + a_{S_{j}-1,j} x^{1} + 1$

where the coefficients $$a_{1,j} , a_{2,j} , \cdots , a_{S_{j}-1,j}$$ are either 0 or 1. We define a sequence of positive integers $$\{ m_{1,j}, m_{2,j} , \cdots \}$$ by the recurrence relation:

$m_{k,j} := 2a_{1,j} m_{k-1,j} \oplus 2^{2} a_{2,j} m_{k-2,j} \oplus \cdots \oplus \ 2^{ S_{j}-1 } a_{S_{j}-1,j} m_{ k-S_{j} + 1, j} \oplus 2^{S_j} m_{k-S_j,j} \oplus m_{k-S_j,j}$

where $$\oplus$$ is the bit-by-bit exclusive-or operator. The initial values $$m_{1,j} , m_{2,j} , \cdots , m_{ S_{j},j}$$ can be chosen freely provided that each $$m_{k,j} , 1 \le k \le S_{j}$$ , is odd and less than $$2^{k}$$ . The so-called direction numbers $$\{ \nu_{1,j}, \nu_{2,j}, \cdots \}$$ are defined by:

$\nu_{1,j} :=\frac { m_{k,j} } { 2^{k} }$

Then the $$x_{i,j}$$ element of the j-th dimension of the i-th point in a Sobol sequence, can be generated by:

$x_{i,j} := i_{1} \nu_{1,j} \oplus i_{2} \nu_{2,j} \oplus \cdots \nu_{n,j}$

where $$i_{k}$$ is the $$k$$-th digit from the right-hand side when $$i$$ is written in binary $$i = ( \cdots i_{3} i_{2} i_{1} )_{2}$$. Here and elsewhere in this article, we use the notation $$()_{2}$$ to denote the binary representation of numbers. The formula above corresponds to the original implementation of Sobol. The (binary-reflected) Gray code of an integer $$i$$ is defined as:

$gray(i) := i \oplus \lfloor \frac {i}{2} \rfloor =(\cdots i_3 i_2 i_1)_2 \oplus ( \cdots i_{4} i_{3} i_{2})_{2}$

It has the property that the binary representations of $$gray(i)$$ and $$gray(i-1)$$ differs in only one position, namely, the index of the first 0 digit from the right in the binary representation of $$i-1$$.

We generate the points of Sobol using

$\bar{x}_{i,j} := g_{i,1} \nu_{1,j} \oplus g_{i,2} \nu_{2,j} \oplus \cdots$

where $$g_{i,k}$$ is the k-th digit from the right of the Gray code of $$i$$ in binary, i.e., $$gray(i) = ( \cdots g_{3} g_{2} g_{1})_{2}$$. Equivalently, since $$gray(i)$$ and $$gray(i-1)$$ differ in one known position, we can generate the points recursively using:

$\bar{x}_{0,j} := 0 , \bar{x}_{i,j} := \bar{x}_{i-1,j} \oplus \bar{x}_{c_{i-1},j}$

where $$c_{i}$$ is the index of the first 0 digit from the right in the binary representation of $$i = ( \cdots i_{3} i_{2} i_{1})_{2}$$. We have $$c_{0}=1, c_{1}=2, c_{2}=1, c_{3}=3, c_{4}=1, c_{5}=2$$, etc.

With the implementation of Gray Code, we simply obtain the points in a different order, which still preserve their uniformity properties. It is because every block of $$2^{m}$$ points for $$m=0,1, \cdots$$ is the same as the original implementation. It is known that the two equations closely mentioned above can generate exactly the same sequence, and the former one allows one to start from any position in the sequence, while the later one is recursive and more computationally efficient. The reference is equation reference.

### Sobol Workflow:¶

The initialization function will set $$addr=0$$ and read parameter list to initialize polynomial $$m_{ S_{j},j}$$. The parameter reference file new-joe-kuo-6.21201 is given by sobol reference mentioned above. The parameter field is shown as below: