Download (direct link):
10.9 QUASI-RANDOM NUMBER SEQUENCES
Quasi-random sequences aim to reduce the number of samples that must be drawn at random from a given distribution, in order to achieve a certain statistical smoothness; in other
Example Add-ins and Financial Applications
words, to avoid clusters that bias the sample. This is particularly useful in Monte Carlo simulation (see section 10.11). A simulation using a sequence of pseudo-random numbers will involve as many trials as are needed to obtain the required degree of accuracy. The use of a predetermined set of quasi-random samples that cover the sample space more evenly, in some sense, reduces the number of trials while preserving the required statistical properties of the entire set.
In practice such sequences can be thought of simply as arrays of numbers of a given size, the size being predetermined by some analysis of the problem or by experiment. Any function or command that uses this information simply needs to read in the array. Where a command is the end-user of the sequence, you can deposit the array in a range of cells on a worksheet and access this, most sensibly, as a named range from the command’s code (whether it be C/C++ or VB). Alternatively, you can create the array in a persistent structure in the DLL (or VB module). There is little in the way of performance difference between these choices provided that the code executing the simulation reads the array from a worksheet, if that’s where it’s kept, once en bloc rather than making individual cell references.
There is some appeal to creating such sequences in a worksheet - it allows you to verify the statistical properties easily - the only drawback being if the sequence is so large that it risks the spreadsheet becoming unwieldy or stretches the available memory. Where the sequence is to be used by a DLL function, the same choice of worksheet range or DLL structure is there. Provided that the sequence is not so large as to cause problems, the appeal of being able to see and test the numbers is a powerful one.
If the sequence is to be stored in a persistent structure in the add-in, it is advisable to link its existence to the cell that created it, so that deletion of the cell’s contents, or of the cell itself, can be used as a trigger for freeing the resources used. This also enables the return value for the sequence to be passed as a parameter to a worksheet function. (See sections 9.6 Maintaining large data structures within the DLL on page 305 and 9.8 Keeping track of the calling cell of a DLL function on page 309.)
As far as the creation of sequences is concerned, the functions for this are well documented in a number of places. (Clewlow and Strickland). The creation of large sequences can be time-consuming. This may or may not be a problem for your application as, once created, sequences can be stored and reused. Such sequences are a possible candidate for storage in the worksheet using binary names. (See section 8.8 Working with binary names on page 209.) If creation time is a problem, C/C++ makes light work of the task, otherwise VB code might even be sufficient. (Remember that C/C++ with its powerful pointer capabilities, can access arrays much faster than VB.)
10.10 GENERATING CORRELATED RANDOM SAMPLES
When using Monte Carlo simulation (see next section) to model a system that depends on many partially related variables, it is often necessary to generate vectors of correlated random samples from a normal distribution. These are computed using the (real symmetric) covariance matrix of the correlated variables. Once the eigenvalues have been computed (see section 10.3 on page 351)5 they can be combined many times with many
5 Note that this relies on code from Numerical Recipes in C omitted from the CD ROM for licensing reasons
Excel Add-in Development in C/C++
sets of normal samples in order to generate the correlated samples. (See Clewlow and Strickland, Chapter 4.)
In practice, therefore, the process needs to be broken down into the following steps:
1. Obtain or create the covariance matrix.
2. Generate the eigenvalues and eigenvectors from the covariance matrix.
3. Generate a vector of uncorrelated normal samples.
4. Transform these into correlated normal samples using the eigenvalues and eigenvectors.
5. Perform the calculations associated with the Monte Carlo trial.
6. Repeat steps (3) to (5) until the simulation is complete.
The calculation of the correlated samples is essentially one of matrix multiplication. Excel does this fairly efficiently on the worksheet, with only a small overhead of conversion from worksheet range to array of doubles and back again. If the simulation is unacceptably slow, removing this overhead by storing eigenvalues and vectors within the DLL and calculating the correlated samples entirely within the DLL is one possible optimisation.
10.11 MONTE CARLO SIMULATION
Monte Carlo (MC) simulation is a numerical technique used to model complex randomly driven processes. The purpose of this section is to demonstrate ways in which such processes can be implemented in Excel, rather than to present a textbook guide to Monte Carlo techniques.6