
2.20. Random Number Generation, Seed Values, and Reproducibility

Applies to:
@RISK for Excel 4 and newer
@RISK for Project 4.x
@RISK Developer's Kit 4.x

Tell me more about the algorithm that generates random numbers in @RISK. What is the difference between a fixed seed and a random seed? How does this work when executing a multiple simulation run? Why might my model not be reproducible even though I am using a fixed seed?

Generation Algorithm:

The random number generator used in @RISK is a portable generator based on a subtractive method, not a linear congruential one. Its cycle length is long enough that, in our testing, it has had no effect on our simulations. Press et al. (see References, below) say that the period is effectively infinite. The starting seed (if not set manually) is clock dependent, not machine dependent. The method used to generate the random variables for all distributions is inverse transform, but the exact algorithms are proprietary.
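As a rough illustration of the inverse transform idea only, the sketch below draws exponential values by pushing uniform random numbers through the inverse of the cumulative distribution function. It is not @RISK's code: the exact algorithms are proprietary, Python's random module stands in for the uniform generator (it is not a subtractive generator), and the function name sample_exponential is hypothetical.

```python
import math
import random


def sample_exponential(rng, rate):
    """Draw one exponential variate by inverting its CDF.

    For F(x) = 1 - exp(-rate * x), the inverse is F^-1(u) = -ln(1 - u) / rate.
    """
    u = rng.random()                      # uniform on [0, 1)
    return -math.log(1.0 - u) / rate      # 1 - u is in (0, 1], so log is defined


# A fixed seed gives the same sequence of uniforms, hence the same samples.
rng = random.Random(12345)
samples = [sample_exponential(rng, rate=2.0) for _ in range(5)]

# Re-seeding with the same value reproduces the samples exactly.
rng_again = random.Random(12345)
samples_again = [sample_exponential(rng_again, rate=2.0) for _ in range(5)]
```

Because each sample is a deterministic function of one uniform number, reproducing the uniform stream is enough to reproduce every sampled value.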

Seed Values:

In the @RISK Simulation Settings dialog box, you can set the random number seed. You can have the seed chosen randomly by activating the Choose Randomly option, or you can specify a fixed seed by activating the Fixed option and then entering a seed value that is an integer between 1 and 2147483647. If the Fixed option is chosen, the results of your simulation will be the same each time it is run (unless you have changed your model or added some random factor out of @RISK's control). If the Choose Randomly option is active, a random seed is chosen based on the computer's clock.

Why choose a fixed seed? There are two main reasons. When you are developing your model, or making changes to an existing model, a fixed random number seed lets you see clearly how any changes in your model affected the results. With a finished model, you can send the model to someone else and know that if they run a simulation they will get the same results you got. (Both of these statements assume that you're using the same release of @RISK on the identical model and that nothing in the model is volatile; see Reproducibility, below.)

You can also use a RiskSeed() property function on an input distribution to give that distribution its own sequence of random numbers, independent of the seed used for the overall simulation. (RiskSeed() is ignored when used with correlated distributions.)

Multiple @RISK Simulation Runs:

• If the Multiple Simulations Use Different Seed Values box is checked, and the Choose Randomly option is active, @RISK will use a different seed for each simulation in a multiple simulation run.
• If the Multiple Simulations Use Different Seed Values box is checked, and the Fixed option is active, each simulation in a multiple simulation run will use a different seed, but the same sequence of seed values will be used each time the run is executed.
• If the Multiple Simulations Use Different Seed Values box is not checked, and the Choose Randomly option is active, each simulation within a multiple simulation run will use the same seed, but a different seed will be used for each run.
• If the Multiple Simulations Use Different Seed Values box is not checked, and the Fixed option is active, the same seed will be used both within and between multiple simulation runs.
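The four combinations above can be summarized in a small sketch. This is an illustration under stated assumptions, not a description of @RISK's internals: the function name seed_plan is hypothetical, and the rule used here to derive a different seed for each simulation (base seed plus the simulation index) is invented purely for the example, since the article does not say how @RISK derives them.

```python
import time

MAX_SEED = 2147483647  # upper limit for a fixed seed, per Simulation Settings


def seed_plan(num_sims, different_seeds, fixed_seed=None):
    """Return the seeds one multiple-simulation run would use.

    fixed_seed=None models the Choose Randomly option (clock-based seed).
    The "base + index" derivation below is a hypothetical placeholder.
    """
    if fixed_seed is None:
        base = time.time_ns() % MAX_SEED + 1      # clock dependent, in 1..MAX_SEED
    else:
        base = fixed_seed
    if different_seeds:
        # Different seed per simulation, wrapped to stay within 1..MAX_SEED.
        return [(base - 1 + i) % MAX_SEED + 1 for i in range(num_sims)]
    # Same seed for every simulation in the run.
    return [base] * num_sims


seed_plan(3, True, fixed_seed=42)    # [42, 43, 44] on every run
seed_plan(3, False, fixed_seed=42)   # [42, 42, 42] on every run
seed_plan(3, True)                   # three different seeds, new values each run
seed_plan(3, False)                  # one seed repeated, new value each run
```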

@RISK Monte Carlo vs. Latin Hypercube:

The sampling done to generate random numbers during a simulation in @RISK may be Monte Carlo, or it may be Latin Hypercube, depending on which Sampling Type is chosen in the @RISK Simulation Settings dialog. See Latin Hypercube Versus Monte Carlo Sampling or the @RISK manual for more details.
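For readers unfamiliar with the two sampling types, here is a minimal textbook sketch of the difference for a single input. It is not @RISK's implementation; the function names are hypothetical and Python's random module stands in for the generator. Monte Carlo draws each uniform number independently, while Latin Hypercube divides the interval from 0 to 1 into as many equal strata as there are iterations and draws exactly one value from each stratum, in shuffled order.

```python
import random


def monte_carlo_uniforms(rng, n):
    """n independent uniform draws; some regions may be over- or under-sampled."""
    return [rng.random() for _ in range(n)]


def latin_hypercube_uniforms(rng, n):
    """One uniform draw from each of n equal-width strata, in random order."""
    points = [(k + rng.random()) / n for k in range(n)]  # stratum k is [k/n, (k+1)/n)
    rng.shuffle(points)                                  # decouple order from stratum
    return points


rng = random.Random(2019)
mc = monte_carlo_uniforms(rng, 10)
lhs = latin_hypercube_uniforms(rng, 10)
```

Either list of uniform numbers would then be passed through the inverse transform of the input's distribution to obtain the sampled values.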

Number of Iterations:

If you change the number of iterations, you have a different model even if nothing else has changed. The overall results will be similar (within normal statistical variation) but not identical. Even the data drawn during the initial iterations may not be the same. For example, if you have a 100-iteration model and increase the number of iterations to 500, the distributions in the new model may sample different values in the first 100 iterations than they had in the 100 iterations of the old model.

If you have a RiskSeed() property function in any distributions, those will preserve the same sequence. For example, if you have a 100-iteration model and increase the number of iterations to 500, the distributions with their own RiskSeed() functions will show the same data for the first 100 iterations as they did for the 100 iterations of the original simulation. (RiskSeed() is ignored when used with correlated distributions.)
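The behavior of an input with its own seed can be pictured with the following sketch. It is an analogy only, with hypothetical function names: dedicated_stream models an input that owns a private sequential stream, so its first 100 values do not depend on the iteration count. For contrast, stratified_stream shows one way early values can depend on the iteration count, because the strata are defined by the number of iterations; whether that is the mechanism at work in @RISK is an assumption, not something this article states.

```python
import random


def dedicated_stream(seed, n):
    """Sequential draws from a private stream: value i does not depend on n."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def stratified_stream(seed, n):
    """One draw per stratum of width 1/n, shuffled: every value depends on n."""
    rng = random.Random(seed)
    points = [(k + rng.random()) / n for k in range(n)]
    rng.shuffle(points)
    return points


short_run = dedicated_stream(2024, 100)
long_run = dedicated_stream(2024, 500)
same_prefix = long_run[:100] == short_run           # first 100 values preserved

strat_short = stratified_stream(2024, 100)
strat_long = stratified_stream(2024, 500)
strat_same_prefix = strat_long[:100] == strat_short  # first 100 values differ
```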

Reproducibility:

The results of a simulation are reproducible from run to run if you use a fixed seed value, if your model has not been changed between runs, and if you avoid the following pitfalls:

• The Excel function =RAND(). The numbers generated by this function are controlled by Excel, which uses its own independent random number stream that is not governed by the @RISK seed. Consider replacing RAND() functions with RiskUniform() or RiskBernoulli().
• Other volatile Excel functions like NOW() or TODAY().
• Macros that run during the simulation, if the macro code itself is not reproducible from run to run.
• Adding or removing worksheets or opening additional workbooks, even if they don't contain @RISK functions. Results of the new simulation may not be identical because @RISK's order of scanning may be affected. The same applies if you move things around within a worksheet or within a workbook, even if the cells that you moved don't contain @RISK functions.
• References between iterations, when you have Multiple CPU enabled; please see the next section.

Any @RISK inputs that have RiskSeed() property functions will be reproducible, even if the model is changed. Exception: RiskSeed() has no effect in correlated distributions.
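The effect of a random source outside the seed's control, such as RAND(), can be shown with a small analogy. This is a sketch in Python, not @RISK or Excel code: random.Random(seed) stands in for the seeded simulation stream, random.SystemRandom() stands in for a source that ignores the seed, and run_model is a hypothetical name.

```python
import random


def run_model(seed, use_uncontrolled_source, iterations=1000):
    """Return the mean output of a toy model with two random inputs."""
    rng = random.Random(seed)          # governed by the fixed seed
    outside = random.SystemRandom()    # not governed by any seed (like RAND())
    total = 0.0
    for _ in range(iterations):
        demand = rng.random()
        if use_uncontrolled_source:
            noise = outside.random()   # differs on every run
        else:
            noise = rng.random()       # reproducible
        total += demand + noise
    return total / iterations


reproducible = run_model(7, False) == run_model(7, False)       # True
not_reproducible = run_model(7, True) != run_model(7, True)     # True in practice
```

With the uncontrolled source the two runs still agree within normal statistical variation, but they are not identical, which is the same symptom a fixed-seed @RISK model shows when it contains volatile functions.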

Single or Multiple CPU:

Assuming the model is otherwise reproducible, results should be identical whether the simulation runs with multiple CPU enabled or disabled.

There's an important exception. When you're running multiple CPUs, the master CPU parcels out iterations to one or more worker CPUs. During a simulation, one CPU doesn't know the data that were developed by another CPU. So if you have anything in your model that refers to another iteration, directly or indirectly, a simulation with multiple CPUs will not behave as expected. (It won't just be irreproducible; it will be wrong.) Examples would be RiskData() functions that are used in formulas, statistics functions like RiskMean() and RiskPercentile() that are used in formulas if you have them set to be computed at every iteration, and macro code that stores data in the workbook or in static variables. In such cases, it is necessary to disable multiple CPU in Simulation Settings.
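The following sketch illustrates why a reference to other iterations goes wrong when iterations are split among workers. It simulates the split in a single process and is not @RISK code: the running mean stands in for a statistic computed at every iteration, and dealing iterations out alternately to two workers is a hypothetical scheme chosen only to keep the example short.

```python
import random


def running_means(values):
    """Running mean after each iteration, using only the values this worker has seen."""
    out, total = [], 0.0
    for count, value in enumerate(values, start=1):
        total += value
        out.append(total / count)
    return out


rng = random.Random(99)
draws = [rng.random() for _ in range(8)]   # identical sampled inputs in both cases

# Single CPU: each iteration's statistic sees every earlier iteration.
serial = running_means(draws)

# Two workers: each one sees only the iterations it was handed.
worker_a = running_means(draws[0::2])      # iterations 1, 3, 5, 7
worker_b = running_means(draws[1::2])      # iterations 2, 4, 6, 8

# Reassemble the per-iteration statistics in iteration order.
parallel = [None] * len(draws)
parallel[0::2] = worker_a
parallel[1::2] = worker_b

mismatch = serial != parallel              # True: the split run computes wrong values
```

The sampled inputs are the same in both cases; only the formula that looks across iterations is affected, which is why such models need multiple CPU disabled.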

Versions of @RISK:

Results from a given release of @RISK Standard, Professional, and Industrial should be the same, assuming the model is otherwise reproducible. Trial version versus activated version makes no difference.

Results from different versions of @RISK on the same model will typically match within normal statistical variation, if you use the same random number generator. For the relationship between @RISK 4.x and 5.x random number generation, please see Random Number Generators.

References:

• Donald E. Knuth: Seminumerical Algorithms: Third Edition (1998, Addison-Wesley), vol. 2 of The Art of Computer Programming.
• William H. Press, Brian P. Flannery, Saul A. Teukolsky, William T. Vetterling: Numerical Recipes, The Art of Scientific Computing (1986, Cambridge University Press), pages 198 and 199.

Last edited: 2019-02-15