26

I'm implementing a Monte Carlo simulation in 3 variables in Excel. I've used the RAND() function to sample from Weibull distributions (with long tails). The functions applied to the samples are non-linear but smooth (exp, ln, cos, etc). The result for each sample is a pass/fail, and the overall result is a probability of failure.

I have also implemented this by both numerical integration and Monte Carlo in MathCad, getting the same result both times. MathCad uses (I think) a Mersenne Twister random number generator.

My excel spreadsheet is getting consistently different results (ie always larger). I have checked the equations are the same.

What random number generator does Excel use, and how good is it? Is it possible that this is the source of my problem? I have assumed the Excel implementations of exp, cos etc are ok.

Finally, is there a way to implement Monte Carlo to mitigate against the (known) poor properties of a particular random number generator? (I've heard of Markov chains, random walks etc, but don't really know much about them)

Many thanks.

ashleedawg
  • 20,365
  • 9
  • 72
  • 105
Aaron Lockey
  • 830
  • 1
  • 8
  • 15
  • 2
    The first Google hit: http://support.microsoft.com/kb/828795. Was this clear enough or did you need more? If so, what more do you need? – S.Lott May 04 '11 at 15:51
  • 1
    Doesn't anybody try a search engine before asking questions anymore? Heck, even Lycos gets it. Remember Lycos? http://search.lycos.com/?tab=web&searchForm=true&query=excel+rand&x=0&y=0&diktfc=0710B42B6C7616945830147C3BA877A4A8ABF194B1A3 – Jean-François Corbett May 04 '11 at 20:43
  • P1: All random number generators are less-than-perfect (save perhaps [quantum number generators](http://www.technologyreview.com/blog/arxiv/25041/) discovered last year). P2: Monte Carlo has been implemented (also prior to 2010). Conclusion: There is a way to implement Monte Carlo with a less-than-perfect random number generator. – Jean-François Corbett May 04 '11 at 20:48
  • 1
    @jean Of course I googled it first, but found no answers to my question 'is RAND really good enough for Monte Carlo?' merely lots of examples. Just because it _can_ be done, doesn't make it right in all cases - as I said, my problem is pretty non-linear. – Aaron Lockey May 04 '11 at 23:49
  • About Monte Carlo I don't know, but it's terrible do create hash id's. I get a lot of trouble and found 26 repetitions in a 20 length alphanumeric hash code, in just 25k hash list. This is completely impossible if numbers where just a little bit random. Excel version 2007. – Murta Nov 12 '13 at 20:34
  • Also terrible for generating lottery numbers. I was using Excel 2016 on Win 10 and also OpenOffice Calc. Amazing the number of times you get things like 2/3 numbers being consecutive or in the same decade, e.g. 21 and 27. No wonder I seldom get even 1 number let alone 3 ! Gotta get somethin' better here . . . – Trunk Mar 26 '18 at 11:53
  • 1
    @Trunk True randomness often looks 'clumped', humans have a tendency to see patterns that aren't there. If you want to use a random number generator for lottery numbers, you can do no better than using the last week's draw. No less likely than any other combination... and generated using an excellent random number generator! – Aaron Lockey Jun 22 '18 at 10:13
  • Dunno what you mean by 'clumped'. Dunno if you actually made an Excel sheet to generate randomly selected numbers from a predefined set {1,2,3,.. max} and then hit F9 a few times and see for yourself what I mean by unlikely "random draws". I did. And I got a lot greater uniformity of relative frequency when I riffled & cut the array entries 10 times before randomly selecting my numbers using Java's Math.random(). – Trunk Jun 23 '18 at 21:10

5 Answers5

27

Since this is the top result in Google for "how good is Excel's RAND() function" it is worth updating the answers for later versions of Excel

This paper by Guy Melard "On the accuracy of statistical procedures in Microsoft Excel 2010" tested the RAND() function in Excel 2010 and found it to be substantially improved over 2007 or 2003. Microsoft switched from an incorrect Wichmann and Hill generator (2007/2003) to the Mersenne Twister algorithm which has a much, much greater cycle length.

The authors of that paper ran it through "Small Crush", "Crush" and "Big Crush" tests for randomness and it passed nearly all of the tests.

So while it certainly isn't the same as True random numbers, the RAND() function in Excel 2010, and presumably newer versions, can no longer be considered terrible.

It should be noted however, that Excel 2010 still uses two completely different algorithms for the VBA random number generator, and the RNG that is in the data analysis tool-kit. According to Melard, both of those are still terrible, and in fact the VBA uses the same seed number each each time so produces the same numbers.

My biggest complaints with the random numbers in Excel are

  • You can't set the seed, so the numbers are not reproducible
  • The random numbers update every time you press enter/delete, and even if you set calculation options to Manual, they still update when you save the Excel file
James
  • 65,548
  • 14
  • 155
  • 193
Fairly Nerdy
  • 863
  • 1
  • 8
  • 9
19

There is a journal paper on this topic by McCullough (2008): On the accuracy of statistical procedures in Microsoft Excel 2007 (Computational Statistics and Data Analysis)

Quoting the original article:

The random number generator has always been inadequate. With Excel 2003, Microsoft attempted to implement the Wichmann–Hill generator and failed to implement it correctly. The fixed version appears in Excel 2007 but this fix was done incorrectly. Microsoft has twice failed to implement correctly the dozen lines of code that constitute the Wichmann–Hill generator; this is something that any undergraduate computer science major should be able to do. The Excel random number generator does not fulfill the basic requirements for a random number generator to be used for scientific purposes:

  1. it is not known to pass standard randomness tests, e.g., L’Ecuyer and Simard’s (2007) CRUSH tests (these supersede Marsaglia’s (1996) DIEHARD tests—see Altman et al. (2004) for a comparison);
  2. it is not known to produce numbers that are approximately independent in a moderate number of dimensions;
  3. it has an unknown period length; and
  4. it is not reproducible.

For further discussion of these points, see the accompanying article by McCullough (2008); the performance of Excel 2007 in this area is inadequate.

rcs
  • 67,191
  • 22
  • 172
  • 153
1

Paul Wilmott, in his Quantitative Finance book, simply adds up the results of 12 calls to RAND() and subtracts 6 for a good approximation to a Normal variable. Quick n Dirty

James
  • 9,064
  • 3
  • 31
  • 49
-3

There are commercial products for this. Google turns up two before I got bored of looking

http://www.mathwave.com/articles/random-numbers-excel-worksheets.html

http://www.ozgrid.com/Services/excel-random-number-generator.htm

S.Lott
  • 384,516
  • 81
  • 508
  • 779
-4

RAND() is quite random, but for Monte Carlo simulations, may be a little too random (unless your doing primality testing). Most Monte Carlo simulations just require pseudo-random and deterministic sequences. As part of the Excel Analysis ToolPak RANDBETWEEN() may be all you need for pseudo-random sequences.

Todd Main
  • 28,951
  • 11
  • 82
  • 146