
I am playing around with PRNGs (such as the Mersenne Twister and stdlib's rand()) and I want a good test that would help me ascertain the quality of the random data they produce. I have calculated the value of Pi using random numbers from each PRNG, and I find rand() and the Mersenne Twister too close to tell apart (do I need to scrutinize beyond 10 decimal places?).

I do not have much idea about Monte Carlo simulations; please let me know about some algorithm/application (possibly something simple that could still support good inferences) that would help me distinguish the generators in terms of quality.
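For concreteness, here is a minimal sketch of that kind of Pi estimate, assuming the usual quarter-circle sampling with stdlib's rand(); swapping in genrand_real1() from the MT19937 reference code tests the Mersenne Twister the same way:

    #include <stdio.h>
    #include <stdlib.h>

    /* Estimate Pi by sampling points in the unit square and counting
     * how many fall inside the quarter circle of radius 1. */
    double estimate_pi(long n)
    {
        long inside = 0;
        for (long i = 0; i < n; i++) {
            double x = rand() / (double)RAND_MAX;
            double y = rand() / (double)RAND_MAX;
            if (x * x + y * y <= 1.0)
                inside++;
        }
        return 4.0 * inside / n;
    }

    int main(void)
    {
        srand(12345);                       /* fixed seed for repeatability */
        printf("pi ~= %f\n", estimate_pi(10000000L));
        return 0;
    }

Note that the statistical error of this estimator shrinks only as about 1.6/sqrt(N), so even 10^8 samples pin Pi down to roughly four decimal places; differences at the 10th decimal place say nothing about the generators, which is exactly why dedicated test suites exist.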


EDIT 1: I didn't notice before, but there is a similar thread: How to test random numbers?

EDIT 2: I am not able to interpret the results of NIST STS, as mentioned in one of the comments. I got this idea of visually inspecting the pattern (if any) from random.org and am following it because of its simplicity. I would be very glad if someone could comment on my testing process:

  1. Generate N random numbers in [0,1] using rand() and MT19937
  2. if (round(genrand_real1() / rand_0_1())) then red pixel, else black pixel

I understand that this is not a very precise solution, but if it provides a reasonable estimate, then I can live with it for the moment.
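For what it's worth, here is a minimal sketch of that visual test in C, assuming one image per generator (rather than mixing the two streams as in step 2) and the plain PBM format, which any image viewer can display; rand() is used here, and genrand_real1() can be substituted to render MT19937:

    #include <stdio.h>
    #include <stdlib.h>

    #define WIDTH  512
    #define HEIGHT 512

    /* Render one pseudorandom bit per pixel as a bilevel PBM image.
     * Structure visible to the eye suggests a biased generator. */
    int main(void)
    {
        srand(12345);
        printf("P1\n%d %d\n", WIDTH, HEIGHT);     /* plain PBM header */
        for (int y = 0; y < HEIGHT; y++) {
            for (int x = 0; x < WIDTH; x++) {
                double u = rand() / (double)RAND_MAX; /* uniform in [0,1] */
                putchar(u >= 0.5 ? '1' : '0');        /* one pixel per draw */
                if ((x + 1) % 64 == 0)
                    putchar('\n');                    /* keep PBM lines short */
            }
        }
        return 0;
    }

Redirect stdout to a file (./a.out > rand.pbm), generate a second image from the other generator, and compare the two by eye; visible stripes or blocks indicate structure. Keep in mind, though, that the eye misses many statistical defects that the test suites discussed below will catch.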

  • I'm not so sure about getting any **random data** from **pseudorandom number generators** - but I think you could implement http://en.wikipedia.org/wiki/Fair_coin#Fair_results_from_a_biased_coin with them. – Aprillion Mar 20 '12 at 01:36
  • Are you saying that because the values generated by PRNGs are predictable? Thank you – Sayan Mar 20 '12 at 01:40
  • Yes, that is the distinction - it was just a reminder to check whether a PRNG is good enough for your application and you don't need a TRNG like [random.org](http://random.org) – Aprillion Mar 20 '12 at 11:14

3 Answers


There are several statistical test suites available. I wrote, copied, and otherwise gathered together 120 PRNGs and tested each with a variety of test suites, giving 4 hours per PRNG per test suite:

  • PractRand (standard, 1 terabyte) found bias in 78 PRNGs
  • TestU01 (BigCrush) found bias in 50 PRNGs
  • RaBiGeTe (extended, 512 megabit, x1) found bias in 40 PRNGs
  • Dieharder (custom command line options) found bias in 25 PRNGs
  • Dieharder (-a command line option) found bias in 13 PRNGs
  • NIST STS (default, 64 megabit x128) found bias in 11 PRNGs

How many of those biases were found by one suite but missed by all the others?

  • PractRand (standard, 1 terabyte) found 22 unique biases, in a wide variety of categories.
  • RaBiGeTe (extended, 512 megabit, x1) found 5 unique biases, all in LCGs and combination LCGs.
  • TestU01 BigCrush found 2 unique biases, both in small chaotic PRNGs.

No other test suite found any unique biases.

In short, only PractRand, TestU01, and possibly RaBiGeTe are worth using.

Full disclosure: I wrote PractRand, so either the set of PRNGs or any other non-qualitative measure could be biased in its favor.

Miscellaneous advantages:

  • PractRand and TestU01 tend to produce the easiest output to interpret, in my opinion.
  • PractRand and Dieharder tend to be the easiest to automate via their command-line interfaces, I think (see the sketch after this list).
  • PractRand and RaBiGeTe were the only ones to support multithreaded testing.
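As an illustration of the command-line workflow, a minimal sketch that streams a generator's raw 64-bit outputs to stdout so a suite can read them from stdin; the toy xorshift64 generator, the RNG_test stdin64 invocation, and the dieharder -g 200 (raw stdin) option are assumptions based on the tools' usual interfaces - check your local builds:

    #include <stdint.h>
    #include <stdio.h>

    /* Toy xorshift64 generator (Marsaglia's 13/7/17 triple);
     * replace next() with the PRNG under test. */
    static uint64_t s = 0x9e3779b97f4a7c15ULL;

    static uint64_t next(void)
    {
        s ^= s << 13;
        s ^= s >> 7;
        s ^= s << 17;
        return s;
    }

    int main(void)
    {
        for (;;) {              /* stream forever; the suite decides when to stop */
            uint64_t x = next();
            fwrite(&x, sizeof x, 1, stdout);
        }
    }

    /* Assumed usage:
     *   ./mygen | RNG_test stdin64        (PractRand)
     *   ./mygen | dieharder -a -g 200     (Dieharder, raw stdin input)
     */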

Miscellaneous disadvantages:

  • PractRand required more bits of input than the other test suites - this could be a problem if your RNG is very slow or otherwise limited in the amount of data it can produce.
  • RaBiGeTe and NIST STS both have interface issues.
  • Dieharder and NIST STS both have false-positive issues.
  • NIST STS had the worst interface in my opinion.
  • I could not get Dieharder to compile on Windows. I managed to get TestU01 to compile on Windows, but it took some work.
  • Recent versions of RaBiGeTe are closed-source and Windows-only.

The set of PRNGs tested:

  • 1 large GFSR, 1 large LFSR, 4 xorshift-type PRNGs, 2 xorwow-type PRNGs, and 3 other not-quite-LFSR PRNGs.
  • 10 simple power-of-2 modulus LCGs (which discard low bits to reach acceptable quality levels), 10 power-of-2 modulus not-quite-LCGs, and 9 combination generators based primarily around LCGs and not-quite-LCGs.
  • 19 reduced-strength versions of CSPRNGs plus one full-strength CSPRNG: of those, 14 were based upon indirection / dynamic s-boxes (e.g. RC4, ISAAC), 4 were ChaCha/Salsa parameterizations, and the remaining 2 were Trivium variants.
  • 11 PRNGs broadly classifiable as LFib-type or similar, not counting the LFSRs/GFSRs.
  • The rest (about 35) were small-state chaotic PRNGs, of which 10 used multiplication and the others were limited to arithmetic and bitwise logic.

Edit: There is also the test set in gjrand, which is very obscure and a little odd, but actually does extremely well.

Also, all of the PRNGs tested are included as non-recommended PRNGs in PractRand.

  • I'm happy to recommend your answer; however, as written there is no evidence. Can you provide links to papers that back up your claims, or some instructions on how to repeat your experiments? – csgillespie Jul 06 '16 at 11:00
  • No evidence whatsoever! DieHarder has around 106 tests. PractRand and TestU01 are written in C and C++ and require the user to integrate their generator! The easiest to use so far are DieHarder (Ubuntu package) and NIST STS (Python UI and implementation)! I strongly believe you're biased toward your own work, and indeed, as @csgillespie mentioned in his comment, you need to provide a paper that supports your claims. – Yahya Jan 07 '20 at 11:44

There are two standard test suites for testing random numbers.

  1. NIST test suite. They have an implementation in C.
  2. Diehard Test Suite (developed by George Marsaglia). There is a C library implementation of these tests.

There is an R interface to the Dieharder library, called RDieHarder. This library provides an interface to both the NIST and Diehard test suites.
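Both suites consume a file of bits. Here is a minimal sketch of preparing input for the NIST STS assess tool, assuming its ASCII input mode where the data file holds the characters '0' and '1' (the suite also accepts raw binary; the filename and seed here are arbitrary):

    #include <stdio.h>
    #include <stdlib.h>

    /* Dump pseudorandom bits as ASCII '0'/'1' characters, a format the
     * NIST STS assess tool can read. Each bit is taken from a middle
     * bit of rand(), since low-order bits are weak on some implementations. */
    int main(void)
    {
        const long nbits = 1000000L;     /* one million bits per bitstream */
        FILE *f = fopen("data.txt", "w");
        if (!f)
            return 1;
        srand(12345);
        for (long i = 0; i < nbits; i++)
            fputc(((rand() >> 7) & 1) ? '1' : '0', f);
        fclose(f);
        return 0;
    }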

  • I am using NIST STS, but I think that although my test passes, there's some problem; can you please help? I have generated long random values, converted them to binary, and stored them in a file. Say there are 100 random numbers in the file; I run assess with 100 and follow the steps in the "Running the Test Code" section of the doc (choosing 10 bitstreams, as in the doc). But I see that my "finalAnalysisReport.txt" for the default test case contains almost no info. – Sayan Mar 21 '12 at 15:39
  • Your best bet is to ask another question. – csgillespie Mar 21 '12 at 21:27
  • This answer was good, but is outdated now. See the other answer for an update (TL;DR: L'Ecuyer's TestU01, or PractRand). – Fab Jul 05 '16 at 12:08

You're best off looking into Volume 2 of Knuth's The Art of Computer Programming (Seminumerical Algorithms).

For a shorter read, look up the corresponding chapter of Numerical Recipes.

If you're only interested in a baseline for an MC simulation: linear congruential generators are best avoided, and the Mersenne Twister is good enough in the vast majority of cases (see the sketch below).
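To see one reason LCGs are avoided, a minimal sketch, assuming the classic constants quoted in Numerical Recipes: in any full-period mixed LCG modulo a power of two, bit k of the state has period only 2^(k+1), so the lowest bit simply alternates:

    #include <stdint.h>
    #include <stdio.h>

    /* 32-bit LCG with the Numerical Recipes constants. The state's
     * lowest bit alternates 0,1,0,1,... - a blatant regularity. */
    int main(void)
    {
        uint32_t x = 1;
        for (int i = 0; i < 16; i++) {
            x = 1664525u * x + 1013904223u;  /* x = (a*x + c) mod 2^32 */
            printf("%u", x & 1u);            /* prints 0101010101010101 */
        }
        printf("\n");
        return 0;
    }

The Mersenne Twister's tempered output does not exhibit such trivial patterns, which is why it is a safer default for Monte Carlo work.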

  • Can you give a link to evidence for the claim that LGCs are best avoided for Monte-Carlo simulations? – ziggystar Aug 19 '13 at 08:58
  • @ziggystar: well, you can look it up in Knuth. Or Numerical Recipes. Or google the results of the standard test suites, e.g. from the csgillespie answer. – ev-br Sep 20 '13 at 23:38