libc random number generator flawed?

Question

Consider an algorithm to test the probability that a certain number is picked from a set of N unique numbers after a specific number of tries (for example, with N=2, what's the probability in Roulette (without 0) that it takes X tries for Black to win?).

The correct distribution for this is pow(1-1/N,X-1)*(1/N).

However, when I test this using the following code, there is always a deep ditch at X=31, independently from N, and independently from the seed.

Is this an intrinsic flaw that cannot be prevented due to the implementation specifics of the PRNG in use, is this a real bug, or am I overlooking something obvious?

// C

#include <sys/times.h>
#include <math.h>
#include <stdio.h>

int array[101];
void main(){

    int nsamples=10000000;
    double breakVal,diffVal;
    int i,cnt;

    // seed, but doesn't change anything
    struct tms time;
    srandom(times(&time));

    // sample
    for(i=0;i<nsamples;i++){
        cnt=1;
        do{
            if((random()%36)==0) // break if 0 is chosen
                break;
            cnt++;
        }while(cnt<100);
        array[cnt]++;
    }

    // show distribution
    for(i=1;i<100;i++){
        breakVal=array[i]/(double)nsamples; // normalize
        diffVal=breakVal-pow(1-1/36.,i-1)*1/36.; // difference to expected value
        printf("%d %.12g %.12g\n",i,breakVal,diffVal);
    }
}

Tested on an up-to-date Xubuntu 12.10 with libc6 package 2.15-0ubuntu20 and Intel Core i5-2500 SandyBridge, but I discovered this already a few years ago on an older Ubuntu machine.

I also tested this on Windows 7 using Unity3D/Mono (not sure which Mono version, though), and here the ditch happens at X=55 when using System.Random, while Unity's builtin Unity.Random has no visible ditch (at least not for X<100).

The distribution: enter image description here

The differences: enter image description here

I don't think anyone claims that the random function in glibc is particularly "high quality". If you want something better, then use Mersenne Twister or some other "professional grade" RNG. The one supplied by C libraries [and other similar libraries] tend to be written for simplicity, not "perfection". — Mats Petersson, Feb 04 '13 at 00:36
1) main should return int 2) modulo 36 is suspect, I suggest you first try modulo 32, or another power of two. — wildplasser, Feb 04 '13 at 00:47
I can confirm this behavior (Debian Sid) for both modulo 36 and 32. — liori, Feb 04 '13 at 00:51
what does it do when breaking on other values than zero, eg `if (random() % 36 == 13) break;` — wildplasser, Feb 04 '13 at 00:53
@wildplasser: Apparently not. Looks like the generator generates some nicely divisible number every 31 steps. — liori, Feb 04 '13 at 00:55
@wildplasser: it's a hackish C example, I apologize ;-) And I actually started with 32, but I wanted to keep with the Routlette example. When breaking on other numbers, there's a *peak* at 31. — Wolfram, Feb 04 '13 at 01:02
@mats-petersson: OK, I tried one of the MT's and as expected there is no ditch. But isn't "every" UN*X tools using libc? What do things like pgp/gpg use? — Wolfram, Feb 04 '13 at 01:06
I'm pretty certain pgp/gpg [or any other encryption mechanism that isn't a made out of "cheese"] doesn't use libc's algorithm, although I have to admit I don't know what those particular tools use. — Mats Petersson, Feb 04 '13 at 01:09
Unless you're [developing for Plan 9](http://doc.cat-v.org/plan_9/programming/c_programming_in_plan_9) it would be wise to break the habit of writing `void main`. — dreamlax, Feb 04 '13 at 01:13
I tested this for %32 and all possible break values. For values 0 and 31 there is a ditch, and for all other values there is a peak at 31 - and a smaller peak at 59. — Wolfram, Feb 04 '13 at 01:16
`rand % N` is flawed. My suggestion is to first use a proper rejection-based method, and then re-evaluate. — Oliver Charlesworth, Feb 04 '13 at 01:38
here you go buddy, found this dude's website via google: http://www.bedaux.net/mtrand/ or http://randomlib.sourceforge.net/ — thang, Feb 04 '13 at 09:59

interjay · Answer 1 · 2013-02-04T10:01:11.983

11

This is due to glibc's random() function not being random enough. According to this page, for the random numbers returned by random(), we have:

o_i = (o_i-3 + o_i-31) % 2^31

or:

o_i = (o_i-3 + o_i-31 + 1) % 2^31.

Now take x_i = o_i % 36, and suppose the first equation above is the one used (this happens with a 50% chance for each number). Now if x_i-31=0 and x_i-3!=0, then the chance that x_i=0 is less than 1/36. This is because 50% of the time o_i-31 + o_i-3 will be less than 2^31, and when that happens,

x_i = o_i % 36 = (o_i-3 + o_i-31) % 36 = o_i-3 % 36 = x_i-3,

which is nonzero. This causes the ditch you see 31 samples after a 0 sample.

edited Feb 04 '13 at 10:01

answered Feb 04 '13 at 01:10

interjay

107,303
21
270
254

1

But it's a ditch at 31, not a spike. Also, if I make them relatively prime by using e.g. %49, the ditch is still there. – Wolfram Feb 04 '13 at 01:33
@Wolfram: Yes, I wasn't thinking correctly towards the end of my post, fixed now. – interjay Feb 04 '13 at 09:49

rici · Answer 2 · 2013-02-05T06:38:43.637

What's being measured in this experiment is the interval between successful trials of a Bernoulli experiment, where success is defined as random() mod k == 0 for some k (36 in the OP). Unfortunately, it is marred by the fact that the implementation of random() means that the Bernoulli trials are not statistically independent.

We'll write rnd_i for the i^th output of `random()' and we note that:

rnd_i = rnd_i-31 + rnd_i-3 with probability 0.75

rnd_i = rnd_i-31 + rnd_i-3 + 1 with probability 0.25

(See below for a proof outline.)

Let's suppose rnd_i-31 mod k == 0 and we're currently looking at rnd_i. Then it must be the case that rnd_i-3 mod k ≠ 0, because otherwise we would have counted the cycle as being length k-3.

But (most of the time) (mod k): rnd_i = rnd_i-31 + rnd_i-3 = rnd_i-3 ≠ 0.

So the current trial is not statistically independent of the previous trials, and the 31^st trial after a success is much less likely to succeed than it would in an unbiased series of Bernoulli trials.

The usual advice in using linear-congruential generators, which doesn't actually apply to the random() algorithm, is to use the high-order bits instead of the low-order bits, because high-order bits are "more random" (that is, less correlated with successive values). But that won't work in this case either, because the above identities hold equally well for the function high log k bits as for the function mod k == low log k bits.

In fact, we might expect a linear-congruential generator to work better, particularly if we use the high-order bits of the output, because although the LCG is not particularly good at Monte Carlo simulations, it does not suffer from the linear feedback of random().

random algorithm, for the default case:

Let state be a vector of unsigned longs. Initialize state₀...state₃₀ using a seed, some fixed values, and a mixing algorithm. For simplicity, we can consider the state vector to be infinite, although only the last 31 values are used so it's actually implemented as a ring buffer.

To generate rnd_i: (Note: ⊕ is addition mod 2³².)

state_i = state_i-31 ⊕ state_i-3

rnd_i = (state_i - (state_i mod 2)) / 2

Now, note that:

(i + j) mod 2 = i mod 2 + j mod 2 if i mod 2 == 0 or j mod 2 == 0

(i + j) mod 2 = i mod 2 + j mod 2 - 2 if i mod 2 == 1 and j mod 2 == 1

If i and j are uniformly distributed, the first case will occur 75% of the time, and the second case 25%.

So, by substitution in the generation formula:

rnd_i = (state_i-31 ⊕ state_i-3 - ((state_i-31 + state_i-3) mod 2)) / 2

= ((state_i-31 - (state_i-31 mod 2)) ⊕ (state_i-3 - (state_i-3 mod 2))) / 2 or

= ((state_i-31 - (state_i-31 mod 2)) ⊕ (state_i-3 - (state_i-3 mod 2)) + 2) / 2

The two cases can be further reduced to:

rnd_i = rnd_i-31 ⊕ rnd_i-3

rnd_i = rnd_i-31 ⊕ rnd_i-3 + 1

As above, the first case occurs 75% of the time, assuming that rnd_i-31 and rnd_i-3 are independently drawn from a uniform distribution (which they're not, but it's a reasonable first approximation).

score 1 · Answer 3 · answered Feb 04 '13 at 11:52

As others pointed out, random() is not random enough.

Using the higher bits instead of the lower ones does not help in this case. According to the manual (man 3 rand), old implementations of rand() had a problem in the lower bits. That's why random() is recommended instead. Though, the current implementation of rand() uses the same generator as random().

I tried the recommended correct use of the old rand():

if ((int)(rand()/(RAND_MAX+1.0)*36)==0)

...and got the same deep ditch at X=31

Interstingly, if I mix rand()'s numbers with another sequence, I get rid of the ditch:

unsigned x=0;
//...

        x = (179*x + 79) % 997;
        if(((rand()+x)%36)==0)

I am using an old Linear Congruential Generator. I chose 79, 179 and 997 at random from a primes table. This should generate a repeating sequence of length 997.

That said, this trick probably introduced some non-randomness, some footprint... The resulting mixed sequence will surely fail other statistical tests. x never takes the same value in consecutive iterations. Indeed, it takes exactly 997 iterations to repeat every value.

''[..] random numbers should not be generated with a method chosen at random. Some theory should be used." (D.E.Knuth, "The Art of Computer Programming", vol.2)

For simulations, if you want to be sure, use the Mersenne Twister

libc random number generator flawed?

3 Answers3

Linked