
I'm trying to implement a Gaussian-distributed random number generator in the interval [0,1].

float rand_gauss (void) {
  float v1,v2,s;

  do {
    v1 = 2.0 * ((float) rand()/RAND_MAX) - 1;
    v2 = 2.0 * ((float) rand()/RAND_MAX) - 1;

    s = v1*v1 + v2*v2;
  } while ( s >= 1.0 );

  if (s == 0.0)
    return 0.0;
  else
    return (v1*sqrt(-2.0 * log(s) / s));
}

It's pretty much a straightforward implementation of the algorithm in Knuth's TAOCP, volume 2 (3rd edition), page 122.

The problem is that rand_gauss() sometimes returns values outside the interval [0,1].

2 Answers


Knuth describes the polar method on p 122 of the 2nd volume of TAOCP. That algorithm generates a normal distribution with mean = 0 and standard deviation = 1. But you can adjust that by multiplying by the desired standard deviation and adding the desired mean.

You might find it fun to compare your code to another implementation of the polar method in the C-FAQ.

Mike Sherrill 'Cat Recall'

Change your if statement to (s >= 1.0 || s == 0.0). Better yet, use a break, as in the following example of a SIMD Gaussian random number generator returning a complex pair (u,v). It uses the dSFMT Mersenne Twister random number generator (state variable dsfmt). If you only want a single real random number, return only u and save v for the next pass.

inline static void randn(double *u, double *v)
{
    double s, x, y;    // SIMD Marsaglia polar version for complex u and v

    while (1) {
        x = 2.0 * dsfmt_genrand_close_open(&dsfmt) - 1.0;
        y = 2.0 * dsfmt_genrand_close_open(&dsfmt) - 1.0;
        s = x * x + y * y;
        if (s < 1.0 && s != 0.0) break;   // reject s >= 1 and guard log(0)
    }
    s = sqrt(-2.0 * log(s) / s);
    *u = x * s;
    *v = y * s;
}

This algorithm is surprisingly fast. Execution times for computing two random numbers (u,v) for four different Gaussian random number generators are:

Times for delivering two Gaussian numbers (u + iv)
i7-2600K @ 4GHz, gcc -Wall -Ofast -msse2 ..

gsl_ziggurat                        =       20.3 (ns) 
Box-Muller                          =       78.8 (ns) 
Box-Muller with fast_sin fast_cos   =       28.1 (ns) 
SIMD Marsaglia polar                =       35.0 (ns)

The fast_sin and fast_cos polynomial routines of Charles K. Garrett [3] speed up the Box-Muller computation by a factor of 2.9, using nested-polynomial implementations of cos() and sin(). The SIMD Box-Muller and polar algorithms are certainly competitive, and they can be parallelized easily. With gcc -Ofast -S, the assembly dump shows that the square root compiles to the SSE2 instruction: sqrtsd %xmm0, %xmm0

Comment: it is really hard and frustrating to get accurate timings with gcc5, but I think these are ok: as of 2/3/2016: DLW

[1] Related link: c malloc array pointer return in cython

[2] A comparison of algorithms, but not necessarily for SIMD versions: http://www.doc.ic.ac.uk/~wl/papers/07/csur07dt.pdf

[3] Charles K. Garrett: http://krisgarrett.net/papers/l2approx.pdf

W9DKI
  • here is the hard part to get right: gcc -Wall -Ofast -msse2 -frename-registers -malign-double -fno-strict-aliasing -DHAVE_SSE2=1 -DDSFMT_MEXP=19937 -o randn_test dSFMT.c randn_test.c -lm – W9DKI Jan 31 '16 at 11:00
  • Jonathan, thanks for changing the s=0 typo to s==0. Instead of using while(1), a for(;;) would be a bit tighter, but gcc -Ofast compiles both to the same thing. – W9DKI Feb 07 '16 at 11:50
  • there is an obvious factor of 2 left out, i.e. x = 2.*dsfmt_genrand_close_open(&dsfmt) - 1.0; y = 2.*dsfmt_genrand_close_open(&dsfmt) - 1.0 – W9DKI Mar 22 '16 at 03:10