Random generators with multiple (uncorrelated?) distributions in c++

Question

Having read the following questions:

using one random engine for multi distributions in c++11

Uncorrelated parallel random seeds with C++ 2011?

std::default_random_engine generate values between 0.0 and 1.0

how to generate uncorrelated random sequences using c++

Using same random number generator across multiple functions

and having experimented with a few tricks, I now have doubts about my conceptual understanding of random generators for multiple (different) distributions in C++. In particular:

Is it OK to use one generator for drawing numbers in different distributions (uniform, binomial, ...) as long as you don't multithread?

For instance, assume I'm using the following:

class Zsim {
    private:
     std::default_random_engine engine;
};

and initializing it in the constructor:

Zsim::Zsim(...)
{
    std::random_device rd;
    std::default_random_engine generator(rd());
    engine = generator;
}

and using it to draw n values (n possibly large) from different distributions (binomial and uniform), say:

std::binomial_distribution<int> B_distribution(9, 0.5);
int number = B_distribution(engine);

std::uniform_real_distribution<double> R_distribution(0, 15);
position.x = R_distribution(engine);
position.y = R_distribution(engine);

Is this considered OK?

Some pointed out that using std::random_device is nice, while others suggested that it can throw for a number of reasons and should be avoided or wrapped in a try/catch (see: Using same random number generator across multiple functions).
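For reference, the try/catch approach might look roughly like the sketch below. The helper name make_seed and the clock-based fallback are assumptions for illustration, not something taken from the linked question, and a clock seed is a weak substitute for real entropy.

```cpp
#include <chrono>
#include <exception>
#include <random>

// Hypothetical helper: try std::random_device first and fall back to a
// clock-based seed if it throws. The fallback is an assumption made for
// this sketch; it is predictable and only meant to keep the program running.
unsigned make_seed() {
    try {
        std::random_device rd;  // the constructor or operator() may throw
        return rd();
    } catch (const std::exception&) {
        auto ticks = std::chrono::steady_clock::now().time_since_epoch().count();
        return static_cast<unsigned>(ticks);
    }
}
```

The engine would then be seeded with something like std::default_random_engine engine(make_seed()); in the constructor's initializer list.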

In using one random engine for multi distributions in c++11, it was suggested that, when simulating a random walk or Brownian motion in n dimensions (n=2 in the example given by MosteM), you need one generator per dimension; otherwise the dimensions become correlated, producing an artificial drift. Even if I accept this assertion, how valid is it given the (huge) period of the generator? Does it still hold if the simulation is large (high number of steps)? Should we always use one generator per dimension as a safety measure? This appears to contradict the lead reply in how to generate uncorrelated random sequences using c++.

Finally, given the Zsim example, when you add a const qualifier to a method and draw from the binomial distribution:

int Zsim::get_randomB() const
{
    std::binomial_distribution<int> B_distribution(9, 0.5);
    int number = B_distribution(engine);
    return number;
}

the compiler throws an error: expression having type 'const std::tr1::default_random_engine' would lose some const-volatile qualifiers in order to call 'unsigned long std::tr1::mersenne_twister<_Ty,_Wx,_Nx,_Mx,_Rx,_Px,_Ux,_Sx,_Bx,_Tx,_Cx,_Lx>::operator()(void)

This suggests that the generator 'engine' is altered in some way when the distribution is called. What is causing this?

You pass the engine to the distribution (that's where it gets its numbers from) so of course the engine gets used and changes its internal state. — Galik, Jan 04 '18 at 10:55
Thank you! It helps. Does it mean that drawing from different distributions (rand, normal, binomial, ...) could possibly introduce a bias for subsequent calls (i.e. drawing from a rand, then from a binomial, and then from a rand again may cause the last draw to be less accurate)? — Grasshoper, Jan 04 '18 at 11:17
Cross reference regarding your “as long as you don't multithread”: I asked about [random numbers for multiple threads](https://stackoverflow.com/q/14923902/1468366) in the past. — MvG, Jan 04 '18 at 11:29
@Grasshoper I don't see how tbh. I would think that would be a weakness in the random number generator but I'm not a random number expert. — Galik, Jan 04 '18 at 11:29
@MvG Thank you! Didn't see that one, it's a nice source of information. If I understood correctly, it means that I don't really need to care about independent drawings as long as I do not draw them sequentially. — Grasshoper, Jan 04 '18 at 11:51
@Galik I'm no expert either, that's why I'm asking. I assume that if the engine state is changed, it may have implications elsewhere. — Grasshoper, Jan 04 '18 at 11:54
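As a sketch of what the comments above imply for the const method: since drawing a number advances the engine's internal state, one common way to keep get_randomB() const is to declare the engine member mutable. The fixed-seed constructor below is an assumption added only to keep the example self-contained; it is not the constructor from the question.

```cpp
#include <random>

// Sketch of the Zsim class from the question. The seed-taking constructor
// and the `mutable` qualifier are assumptions added for illustration.
class Zsim {
public:
    explicit Zsim(unsigned seed) : engine(seed) {}

    // Logically const for the caller, but every draw advances the engine
    // state, so the member must be mutable for this to compile.
    int get_randomB() const {
        std::binomial_distribution<int> B_distribution(9, 0.5);
        return B_distribution(engine);
    }

private:
    mutable std::default_random_engine engine;
};
```

Note that a mutable engine modified from const methods is not safe to share between threads without extra synchronization. The alternative is simply to drop the const qualifier from the method.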

score 2 · Accepted Answer · answered Jan 04 '18 at 11:44

If you read about UniformRandomBitGenerator you will find that the random generator produces random bits which ideally are pretty much independent of one another, to the extent the PRNG in question can achieve this. So essentially every call to engine() will generate one almost uncorrelated integer. It's the task of the distribution to make the appropriate number of calls to the engine. A single-bit distribution might make a single call to a 32-bit engine for every 32 calls to the distribution itself, caching unused entropy between calls. Conversely, a double-precision number generator might use entropy from two 32-bit engine results to determine all 53 mantissa bits of a double. The engine doesn't care which distribution consumes its random bits, so using the same engine in different distributions isn't a problem.
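A minimal sketch of this, assuming std::mt19937 as the engine: two distributions take turns pulling from one engine, each consuming whatever engine output it needs. The Draws struct and the draw_mixed helper are hypothetical names used only for this illustration.

```cpp
#include <random>
#include <vector>

// Hypothetical container for the results of the mixed draws.
struct Draws {
    std::vector<int> counts;        // binomial draws, each in [0, 9]
    std::vector<double> positions;  // uniform draws between 0 and 15
};

// Draws alternately from a binomial and a uniform distribution, both fed
// by the same engine. The engine is taken by non-const reference because
// every draw advances its internal state.
Draws draw_mixed(std::mt19937& engine, int n) {
    std::binomial_distribution<int> binom(9, 0.5);
    std::uniform_real_distribution<double> unif(0.0, 15.0);
    Draws out;
    for (int i = 0; i < n; ++i) {
        out.counts.push_back(binom(engine));
        out.positions.push_back(unif(engine));
    }
    return out;
}
```

Two engines constructed with the same seed and passed through draw_mixed yield identical sequences on a given standard library implementation, which is handy for reproducing a simulation run. This shows the mechanics only; it does not by itself demonstrate statistical independence between the two streams.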

If you read https://en.wikipedia.org/wiki/Mersenne_Twister you will find that it is

k-distributed to 32-bit accuracy for every 1 ≤ k ≤ 623 (for a definition of k-distributed, see below)

So if you use std::mt19937 I'd say you should be safe to use the same engine in up to 623 different distributions, no matter whether they are of the same or different types. For more distributions it depends on how they are to be used, but in most cases I wouldn't worry too much either.

Thank you for this neat & detailed reply! It would be good to have some kind of guidelines for these (repeated) questions somewhere (maybe your topic https://stackoverflow.com/questions/14923902/random-numbers-for-multiple-threads would serve as a starting point?) — Grasshoper, Jan 04 '18 at 12:03
