It's gotta be thread_local, or else use a mutex!
But is thread_local really sensible here?
Unless you are willing to guard your RNG with a mutex, you must have a separate RNG for each thread. So, yes, thread_local is sensible.
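A minimal sketch of the thread_local approach follows. The name thread_rng is hypothetical, and the single-value seed is only for brevity (the Seeding section explains why that is weak for std::mt19937).

```cpp
#include <random>

// Hypothetical helper: returns the calling thread's private engine.
// The engine is constructed and seeded the first time each thread calls this.
// Assumption: one 32-bit value from std::random_device is used as the seed,
// purely to keep the sketch short.
inline std::mt19937& thread_rng() {
    thread_local std::mt19937 engine{std::random_device{}()};
    return engine;
}
```

Each thread that calls thread_rng() gets its own engine, so no locking is needed around the calls.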
When you have a large number of threads, however, this can be a problem, because std::mt19937 has a relatively large footprint (624 32-bit words, roughly 2.5 KB, per engine). In that case, I suggest switching to something like PCG by Melissa O'Neill.
If your threads are not long-lived, std::mt19937 may present a performance problem due to the long time required to seed its large internal state. Once again, an easy solution would be to switch to something like pcg64, by Melissa O'Neill. It can be seeded quickly, with just four calls to std::random_device.
The seed_randomly header, described below, can do the seeding for you.
Manage a pool of random engines and distributions
I said above that there are only two choices: use thread_local, or else use a mutex. There is, however, a third alternative: the one presented here in the answer by @Marco Bonelli. That is to create (and manage) a pool of random number engines from which an RNG can be issued to a thread when needed.
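To make the pool idea concrete, here is a rough sketch of my own. It is not the code from the other answer; the class name engine_pool and its members are hypothetical, and the engines are seeded with a single value only to keep it short.

```cpp
#include <cstddef>
#include <mutex>
#include <random>
#include <utility>
#include <vector>

// Hypothetical pool: engines are handed out by value and returned when done.
// Only the hand-off is locked; a thread uses its engine without any locking.
class engine_pool {
public:
    explicit engine_pool(std::size_t count) {
        std::random_device rd;
        idle_.reserve(count);
        for (std::size_t i = 0; i < count; ++i)
            idle_.emplace_back(rd());  // assumption: single-value seeding
    }

    // Take an engine out of the pool; make a fresh one if none is idle.
    std::mt19937 acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (idle_.empty())
            return std::mt19937{std::random_device{}()};
        std::mt19937 engine = std::move(idle_.back());
        idle_.pop_back();
        return engine;
    }

    // Hand the engine back, state included, so its sequence continues later.
    void release(std::mt19937 engine) {
        std::lock_guard<std::mutex> lock(mutex_);
        idle_.push_back(std::move(engine));
    }

    std::size_t idle_count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return idle_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::mt19937> idle_;
};
```

A short-lived thread would call acquire() when it starts and release() before it exits, which avoids paying the seeding cost once per thread.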
If you are using one of the random number distributions from the Standard Library, such as uniform_int_distribution, you also need to make sure you are not sharing an instantiation of it across threads. Many (almost all?) of the distributions have internal state, so calling a shared instance from several threads at once is a data race. Give each thread its own distribution object, or guard the shared one with the same mutex that guards the engine.
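One way to keep a distribution private to each thread is to make it thread_local as well. The function name roll_die below is hypothetical; it is only a sketch of the idea.

```cpp
#include <random>

// Hypothetical helper: rolls a six-sided die using the caller's engine.
// The distribution object is thread_local, so no two threads ever share it.
template <class Engine>
int roll_die(Engine& engine) {
    thread_local std::uniform_int_distribution<int> dist{1, 6};
    return dist(engine);
}
```

Constructing the distribution as a plain local variable on every call would also work, since uniform_int_distribution is cheap to create.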
Seeding
And how should g be seeded here if the generated samples are supposed to be independent?
std::mt19937 and std::mt19937_64 both have 19,968 bits of state, so trying to seed them with a single seed of only 32 or 64 bits is problematic. A 32-bit seed, for example, can select at most 2^32 different starting states.
So long as std::random_device is a good source of entropy on your system, a good way to thoroughly seed std::mt19937 is to use std::random_device to fill all 624 state variables of std::mt19937.
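Here is a minimal sketch of that idea. It is not the seed_randomly header; the names random_device_seeder and make_seeded_mt19937 are hypothetical. It also assumes that your Standard Library only needs result_type and generate from the object passed to seed, which is less than the full seed sequence requirements in the standard.

```cpp
#include <cstdint>
#include <random>

// Hypothetical adapter (not part of the Standard Library): it offers just
// enough of the seed sequence interface for Engine::seed(Sseq&) to use.
class random_device_seeder {
public:
    using result_type = std::uint32_t;

    // Fill [first, last) with 32-bit words drawn from std::random_device.
    template <class Iterator>
    void generate(Iterator first, Iterator last) {
        std::random_device source;
        for (; first != last; ++first)
            *first = static_cast<result_type>(source());
    }
};

// Hypothetical helper: returns an engine whose whole state was filled from
// std::random_device rather than expanded from a single 32-bit seed.
inline std::mt19937 make_seeded_mt19937() {
    random_device_seeder seeder;
    std::mt19937 engine;
    engine.seed(seeder);  // the engine asks generate() for 624 words
    return engine;
}
```

I avoided std::seed_seq here on purpose: it mixes its input before handing it to the engine, so it does not pass the random words through one-to-one.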
I posted a small, single-file header in a repository named seed_randomly on GitHub that uses std::random_device to seed all 19,968 bits of a Mersenne Twister engine. It was created with the help of some of the smart guys here on StackOverflow. See this StackOverflow answer.
Before using this header, you should satisfy yourself that std::random_device is a good source of entropy on your system. Sometimes, it is not.
Microsoft Visual C++, for instance, generates "non-deterministic and cryptographically secure" values, and never blocks, which is excellent. Prior to version 9.2, however, MinGW distributions of GCC used std::mt19937 with a fixed seed! Those systems generated the same sequence every time. (Newer versions purport to have fixed the problem, but I have not checked.) Unix-like systems often use /dev/random (which can block) or /dev/urandom. Both have their advantages.
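A crude smoke test for the fixed-sequence failure is to compare the output of two separate std::random_device objects. The function name below is hypothetical, and passing this check proves nothing about entropy quality; it only flags a device that repeats the same values for every instance.

```cpp
#include <array>
#include <random>

// Hypothetical smoke test: returns false when two independent
// std::random_device objects emit the same first four values, which is what
// a deterministic, fixed-seed implementation would do.
inline bool random_device_looks_nondeterministic() {
    std::random_device first;
    std::random_device second;
    std::array<unsigned int, 4> a{};
    std::array<unsigned int, 4> b{};
    for (auto& value : a) value = first();
    for (auto& value : b) value = second();
    return a != b;
}
```

If this returns false on your platform, do not rely on std::random_device for seeding there.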
Overlapping sequences can produce correlated results
The random seeding described above hits all 624 state variables of std::mt19937. The odds are infinitesimal, therefore, that two such generators would produce sequences that overlap. However, the odds are not zero! Given the large state space of std::mt19937, it probably is not a real concern.
For rigorous scientific and academic work, that may not be good enough. In those arenas, you may be required to prove that the RNG sequences do not overlap. In that case, you should investigate the practicality of starting every thread from the same engine state and calling member function discard to jump ahead by a different multiple of some predetermined large amount on each thread.
That is a cheap operation with PCG, whose jump-ahead takes O(log n) time for a jump of n steps rather than O(n). The same fast jump-ahead is possible for a linear congruential generator, but there is no guarantee that a specific Standard Library implementation has done so. I do not know the status of std::mt19937. My recollection is just vague enough that I cannot state it here.
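The jump-ahead scheme might look roughly like this. The function name make_spaced_engines and the stride parameter are hypothetical. The sequences cannot overlap so long as no thread draws more than stride values, but whether a huge stride is practical depends on how fast discard is in your library.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: builds one engine per thread. Every engine starts
// from the same seed; engine i is then advanced by i * stride draws, so the
// engines occupy disjoint, evenly spaced windows of a single sequence.
inline std::vector<std::mt19937> make_spaced_engines(
    std::size_t thread_count,
    std::mt19937::result_type seed,
    unsigned long long stride) {
    std::vector<std::mt19937> engines;
    engines.reserve(thread_count);
    for (std::size_t i = 0; i < thread_count; ++i) {
        std::mt19937 engine{seed};
        engine.discard(i * stride);  // cost is implementation-dependent
        engines.push_back(engine);
    }
    return engines;
}
```

Thread i would then take engines[i] and use it exclusively.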
By the way, std::mt19937 has mathematically proven statistical properties that make it suitable for certain rigorous applications where PCG cannot be used. That is because the randomness of PCG has not been proven a priori, using mathematics. Instead, long sequences produced by PCG are tested empirically, using programs such as TestU01. The situation is a little bit ironic, because pcg64 outperforms std::mt19937 in some of the empirical tests.