2

For a course I am trying to implement a parallel Monte Carlo simulation. One of the requirements of the project is that the exact results should be repeatable. In my current design I have a job 'queue' implemented implemented by an instance of std::default_random_engine and a set of workers. A worker takes a 'job' and uses it as a seed for its own rng (also an instance of std::default_random_engine) and runs the simulation using that instance.

While the results are indeed repeatable, it does leave me with some questions:

  • Does using a rng to seed a pool of rngs create some kind of bias? If it depends on which rng is used (and it probably does), which one should i use?
  • Is this design all right or are there alternatives?

--edit-- After having looked into it a bit further I notice the following: std::default_random_engine uses a simple linear congruential generator. Its state is simply its previous value. This means the results of my pool of rngs are exactly the same, but shifted by one. The mersenne twister (which has a larger internal state) doesn't show this behaviour:

#include <iostream>
#include <random>

int main(int argc, char** argv)
{
  std::default_random_engine e1(1);
  std::default_random_engine e2(e1());
  std::cout << e1() << ",\n" << e2() << '\n';

  std::mt19937_64 e3(1);
  std::mt19937_64 e4(e1());
  std::cout << e3() << ",\n" << e4() << '\n';
}

which outputs:

282475249,
282475249
2469588189546311528,
5601807455758261240
Lanting
  • 3,060
  • 12
  • 28

4 Answers4

2

Does using a rng to seed a pool of rngs create some kind of bias? If it depends on which rng is used (and it probably does), which one should i use?

With a good quality rng (any of the newly introduced C++11 ones, but not the old rand()), the basic idea's sound, but you wouldn't want to use the same RNG for generating the other RNGs' seeds. The reason is as follows: say your "master" RNG generates the sequence A B C D, and you seed the other RNGs with A B and C D: the first will then continue on to B C D..., the second to C D etc: basically each one is producing the same sequence starting one further in. If you a different RNG for the master, this will only happen if some of your A B C values are coincidentally close together in the other RNGs sequence, such that one RNGs eventually starts repeating another's sequence. The period of some of the C++11 RNGs is so massive though that they're pretty unlikely to have clashes in all but the most demanding simulations, and such repetition isn't necessarily that bad - depends on the simulation.

Further, you might want to make sure the same seed value isn't used twice though. (Of course, any given sequence you're repeating might not be statistically typical, or - potentially equally problematic - might not capture the atypical extremes, but that's unavoidable if you want repeatability.)

Is this design all right...

Sounds fine as you've described it.

Tony Delroy
  • 102,968
  • 15
  • 177
  • 252
  • I've edited the question. The default rng seems to have some rather undesirable properties. – Lanting Jan 16 '15 at 12:08
  • @Lanting: ah yes - sorry - have seen that kind of thing before, but didn't come to mind. I hope my explanation above helps. It's not really about default vs. Mersenne etc., but about using the same RNG. – Tony Delroy Jan 16 '15 at 12:25
2

It depends. A PRNG actually generates a very long sequence of deterministic (non-random) values, based on its internal state. If the PRNG used for seeding is related to the PRNGs it is seeding, depending on how the seed initializes internal state, each of the seeded generators could easily generate in step, with the first value of the last being the second valud of the next to the last, and so on.

In practice, you're probably better with just incrementing the seed for each generator, rather than using a PRNG to get the next seed. This should work unless the PRNGs you're using just increment as well (which would be a very poor PRNG).

As PJS points out in a comment, if you do this, throw out the first random number generated. Some generators will use a very deterministic formula to convert the seed to the internal state, and return a value based on the old state when they generate a random number.

James Kanze
  • 150,581
  • 18
  • 184
  • 329
  • Be careful with iterating through sequential seeds for an LCG, throw away the first generated value from each sequence. They're strongly correlated - see an example [here](http://stackoverflow.com/questions/27760450/first-random-number-after-setseed-in-java-always-similar/27760541?noredirect=1#comment43932113_27760541). – pjs Jan 16 '15 at 15:03
  • @pjs Good point. Now that you mention it, I've seen generators whose first return value after seeding was the seed. I've added a notice to this effect. Thanks. – James Kanze Jan 17 '15 at 11:11
0

Just use a PRNG with an internal state vector so large that the risk of hitting overlapping sequences is near enough to zero. MT, at 2496 bytes of internal state, certainly qualifies. Seed each with a full 2496-byte seed taken from a good hardware source like /dev/urandom or random.org, and the odds of correlation will be zero for practical purposes. I don't know if your library allows you to use a full-sized seed, so if not, find one that does.

Lee Daniel Crocker
  • 12,927
  • 3
  • 29
  • 55
0

Frankly, even with large internal state noone could guarantee that sequences won't overlap. On top of that, there is question of being debuggable, which requires problem being reproducible. So answer is, each event starts with it's own and stable (over different hardware/node/runs) seed. It makes result reproducible AND debuggable.

Well-known Monte Carlo code MCNP5+ used this scheme for good, runs on multi-cores and MPI. To implement it you'll need RNG with fast skip-ahead (a.k.a. leapfrog or discard) feature. And there are quite a few of them. They are based upon fast exponent computation, paper by F. Brown, "Random Number Generation with Arbitrary Stride", Trans. Am. Nucl. Soc. (Nov. 1994). Basically, skip-ahead is log(N) with Brown approach.

Simplest version which is about the same as MCNP5 one is here https://github.com/Iwan-Zotow/LCG-PLE63

More complicated (and slower, but higher quality) RNG is here http://www.pcg-random.org/

Severin Pappadeux
  • 18,636
  • 3
  • 38
  • 64