2

I'm using std::mt19937 to produce deterministic random numbers. I'd like to pass it to functions so I can control their source of randomness. I could do int foo(std::mt19937& rng);, but I want to call foo and bar in parallel, so that won't work. Even if I put the generation function behind a mutex (so each call to operator() did std::lock_guard lock(mutex); return rng();), calling foo and bar in parallel wouldn't be deterministic due to the race on the mutex.

I feel like conceptually I should be able to do this:

auto fooRNG = std::mt19937(rng()); // Seed a RNG with the output of `rng`.
auto barRNG = std::mt19937(rng());
parallel_invoke([&] { fooResult = foo(fooRNG); },
                [&] { barResult = bar(barRNG); });

where I "fork" rng into two new ones with different seeds. Since fooRNG and barRNG are seeded deterministically, they should be random and independent.

  1. Is this general gist viable?
  2. Is this particular implementation sufficient (I doubt it)?

Extended question: Suppose I want to call baz(int n, std::mt19937&) massively in parallel over a range of indexed values, something like

auto seed = rng();
parallel_for(range(0, 1 << 20),
             [&](int i) {
    auto thisRNG = std::mt19937(seed ^ i); // Deterministically set up RNGs in parallel?
    baz(i, thisRNG);

});

something like that should work, right? That is, provided we give it enough bits of state?

Update: Looking into std::seed_seq, it looks(?) like it's designed to turn not-so-random seeds into high-quality seeds: How to properly initialize a C++11 std::seed_seq

So maybe what I want something like

std::mt19937 fork(std::mt19937& rng) {
    return std::mt19937(std::seed_seq({rng()}));
}

or more generally:

//! Represents a state that can be used to generate multiple
//! distinct deterministic child std::mt19937 instances.
class rng_fork {
    std::mt19937::result_type m_seed;
public:
    rng_fork(std::mt19937& rng) : m_seed(rng()) {}  
    // Copy is explicit b/c I think it's a correctness footgun:
    explicit rng_fork(const rng_fork&) = default;

    //! Make the ith fork: a deterministic but well-seeded
    //! RNG based off the internal seed and the given index:
    std::mt19937 ith_fork(std::mt19937::result_type index) const {
        return std::mt19937(std::seed_seq({m_seed, index}));
    }
};

then the initial examples would become

auto fooRNG = fork(rng);
auto barRNG = fork(rng);
parallel_invoke([&] { fooResult = foo(fooRNG); },
                [&] { barResult = bar(barRNG); });

and

auto fork_point = rng_fork{rng};
parallel_for(range(0, 1 << 20),
             [&](int i) {
    auto thisRNG = fork_point.ith_fork(i); // Deterministically set up a RNG in parallel.
    baz(i, thisRNG);
});

Is that correct usage of std::seed_seq?

Ben
  • 9,184
  • 1
  • 43
  • 56
  • 2
    Your first example is what I would do if I wanted repeatability. – NathanOliver May 26 '22 at 01:53
  • Easiest way to "fork" `std::mt19936` is just make a copy. All the state info is in the copy so subsequent sequences match. As simple as `auto gen2=gen1;` Keep a separate copy in each thread. – doug May 26 '22 at 02:09
  • Why do you want seed to be random? Just make it fixed. – Severin Pappadeux May 26 '22 at 02:10
  • 1
    Are you going to trust your results if calls to rng() produced the same number and the same sequence? – Severin Pappadeux May 26 '22 at 02:15
  • @doug the simplest version of what I'm trying to avoid is `int foo(std::mt19936& rng) { return rng(); } int bar(std::mt19936& rng) { return rng(); }` and then doing `auto fooRNG = rng; auto barRNG = rng; return foo(fooRNG) + bar(barRNG);` <- that will always return `2 * rng()`. – Ben May 26 '22 at 02:46
  • Ah, you want rngs seeded differently, in that case you should make sure the seeds returned by `rng()` differ. Unlikely they don't but I think these seeds are just an unsigned int so you're taking a chance if you don't test that they are not the same. @SeverinPappadeux made the same point – doug May 26 '22 at 03:30

1 Answers1

3

I am aware of 3 ways to seed multiple parallel pseudo random number generators (PRNGs):

First option

Given a seed, initialize the first instance of the PRNG with seed, the second with seed+1, etc. The thing to be aware of here is that the state of the PRNGs will be initially very close in case the seed is not hashed. Some PRNGs will take a long time to diverge. See e.g. this blog post for more information.

For std::mt19937 specifically, however, this was never an issue in my tests because the initial seed is not taken as is but instead gets "mangled/hashed" (compare the documentation of the result_type constructor). So it seems to be a viable option in practice.

However, notice that there are some potential pitfalls when seeding a Mersenne Twister (which has an internal state of 624 32-bit integers) with a single 32 bit integer. For example, the first number can never be 7 or 13. See this blog post for more information. But if you do not rely on the randomness of only the first few drawn numbers but draw a more reasonable number of numbers from each PRNG, it is probably fine.

Second option

Without std::seed_seq:

Seed one "parent" PRNG. Then, to initialize N parallel PRNGs, draw N random numbers and use them as seeds. This is your initial idea where you draw 2 random numbers rng() and initialize the two std::mt19937:

std::mt19937 & rng = ...;
auto fooRNG = std::mt19937(rng()); // Seed a RNG with the output of `rng`.
auto barRNG = std::mt19937(rng());

The major issue to look out for here is the birthday problem. It essentially states that the probability to draw the same number twice is more likely than you'd intuitively think. Given a type of PRNG that has a value range of b (i.e. b different values can appear in its output), the probability p(t) to draw the same number twice when drawing t numbers can be estimated as:

p(t) ~ t^2 / (2b) for t^2 << b

(compare this post). If we stretch the estimate "a bit", just to show the basic issue: For a PRNG producing a 16 bit integer, we have b=2^16. Drawing 256 numbers results in a 50% chance to draw the same number twice according to that formula. For a 32 bit PRNG (such as std::mt19937) we need to draw 65536 numbers, and for a 64 bit integer PRNG we need to draw ~4e9 numbers to reach the 50%. Of course, this is all an estimate, so you want to draw several orders of magnitude less numbers than that. Also see this blog post for more information.

In case of seeding the parallel std::m19937 instances with this method (32 bit output and input!), that means you probably do not want to draw more than a hundred or so random numbers. Otherwise, you have a realistic chance of drawing the same number twice. Of course, you could ensure that you do not draw the same seed twice by keeping a list of already used seeds. Or use std::mt19937_64.

Additionally, there are still the potential pitfalls mentioned above regarding the seeding of a Mersenne Twister with 32 bit numbers.

With seed sequence:

The idea of std::seed_seq is to take some numbers, "mix them" and then provide them as input to the PRNG so that it can initialize its state. Since the 32 bit Mersenne Twister has a state of 624 32-bit integers, you should provide that many numbers to the seed sequence for theoretically optimal results. That way you get b=2^(624*32), meaning that you avoid the birthday problem for all practical purposes.

But in your example

std::mt19937 fork(std::mt19937& rng) {
    return std::mt19937(std::seed_seq({rng()}));
}

you provide only a single 32 bit integer. This effectively means that you hash that 32 bit number before putting it into std::mt19937. So you do not gain anything regarding the birthday problem. And the additional hashing is unnecessary because std::mt19937 already does something like this.

std::seed_seq itself is somewhat flawed, see this blog post. But I guess for practical purposes it does not really matter. A supposedly better alternative exists, but I have no experience with it.

Third option

Some PRNG algorithms such as PCG or xoshiro256++ allow to jump over a large number of random numbers fast. For example, xoshiro256++ has a period of (2^256)-1 before it repeats itself. It allows to jump ahead by 2^128 (or alternatively 2^192) numbers. So the idea would be that the first PRNG is seeded, then you create a copy of it and jump ahead by 2^128 numbers, then create a copy of that second one and jump ahead again by 2^128, etc. So each instance works in a slice of length 2^128 from the total range of 2^256. The slices are stochastically independent. This elegantly bypasses the problems with the above methods.

The standard PRNGs do have a discard(z) method to jump z values ahead. However, it is not guaranteed that the jumping will be fast. I don't know whether std::mt19937 implements fast jumping in all standard library implementations. (As far as I know, the Mersenne Twister algorithm itself does allow this in principle.)

Additional note

I found PRNGs to be surprisingly difficult to use "right". It really depends on the use case how careful you need to be and what method to choose. Think about the worst thing that could happen in your case if something goes wrong, and invest an according amount of time in researching the topic.

For ordinary scientific simulations where you require only a few dozens or so parallel instances of std::mt19937, I'd guess that the first and second option (without seed sequence) are both viable. But if you need several hundreds or even more, you should think more carefully.

Sedenion
  • 5,421
  • 2
  • 14
  • 42
  • This is exactly the sort of depth I knew I didn't have and was looking for. Thank you! My plan is to encapsulate this in a move-only class that models random-number-engine in an effort to put all the difficulty in that class and make it hard to abuse. – Ben May 31 '22 at 23:32
  • 1
    @Ben Glad I was able to help :-) If the answer is sufficient for you, could you accept it, please? – Sedenion Jun 01 '22 at 05:50