17

I have a C++11 program that needs to create several independent random generators, for use by different threads in a parallel computation. These generators should be initialized with different seed values so that they all produce different pseudo-random sequences.

I see that there's a std::seed_seq class that seems to be meant for this purpose, but it's not clear what's the right way to construct one. The examples I've seen, such as the one on cppreference.com, initialize it with a handful of integer constants hard-coded in the program:

std::seed_seq seq{1,2,3,4,5};

I doubt that's actually a recommended best practice, so I'm wondering what is the recommended practice. In particular:

  • Since a seed_seq can be initialized with an arbitrary number of integers, what's the significance of the length of its initializer list? If I want to produce seeds for 100 random generators, do I need to initialize my seed_seq with 100 integers?
  • If the length of the initializer list doesn't have to match the number of seeds I intend to generate, is it OK to initialize a seed_seq with just one integer and then use it to produce a large number of seeds?
  • How about initializing with no integers, i.e. using the default constructor? (This means I'd get the same seeds every time, of course.)
  • If it's OK to construct a seed_seq from a single integer and then generate lots of seeds from it, what's the benefit of using seed_seq instead of an ordinary random generator? Why not just construct a std::mt19937 from that single integer and use that to produce seed values for other generators?
Wyzard

2 Answers

11

The trouble with using a fixed sequence like that is that you get the same sequence of seeds out of it on every run, much as if you had called srand(42) at the start of your program: the generators produce identical sequences each time.

The C++11 standard states (in section 26.5.7.1 Class seed_seq):

A seed sequence is an object that consumes a sequence of integer-valued data and produces a requested number of unsigned integer values i, 0 ≤ i < 2^32, based on the consumed data.

[Note: Such an object provides a mechanism to avoid replication of streams of random variates. This can be useful, for example, in applications requiring large numbers of random number engines. —end note]

It also states how those integers are turned into seeds in paragraph 8 of that section, in such a way that the distribution of those seeds is acceptable even if the integer input items are very similar. So you can probably think of it as a pseudo-random number generator for seed values.

A larger number of items will provide more "randomness" in the seed values, provided they have some randomness themselves. Using constants as input is a bad idea for this reason.
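To illustrate the point about similar inputs, here is a minimal sketch. The helper name expand_seed is made up for this example; it wraps a single integer in a seed_seq and asks for a given number of 32-bit words. Because the standard fixes the algorithm, the same input always yields the same words, while neighbouring inputs such as 1 and 2 should yield unrelated-looking ones.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not part of the standard library): expands one
// integer into `count` 32-bit seed words using std::seed_seq.
std::vector<std::uint32_t> expand_seed(std::uint32_t value, std::size_t count)
{
    std::seed_seq seq{value};                  // a single input item
    std::vector<std::uint32_t> words(count);
    seq.generate(words.begin(), words.end());  // fills the whole range
    return words;
}
```

Printing expand_seed(1, 8) next to expand_seed(2, 8) is a quick way to see how differently two adjacent inputs are mixed.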

What I tend to do is very similar to the way you normally randomise one generator, with srand (time (0)). In other words:

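#include <cstdint>
#include <ctime>
#include <iostream>
#include <random>
#include <vector>

int main()
{
    // Use the current time as the single entropy input.
    std::seed_seq seq{std::time(nullptr)};

    // Ask the seed sequence for ten 32-bit seed values.
    std::vector<std::uint32_t> seeds(10);
    seq.generate(seeds.begin(), seeds.end());

    for (std::uint32_t n : seeds) {
        std::cout << n << '\n';
    }
}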

If you have multiple sources of randomness, such as a value read from /dev/random under Linux, or a white noise generator of some description, or the average number of milliseconds between keypresses the last time a user ran this program, you could use those as extra inputs:

std::seed_seq seq{time(0), valFromDevRandom(), getWhiteNoise(), avgMillis()};

but I doubt constants are the way to go, since they add no randomness to the equation.
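As a rough sketch of how several entropy words might be turned into a set of engines: the helpers gather_entropy and make_engines below are hypothetical names, and using std::random_device as the entropy source is an assumption of this example rather than something stated above. All seeds are drawn in a single generate() call, because calling generate() again on the same seed_seq would reproduce the same values.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: collects `n` words from std::random_device.
// Assumption: random_device is non-deterministic on the target platform;
// the standard permits implementations where it is not.
std::vector<std::uint32_t> gather_entropy(std::size_t n)
{
    std::random_device rd;
    std::vector<std::uint32_t> words(n);
    for (auto& w : words) {
        w = static_cast<std::uint32_t>(rd());
    }
    return words;
}

// Hypothetical helper: builds `count` engines, one seed word per engine,
// all drawn from a single seed_seq fed with the given entropy words.
std::vector<std::mt19937> make_engines(const std::vector<std::uint32_t>& entropy,
                                       std::size_t count)
{
    std::seed_seq seq(entropy.begin(), entropy.end());
    std::vector<std::uint32_t> seeds(count);
    seq.generate(seeds.begin(), seeds.end());  // one call for all seeds

    std::vector<std::mt19937> engines;
    engines.reserve(count);
    for (std::uint32_t s : seeds) {
        // The cast selects the integer-seed constructor unambiguously.
        engines.emplace_back(static_cast<std::mt19937::result_type>(s));
    }
    return engines;
}
```

A call such as make_engines(gather_entropy(8), 100) would then give one engine per thread. Passing fixed entropy words instead keeps the run reproducible.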

paxdiablo
  • Hmm… initializing with multiple entropy sources makes sense, and I guess initializing with just one integer shouldn't be any worse than seeding a plain random generator with a single integer. – Wyzard Mar 20 '14 at 03:55
  • The standard actually specifies the exact algorithm used by `std::seed_seq` (see rand.util.seedseq paragraph 8), and [I'm told](http://stackoverflow.com/a/15509942/365496) this algorithm is a warm-up sequence for a random number generator. – bames53 Mar 20 '14 at 04:26
  • @bames53, hmm, that example in your answer looks like I should be using a whole separate `seed_seq` for each generator, to ensure that each generator's initial state is fully warmed-up. I was thinking I'd use a single `seed_seq` to produce one seed integer per generator. – Wyzard Mar 20 '14 at 05:34
  • Maybe I ought to create one `seed_seq` from a suitable entropy source and use it to generate a single seed integer for each thread, then within each thread, use the integer to create *another* `seed_seq` with which to seed the thread's generator. – Wyzard Mar 20 '14 at 05:45
  • @Wyzard, or you could go even further and use the output of that last `seed_seq` to instantiate yet another `seed_seq` from which you would extract values for instantiating the actual number generators. At some point, you've got to stop and just start using the numbers :-) – paxdiablo Mar 20 '14 at 05:58
  • I liked it better when I was using Boost Random and I could just pass one generator object to the `seed` method of another. (Though, in retrospect, I think that was more of a lucky accident than an actual feature.) – Wyzard Mar 20 '14 at 06:07
  • @Wyzard I think creating many generators from a single seed_seq is fine. You can put as much random data into a seed seq as you like. – bames53 Mar 20 '14 at 14:30
  • This is a very old, unaccepted answer. I downvoted for many reasons, (1) *Using constants as input is a bad idea for this reason.* There are many applications where repeatability is not just desirable but is essential. Consider the case of a Monte Carlo simulation of a docking spacecraft, where 10 out of 10000 runs result in disaster. This result may or may not be acceptable. In either case, it would be nice to be able to reproduce each of those ten disaster cases. This would not be possible if /dev/random was used to seed the PRNG. – David Hammen Jun 15 '20 at 11:06
  • Reason (2) Using `time(0)` is widely regarded as a very bad way to seed a PRNG. (3) The answer mentions `/dev/random` but fails to mention `std::random_device`. (4) Why just four random values? Why not 624, the size of `std::mt19937`'s internal state? – David Hammen Jun 15 '20 at 11:11
  • @David, (1) If you want repeatability, then that's *not* random so, yes, you can use non-random seeds for that. *This* question was calling for random values so random seeds are the best way to go. (2) Without a citation for support, this is meaningless. I know full well that time is a bad seed for security purposes but it's perfectly okay for many general purposes. In any case, I state that *multiple* entropy sources are preferable, of which current time is only one. (3/4) Those were just *examples* of entropy sources. If you have 624 of them available to you, by all means use them. – paxdiablo Jun 15 '20 at 12:37
0

According to the C++11 standard (section 26.5.7.1, paragraph 8), seed_seq generates its output with what looks like a hash function, so the values appear uniformly and randomly distributed over the range.

I'll try to answer the questions below:

Q1. "Since a seed_seq can be initialized with an arbitrary number of integers, what's the significance of the length of its initializer list? If I want to produce seeds for 100 random generators, do I need to initialize my seed_seq with 100 integers?"

A1. You don't need to initialize seed_seq with a lot of integers. Even a seed_seq initialized with a single random integer produces a sequence that keeps its randomness. But if you initialize it with more integers drawn from a wider range, it becomes harder for an attacker to find a colliding sequence.

Q2. "If the length of the initializer list doesn't have to match the number of seeds I intend to generate, is it OK to initialize a seed_seq with just one integer and then use it to produce a large number of seeds?"

A2. Yes, it is OK to initialize a seed_seq with just one integer if you don't need cryptographic-level security.

Q3. "How about initializing with no integers, i.e. using the default constructor? (This means I'd get the same seeds every time, of course.)"

A3. A default-constructed seed_seq produces identical sequences on every run, so it can become a security hole.
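A small sketch of what the default constructor gives you; the helper name default_seeds is made up for this example. A default-constructed seed_seq stores no input data, so its output depends only on how many values are requested.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: seeds produced by a default-constructed seed_seq.
// No input data is stored (size() == 0), so the result is the same on
// every run of the program.
std::vector<std::uint32_t> default_seeds(std::size_t count)
{
    std::seed_seq seq;                         // default constructor: no entropy
    std::vector<std::uint32_t> seeds(count);
    seq.generate(seeds.begin(), seeds.end());
    return seeds;
}
```

This is fine when repeatable runs are the goal, but it offers nothing when unpredictability matters.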

Q4. "If it's OK to construct a seed_seq from a single integer and then generate lots of seeds from it, what's the benefit of using seed_seq instead of an ordinary random generator? Why not just construct a std::mt19937 from that single integer and use that to produce seed values for other generators?"

A4. seed_seq is a lightweight algorithm that iterates over the filled sequence only 3 times. I guess you could use another random generator instead of seed_seq.
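For comparison, the alternative raised in Q4 could be sketched as follows. The helper name seeds_from_engine is hypothetical, and this only shows the mechanics; it does not establish whether the resulting seeds are statistically as good as those from seed_seq.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: uses an ordinary std::mt19937, seeded with one
// master value, as the source of seeds for other generators.
std::vector<std::uint32_t> seeds_from_engine(std::uint32_t master, std::size_t count)
{
    // The cast selects the integer-seed constructor unambiguously.
    std::mt19937 source(static_cast<std::mt19937::result_type>(master));
    std::vector<std::uint32_t> seeds(count);
    for (auto& s : seeds) {
        s = static_cast<std::uint32_t>(source());  // mt19937 yields 32-bit values
    }
    return seeds;
}
```

Like seed_seq with a fixed input, this is fully determined by the master value, so the same master always gives the same seeds.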

ligand