2

I'm having some hard time finding convincing sources on the thread safety of random number generation in C++.

The application I am working on currently has few threads accessing the random number generation, but it might have more in the future.

The first thing used was:

int randomInRange(const int min, const int max)
{
    srand(time(0));
    return min + (rand() % max);
}

From what I found out this is wrong in the sense that because the seed is set for every call, it impacts the statistical randomness of the rand() function and performance negatively.

Next thing tried was using a singleton which holds a single instance of std::mt19937:

// RandomHelper.h
class RandomHelper
{
private:
    std::mt19937 engine;

    // Singleton
    RandomHelper() : engine(std::random_device()()) {}

public:
    static RandomHelper& getInstance();

    int randomInRange(int min, int max)
    {
        std::uniform_int_distribution<int> uidist(min, max);
        return uidist(engine);
    }
};

// RandomHelper.cpp
RandomHelper& RandomHelper::getInstance()
{
    static RandomHelper instance;
    return instance;
}

Which should help with the statistical randomness. But for this I understand that it would be better to have a std::mt19937 instance per-thread.

I am not sure I understand why.

Is this because if multiple threads would call RandomHelper::getInstance().randomInRange(.., ..) at the exact same time they would get the same result? AKA it's not thread-safe?

To avoid the possibility of this it is now implemented as the following:

int randomInRange(const int min, const int max)
{
    thread_local std::mt19937 engine(std::random_device{}());
    std::uniform_int_distribution<int> uidist(min, max);
    return uidist(engine);
}

This looks to be the best solution so far, but I can't find much information on the thread safety of std::mt19937 or std::random_device to see whether it's actually needed to have a thread_local variant of those versus the single instance in the singleton.

I am willing to do more research but I have no idea on where to look other than the usual C++ references which don't mention anything about thread-safety as far as I can see.

Thizzer
  • 16,153
  • 28
  • 98
  • 139
  • Calling `srand( time( 0 ) );` every time is a very, very bad idea. And having one `std::mt19937` per thread seems a good idea, but you need to seed them with a different seed. – Jabberwocky Feb 28 '20 at 11:12
  • Would [this](https://stackoverflow.com/questions/53040940/why-is-the-new-random-library-better-than-stdrand) help you...? – NutCracker Feb 28 '20 at 11:13
  • And [this](https://stackoverflow.com/questions/40655814/is-mersenne-twister-thread-safe-for-cpp) should help too – Jabberwocky Feb 28 '20 at 11:15
  • @NutCracker From what I gather that mostly mentions that rand() isn't safe to use. And shows how to use the standard libraries, but doesn't mention anything about the thread safety of the engine. – Thizzer Feb 28 '20 at 11:16
  • @Jabberwocky Unfortunately that question has one answer saying it isn't thread safe and another saying it is. So I am not sure where to go with that. – Thizzer Feb 28 '20 at 11:17
  • I'd assume it's not thread safe and have one PRNG per thread, then you'll be on the safe side. – Jabberwocky Feb 28 '20 at 11:19
  • @Jabberwocky That is what I currently assume as well, but was hoping to find some documentation or alike to show me that it is needed without having to assume. – Thizzer Feb 28 '20 at 11:20
  • Have you read [How do I generate thread-safe uniform random numbers?](https://stackoverflow.com/questions/21237905/how-do-i-generate-thread-safe-uniform-random-numbers)? – David C. Rankin Feb 28 '20 at 11:23
  • I didn't see that one yet, there the consensus also appears to be 'assume that it is not thread safe'. So at least for now it looks like I'd have to stay with assuming that. Was hoping there'd be some documentation I'd missed to give me a final answer. – Thizzer Feb 28 '20 at 11:28

1 Answers1

2

Random number generator engines are not thread safe.

Using thread_local to initialize one engine per thread makes it thread safe, so I would say your solution is good.

If you have doubts about the thread safety of random_device, maybe make it thread_local also:

int randomInRange(const int min, const int max)
{
    thread_local std::random_device rd;
    thread_local std::mt19937 rng(rd());  
    thread_local std::uniform_int_distribution<int> uid;
    return uid(rng, decltype(uid)::param_type{min,max});
}

There are more threads that talk about this that weren't mentioned:

C++ thread-safe uniform distribution random number generation

C++11 Thread safety of Random number generators

I'm sure that there are more.

anastaciu
  • 23,467
  • 7
  • 28
  • 53
  • Do you know if there is any authoritative reference on the thread safety of the random number generators? – Thizzer Feb 28 '20 at 12:05
  • I also see some answers using the thread-id to initialise the seed. So for example: `std::hash()(std::this_thread::get_id())` or even `std::random_device()() + std::hash()(std::this_thread::get_id())`. Is that required or is random_device capable or creating a proper per-thread seed? – Thizzer Feb 28 '20 at 12:06
  • 1
    Q1: as far as I can tell, not specifically, maybe it exists, burried in the standard, but I don't see it, at least not in the Random number generation sub-chapter. Q2, Yes, random_device is perfectly capable, using thread id is basically the same thing, I would pick one, though, don't use both like the 2nd piece of code. – anastaciu Feb 28 '20 at 12:59
  • On https://en.cppreference.com/w/cpp/numeric/random/random_device it states *"std::random_device may be implemented in terms of an implementation-defined pseudo-random number engine ... In this case each std::random_device object may generate the same number sequence"* so that sounds to me like it would be safer to do both? – Thizzer Feb 28 '20 at 13:11
  • But do you have more than one implementation of random_device? – anastaciu Feb 28 '20 at 13:27
  • This will only be a potencial problem if you have another random_device with the same min and max implemented elsewhere and you need that sequence to be different. – anastaciu Feb 28 '20 at 13:35
  • 1
    @Thizzer To clarify, adding two seeds is something I wouldn't do, I have no proof that this won't result, for instance, in overflow, so I would use one or the other, if you have a situation like the described above, I would use thread id only. – anastaciu Feb 28 '20 at 13:38
  • 1
    @Thizzer -- the authoritative source is that there is nothing in the C++ standard that says that the standard random number generators are thread-safe. Types defined in the standard library are not thread-safe unless there is a specific requirement in the standard. – Pete Becker Feb 28 '20 at 14:12
  • @PeteBecker That's a good way to look at it. I guess the worse assumption would indeed be to assume that they are thread-safe while it isn't documented anywhere. – Thizzer Feb 28 '20 at 14:15
  • @anastaciu From how I understand the documentation it means that on some platforms/implementations the random_device might generate the same number sequence if it can't read from a required hardware source. – Thizzer Feb 28 '20 at 14:18
  • 1
    @Thizzer, you understand correctly, but only if you have another random_device implementation linked to a random number generator with the same min and max values will it generate the same sequence of integers, so unless this is the case in your program, and this second implementation must have a different sequence from the first, this will not be an issue. – anastaciu Feb 28 '20 at 14:45
  • @anastaciu No that's not the case, I am using just the default `std::random_device` implementation, so I guess that should be good. Just to confirm, if the program were to construct to `std::random_device` instances at the exact same time, the seed would still differ? – Thizzer Feb 28 '20 at 15:04
  • 1
    @Thizzer `result_type operator()() Returns: A non-deterministic random value, uniformly distributed between min() and max(), inclusive.It is implementation-defined how these values are generated`. So I would say no equal numbers are generated and this is not related to time. I'm a bit short in time, but you can check http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3797.pdf chapter 26.5, You can also check [std::seed_seq](https://en.cppreference.com/w/cpp/numeric/random/seed_seq) looks like a nice alternative too. – anastaciu Feb 28 '20 at 15:39