Using Random in Parallel.For

Question

My program performs many simulations, each of which calls for many random numbers to be generated. The serial method is straightforward and works. However, in my pursuit of parallelizing the work I believe I created a more straightforward method than what I could find. The other methods are somewhat dated and something might be possible now that wasn't possible then.

Am I missing something that will make my method susceptible to any of the myriad of multithreading problems? My method uses the ability of a Parallel.For to instantiate a variable for individual thread use and thus it doesn't require another class like the other methods I found. In this case each thread gets its own Random.

Timing:

My method: 4s

Stephen: 14s

Jon: 16s

Clearly I don't know as much as Stephen or Jon so I'm concerned I missed something.

My method:

Random rnd = new Random();
int trials = 1_000_000;

private readonly object globalLock = new object();
ParallelOptions po = new ParallelOptions();
po.MaxDegreeOfParallelism = 4;

await Task.Run(() =>
{
    Parallel.For<Random>(0, trials, po, 
    () => { lock(globalLock){ return new Random(rnd.Next()); } }, 
    (i, loop, local) =>
    {
        for (int work = 0; work < 1000; work++)
        {
            local.Next();
        }

        return local;
    },
        (x) => { }
    );
});

This next method is by Stephen Toub on the MSDN Blog:

public static class RandomGen2
{
    private static Random _global = new Random();
    [ThreadStatic]
    private static Random _local;

    public static int Next()
    {
        Random inst = _local;
        if (inst == null)
        {
            int seed;
            lock (_global) seed = _global.Next();
            _local = inst = new Random(seed);
        }
        return inst.Next();
     }
}

await Task.Run(() =>
{
    Parallel.For(0, trials, i =>
    {
        for (int work = 0; work < 1000; work++)
        {
            RandomGen2.Next();
        }
    });

});

This next method is by Jon Skeet on his blog:

public static class ThreadLocalRandom
{
    private static readonly Random globalRandom = new Random();
    private static readonly object globalLock = new object();

    private static readonly ThreadLocal<Random> threadRandom = new ThreadLocal<Random>(NewRandom);

    public static Random NewRandom()
    {
        lock (globalLock)
        {
            return new Random(globalRandom.Next());
        }
    }

    public static Random Instance { get { return threadRandom.Value; } }

    public static int Next()
    {
        return Instance.Next();
    }
}

await Task.Run(() =>
{
    Parallel.For(0, trials, i =>
    {
        for (int work = 0; work < 1000; work++)
        {
            ThreadLocalRandom.Instance.Next();
        }
    });
});

Update/answer: Brian has pointed out that I was using Jon's method incorrectly. A more correct way would be to call an ThreadLocalRandom.Instance for each Parallel.For loop and use that instance for the internal for loop. This prevents the thread check on each call and instead there is only one thread check per Parallel.For loop. Using Jon's method correctly makes his method faster than the overload of Parallel.For that I was using.

This isn't threadsafe; you can't use one instance of Random on multiple threads, but that's exactly what you're doing, to make the seeds. Notice in both Stephen and Jon's solutions they have a lock around all usages of the seeding random. — Eric Lippert, Jan 24 '18 at 23:27

Eric Lippert · Answer 1 · 2018-01-24T23:38:20.927

5

However, in my pursuit of parallelizing the work I believe I found a more straightforward method than what I could find.

It's more straightforward, but wrong.

The other methods are somewhat dated.

What on earth does this even mean?

Am I missing something that will make my method susceptible to any of the myriad of multithreading problems?

The most basic rule of thread safety is: you can't use a not-threadsafe object on multiple threads without a lock. Random is not threadsafe, but you use the same one on every thread to calculate the seed.

Note that Jon and Stephen's "dated" methods correctly lock around the seeding random.

Clearly I don't know as much as Stephen or Jon so I'm concerned I missed something.

First, you should thoroughly internalize the basic rules of thread safety before writing any more multithreaded code.

Second, your attitude was your error. The correct attitude is: Jon and Stephen are both experts and their solutions contain no unnecessary parts. If you think that you've found a solution that lacks a part that their solutions have, then you need to explain why your solution does not need the part that their solution has.

edited Jan 24 '18 at 23:38

answered Jan 24 '18 at 23:31

Eric Lippert

647,829
179
1,238
2,067

2

Thanks for answering. Dated means several years old and something might be possible now that wasn't possible then. I did miss that problem. I'll update with a fix. – Jan 24 '18 at 23:37
Of the nearly 200 words in this answer only 37 answer the question at all. I would expect this kind of hostile response from a Linux Usenet forum in the 90's not StackExchange. – Sobachatina Jan 25 '18 at 18:01
2

@Sobachatina: No hostility is intended; the question was "is this code not safe?" and the answer is clearly given: the most basic aspect of thread safety is wrong. **I study wrong code for a living**. I design static analyzers which identify wrong code, and I study *what user factors lead people to write wrong code*. An important aspect of wrong code production is the attitude that one need not understand an existing solution in order to "improve" upon it. Now, if you have a better answer that you'd like to see, *post that answer*, and then we all benefit. – Eric Lippert Jan 25 '18 at 18:05
@EricLippert- Thank you for your perspective. A better answer is the one that was accepted. It explains when the lock is necessary and why one way was faster than the other in an unimpassioned tone. This answer started by saying the solution is wrong (without justification), criticized his choice of words, and then ended by saying that he shouldn't even try until he's an expert. I appreciate that you didn't mean to sound hostile. – Sobachatina Jan 25 '18 at 18:16
@Sobachatina: Absolutely no one should write multithreading code until they are expert programmers who have carefully studied multiple memory models and have a thorough, ingrained understanding of the models enforced by both chips and compilers. Multithreaded code is some of the hardest code there is to write. As I noted, I study how code comes to be wrong, and from two decades of study I have concluded that I am certainly not qualified to write anything but the very simplest, most obviously correct multithreaded code. – Eric Lippert Jan 25 '18 at 18:19
@EricLippert- Fair enough. I agree it's complicated. I suppose I just bristle whenever I hear someone say that I shouldn't even try until I'm an expert at something. You become an expert only by doing- studying will only get you part way. Also- there is a lot of very low risk multithreaded software out there. It's not like he's writing custom encryption algorithms here. (At least I hope he isn't.) – Sobachatina Jan 25 '18 at 18:23
@Sobachatina I've read through Brian's answer several times now. I've yet to see where it explains what the OP did wrong, why it's not safe, how to fix it and/or how the fix works. All I see it saying is that the OP's code is correct. I know that it hurts to be told that you made a mistake, but no, it's not nicer to just ignore the mistakes someone makes and pretend that they didn't make any. All the more so when they're *asking what mistakes they've made*. – Servy Jan 25 '18 at 19:10
1

@Servy: The original answer by the OP was missing a lock. My answer was posted after the OP had edited his question to fix this issue. – Brian Jan 25 '18 at 19:12
@Brian I'm aware. The point being that radically changing the question after getting an answer, and then saying that an answer posted before the change isn't good isn't appropriate. (And of course editing a question into a different question isn't appropriate on SO to begin with.) But all of that aside, Sobachatina specifically said that your answer explained that mistake, but with a better tone. Since it doesn't, that claim is false. – Servy Jan 25 '18 at 19:16

Brian · Accepted Answer · 2022-11-23T19:38:47.707

Your code is faster because it is simpler. Your code gives each loop a dedicated instance of Random. Jon and Stephen's code do so as well, but in their code every access to Random must check which thread is in use and then pull the correct instance of Random. Stephen's code is faster than Jon's code because ThreadLocal (which is a wrapper around ThreadStatic) is slightly slower.

However, the nice thing about their code is that their code provides a simple replacement for Random. Your approach puts the responsibility on the parallel code to initialize Random. In a real-world problem, carrying an instance of Random across the various supporting functions is kind of a hassle compared to having a static, thread-safe Random service.

In a real-world tasks, your functionality is probably not dominated by calls to Random. So under normal circumstances the slight performance loss from their code is fine.

I recommend ThreadLocal<T> over ThreadStatic (see ThreadStatic v.s. ThreadLocal<T>: is generic better than attribute? for discussion).

By the way, please never use lock with anything except a dedicated lock object. Like Jon (https://codeblog.jonskeet.uk/2008/12/05/redesigning-system-object-java-lang-object/), I really wish lock did not even support arbitrary objects.

Update:
.NET 6 adds Random.Shared, which "provides a thread-safe Random instance that may be used concurrently from any thread. " For new code, Random.Shared offers a simplified way to use Random in a thread-safe manner.

My use is real-world but is dominated by generating random numbers. I appreciate the explanation and do see the extra utility from the other methods. — , Jan 25 '18 at 15:25
@AdamKingsley: I still recommend Jon's code over yours. If you're unsatisfied with his performance, replace `for (int work = 0; work < 1000; work++) {ThreadLocalRandom.Instance.Next();}` with `Random rand = ThreadLocalRandom.Instance;for (int work = 0; work < 1000; work++){rand.Next();}` (which, incidentally, seems to be even faster than your code). — Brian, Jan 25 '18 at 18:58
I will spend some time putting your implementation of his code in the actual simulation generation and see if the performance is as fast. I would always rather use Jon's code. — , Jan 25 '18 at 19:10
I implemented it the way you stated. It is indeed very fast and easier to read. The implementation of his code in that way is what I had missed this whole time. Thanks. — , Jan 25 '18 at 23:01

Using Random in Parallel.For

2 Answers2