4

I need to check if any iterations of a Parallel.ForEach reach a specific point. From my understanding the following would be safe if my bool was a volatile field, but it needs to be a variable in the enclosing method:

bool anySuccess = false;
Parallel.ForEach(things, thing =>
{
    // stuff happens...
    anySuccess = true;
});
if(!anySuccess)
    throw new Exception("No things succeeded :(");

Can I use this code as-is or should I use a lock or Interlocked function?

Jussi Kosunen
  • 8,277
  • 3
  • 26
  • 34
  • If the only write to `anySuccess` is to set it as true and you never read it until after the processing has finished, then you will probably get away with it. But I would advise using a lock anyway. – stuartd Sep 30 '16 at 09:56
  • 3
    You can also use Volatile.Read at last (!anySuccess) check. – Evk Sep 30 '16 at 09:58
  • In this specific case, the only operation that can happen on anySuccess is that it is set to true. This means there never can be any race condition between different threads, like one thread setting anySuccess to false and another setting it to true at the same time. So here you don't even need locks or parallel foreach; even if you fire all threads at the same moment, nothing bad can happen. – Thern Sep 30 '16 at 10:02
  • At minimum a `Volatile.Read` at the end is necessary, otherwise the compiler is within its rights to use a cached value. Even if it doesn't, the processor's cache may simply not have caught up with the update yet. You're unlikely to hit this problem in practice, but with threaded code "unlikely" is as good as "bug". – Jeroen Mostert Sep 30 '16 at 10:15
  • @JeroenMostert: Can you explain this in more detail? The call to anySuccess is made after the whole Parallel.ForEach loop is completed. Why should the compiler be still allowed to take a cached value from a point where the loop was not executed completely? – Thern Sep 30 '16 at 10:26
  • @Nebr: you're right, I retract that. I don't actually have the fortitude to comb through the C# spec to see if the compiler is allowed, in this particular case, to cache the value of the local. (Note that there is no "loop". There's only a call to a method taking a delegate.) Fortunately it doesn't matter because the point about hardware caches stands. The compiler is certainly under no obligation to know `Parallel.ForEach` does something on another thread and to issue a volatile read, and a non-volatile read isn't enough in this case. – Jeroen Mostert Sep 30 '16 at 10:28
  • @JeroenMostert: Why should a non-volatile read be not enough in this case? Volatile just inserts a memory barrier to prevent reordering, but since the Parallel.ForEach guarantees that all threads have finished and it returns synchronously, there is no way for the compiler to reorder the reading before any of the writing commands, even without memory barriers. If the compiler would be allowed to reorder read and write issues from after the "loop" into the "loop" (I am missing the correct word for this, as it indeed is not really a loop), I would consider that a serious bug. – Thern Sep 30 '16 at 10:41
  • @Nebr: I retract *all* of my comment. I would use `lock` here because, even after all this time, I still don't truly understand what is and isn't guaranteed with concurrent access, and since it's obviously not a performance hotspot in any way in this case (the write happens only once at the end) I'd go for safety above careful armchair reasoning that this may or may not work, and leave lock-free code to the experts. See also [this question](http://stackoverflow.com/questions/21652938/) for more discussion, which doesn't really clear things up for me at all. – Jeroen Mostert Sep 30 '16 at 10:50

2 Answers2

4

If that code is the ENTIRETY of the loops' access to that boolean, then yes that seems safe to me.

In general basic value types aren't thread safe in the sense that many of the operations performed on them aren't atomic.

But if the ONLY thing you're ever doing is assigning to that variable ... never reading it, never changing branch based on it, never writing to it based on it's current state ... and all the writes are the same (the ONLY modification that can happen is to set it to true) then I don't see any way for the non-atomicity to cause a problem.

==ADDITION==

The above comments on the correctness of the current code. It is also worth considering the long-term safeness of the code in it's context as part of your codebase. Leaving that code as it is, is setting up a trap for a future developer who doesn't know/understand/recognise what is going on, and why it's CURRENTLY safe.

If you leave it like this, it will be essential to put a CLEAR comment on both the declaration and the single usage explaining what's going on, why it's safe, and why they musn't start reading/using the variable in other ways.

The alternative (adding locking code) is long-term safer, but likely to also be very slightly less performant.

Brondahl
  • 7,402
  • 5
  • 45
  • 74
  • And what if thread which reads anySuccess wont see it was changed, because it will read value (false) cached in CPU register? Don't you need Volatile.Read there? – Evk Sep 30 '16 at 10:14
  • @Evk Nothing IS reading it ... that's the entirety of the point of my answer. – Brondahl Sep 30 '16 at 10:24
  • How's that? And if(!anySuccess) block? – Evk Sep 30 '16 at 10:25
  • @Evk I feel pretty confident that any variable modifications would be guaranteed to be complete once all the Parallel Threads have resolved (i.e. before the Parallel.ForEach method has completed). And thus that the next line (the if) is guaranteed to get the right value. Comments on the question itself seem to be debating that, and I'm unclear on the conclusion that they reach. – Brondahl Sep 30 '16 at 10:30
  • @Brondahl do you have any documentation supporting that guarantee? I was thinking in the same thing and I am not sure it is that way. – vtortola Sep 30 '16 at 10:48
  • @vtortola no, I don't. And it seems like Jereon and Nebr are a lot close to having it than I am. – Brondahl Sep 30 '16 at 11:47
  • 3
    @vrtola Parallel for each obviously is a synchronization poin (which implies a full memory barrier), otherwise the whole method would be pointless and also impossible to implement. Same goes for thread.Join and all the task equivalents. – Voo Sep 30 '16 at 12:23
-2

I would lock the bool. Also, I would suggest moving your if statement into the Parallel.Foreach block. Reason:

This will evaluate after the parallel loop, thus reading only the last update of anySuccess

bool anySuccess = false;
Parallel.ForEach(things, thing =>
{
    // stuff happens...
    anySuccess = true;
});
if(!anySuccess)
    throw new Exception("No things succeeded :(");

This will evaluate all updates of the bool

Object myLock = new Object();

bool anySuccess = false;
Parallel.ForEach(things, thing =>
{
    // stuff happens...
    lock (myLock) 
    {
        anySuccess = true;
    }
    if(!anySuccess)
        throw new Exception("No things succeeded :(");
});
ma11achy
  • 133
  • 2
  • 14
  • This is a change of functionality. Given that anySuccess is not set to true for the first "thing" reaching this point, this will terminate immediately, even if not all things have been processed yet. With the original code, it is ensured that all things are processed correctly. – Thern Sep 30 '16 at 10:17