15

I'm trying to implement the Parallel.ForEach pattern and track progress, but I'm missing something regarding locking. The following example counts to 1000 when the threadCount = 1, but not when the threadCount > 1. What is the correct way to do this?

class Program
{
   static void Main()
   {
      var progress = new Progress();
      var ids = Enumerable.Range(1, 10000);
      var threadCount = 2;

      Parallel.ForEach(ids, new ParallelOptions { MaxDegreeOfParallelism = threadCount }, id => { progress.CurrentCount++; });

      Console.WriteLine("Threads: {0}, Count: {1}", threadCount, progress.CurrentCount);
      Console.ReadKey();
   }
}

internal class Progress
{
   private Object _lock = new Object();
   private int _currentCount;
   public int CurrentCount
   {
      get
      {
         lock (_lock)
         {
            return _currentCount;
         }
      }
      set
      {
         lock (_lock)
         {
            _currentCount = value;
         }
      }
   }
}
John Landheer
  • 3,999
  • 4
  • 29
  • 53

6 Answers6

27

The usual problem with calling something like count++ from multiple threads (which share the count variable) is that this sequence of events can happen:

  1. Thread A reads the value of count.
  2. Thread B reads the value of count.
  3. Thread A increments its local copy.
  4. Thread B increments its local copy.
  5. Thread A writes the incremented value back to count.
  6. Thread B writes the incremented value back to count.

This way, the value written by thread A is overwritten by thread B, so the value is actually incremented only once.

Your code adds locks around operations 1, 2 (get) and 5, 6 (set), but that does nothing to prevent the problematic sequence of events.

What you need to do is to lock the whole operation, so that while thread A is incrementing the value, thread B can't access it at all:

lock (progressLock)
{
    progress.CurrentCount++;
}

If you know that you will only need incrementing, you could create a method on Progress that encapsulates this.

svick
  • 236,525
  • 50
  • 385
  • 514
26

Old question, but I think there is a better answer.

You can report progress using Interlocked.Increment(ref progress) that way you do not have to worry about locking the write operation to progress.

Roy T.
  • 9,429
  • 2
  • 48
  • 70
  • Should actually be the accepted answer sir. Adding some "random" lock in your code (incrementing a counter seems to be a good example) HAS TO be avoided everytime it's possible. Interlocked.Increment gives you this opportunity. – Rayyyz Mar 01 '21 at 19:22
1

The easiest solution would actually have been to replace the property with a field, and

lock { ++progress.CurrentCount; }

(I personally prefer the look of the preincrement over the postincrement, as the "++." thing clashes in my mind! But the postincrement would of course work the same.)

This would have the additional benefit of decreasing overhead and contention, since updating a field is faster than calling a method that updates a field.

Of course, encapsulating it as a property can have advantages too. IMO, since field and property syntax is identical, the ONLY advantage of using a property over a field, when the property is autoimplemented or equivalent, is when you have a scenario where you may want to deploy one assembly without having to build and deploy dependent assemblies anew. Otherwise, you may as well use faster fields! If the need arises to check a value or add a side effect, you simply convert the field to a property and build again. Therefore, in many practical cases, there is no penalty to using a field.

However, we are living in a time where many development teams operate dogmatically, and use tools like StyleCop to enforce their dogmatism. Such tools, unlike coders, are not smart enough to judge when using a field is acceptable, so invariably the "rule that is simple enough for even StyleCop to check" becomes "encapsulate fields as properties", "don't use public fields" et cetera...

The Dag
  • 1,811
  • 16
  • 22
0

Remove lock statements from properties and modify Main body:

 object sync = new object();
        Parallel.ForEach(ids, new ParallelOptions {MaxDegreeOfParallelism = threadCount},
                         id =>
                             {
                                 lock(sync)
                                 progress.CurrentCount++;
                             });
chameleon
  • 984
  • 10
  • 15
  • I think lock statement in loop body is atomic. Lock statement in properties is a separated, because propery is compiled in method. – chameleon Jan 24 '13 at 13:08
  • Because the increment operator involves both the call to the getter and the setter - x++ is shorthand for x = x + 1 - so the lock is taken before the value is read and released after it's updated. But I still think a field would be better. :) – The Dag Mar 25 '14 at 09:04
0

The issue here is that ++ is not atomic - one thread can read and increment the value between another thread reading the value and it storing the (now incorrect) incremented value. This is probably compounded by the fact there's a property wrapping your int.

e.g.

Thread 1        Thread 2
reads 5         .
.               reads 5
.               writes 6
writes 6!       .

The locks around the setter and getter don't help this, as there's nothing to stop the lock blocks themseves being called out of order.

Ordinarily, I'd suggest using Interlocked.Increment, but you can't use this with a property.

Instead, you could expose _lock and have the lock block be around the progress.CurrentCount++; call.

Rawling
  • 49,248
  • 7
  • 89
  • 127
  • I think exposing the lock is a bad idea, because it means anyone can use it, which can easily lead to deadlocks. I think a better solution would be to have the lock directly in `Main()`. – svick Jan 24 '13 at 12:52
  • If you've got code elsewhere that's doing something else with the `int`, it's easier for that code to lock on the same object rather than trying to share a separate lock object between everything. – Rawling Jan 24 '13 at 12:56
0

It is better to store any database or file system operation in a local buffer variable instead of locking it. locking reduces performance.

Ghebrehiywet
  • 884
  • 3
  • 12
  • 20