I have this DataTable:

DataTable dt = GetDatatTable();

One of its columns is Amount (decimal).

I want to sum it as fast as I can using the TPL.

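  object obj = new object();
  var total = 0m;
  // The row indexer returns object, so convert it rather than passing it to Decimal.Parse.
  Parallel.For(1, dt.Rows.Count + 1, i => { lock (obj) { total += Convert.ToDecimal(dt.Rows[i - 1]["Amount"]); } });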

But I really don't want to take the lock that many times.

Question #1

Is there an alternative that reduces the amount of locking?

Question #2

I don't understand why I need to protect the total accumulator.

  • Is the protection needed for the += itself, or for multiple threads updating total?

    I mean, look at the following flow; it seems a volatile field could solve it easily.

    let's say total=0
    and the DataTable items are 1,2,3

    1) first thread: total = total + 1 (total = 1)

    2) second thread: total = total + (stop: context switch, thread 3 comes in with value 3, so total = 1 + 3 = 4)

    3) context switch back to thread 2: total = 4 + 2 = 6

    So everything seems to be fine.

I must be missing something here.

P.S. I know I can do it with:

ParallelEnumerable.Range(1, dt.Rows.Count).Sum(i => Convert.ToDecimal(dt.Rows[i - 1]["Amount"]))

But I want to learn how to do it with Parallel.For.

Royi Namir
  • There's a little comment about `volatile` [here](http://stackoverflow.com/a/154803/1180426). You could look into [PLINQ](http://msdn.microsoft.com/en-us/magazine/cc163329.aspx) (that is, `AsParallel().Sum(x => ...)`), if that fits your needs of course; I haven't even touched raw data tables in a long time, so it's hard for me to say what the best course of action is... – Patryk Ćwiek Jan 29 '13 at 08:03
  • Alright, then personally I see no other easy way but to use locks in this particular situation. Maybe someone else will have a better idea :) – Patryk Ćwiek Jan 29 '13 at 08:07
  • BTW, I think you have an off-by-one bug in your code. If you want to have 1-based `i` (though I don't see any reason for that here), you need `Parallel.For(1, count + 1, …)`. – svick Jan 29 '13 at 13:24
  • @svick Yeah, you're right. I wrote it on the fly just for the question, but I'll fix it. Thanks. – Royi Namir Jan 29 '13 at 13:25

2 Answers

Since you need to use locking to ensure a correct result, I don't think Parallel.For is buying you anything. You cannot lock something in parallel; by definition, locking is done in series.

So, a simple for loop would be just as performant, and much easier to work with.
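
As a rough sketch of that sequential baseline (the class and method names here are hypothetical, and it assumes the Amount column holds values that `Convert.ToDecimal` accepts), the loop could look like this:

```csharp
using System;
using System.Data;

public static class SequentialSum
{
    // Hypothetical helper: sums the "Amount" column on a single thread, no locking needed.
    public static decimal SumAmounts(DataTable dt)
    {
        decimal total = 0m;
        foreach (DataRow row in dt.Rows)
        {
            // The row indexer returns object, so convert it to decimal.
            total += Convert.ToDecimal(row["Amount"]);
        }
        return total;
    }

    public static void Main()
    {
        // Small made-up table, for illustration only.
        var dt = new DataTable();
        dt.Columns.Add("Amount", typeof(decimal));
        dt.Rows.Add(1m);
        dt.Rows.Add(2m);
        dt.Rows.Add(3m);
        Console.WriteLine(SumAmounts(dt));
    }
}
```

Measuring this against any parallel version on real data is probably the only reliable way to tell which one is faster.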

Roy Dictus

Yes, there are alternatives to reduce the locks:

  1. Use the overload of Parallel.For() that supports local data. This way, you need synchronization only in the localFinally delegate (but you shouldn't forget it there).
  2. Use Interlocked.Add(). This won't work in your case, because there are overloads only for int and long, not for decimal.
  3. Don't use parallel processing. With a very simple operation like this one, it's quite possible that the overhead of parallel processing will be more than the gains in speed.
  4. Use PLINQ:

    var total =
        ParallelEnumerable.Range(0, dt.Rows.Count)
                          .Select(i => Convert.ToDecimal(dt.Rows[i]["Amount"]))
                          .Sum();
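
For option 1, a minimal sketch of the local-data overload might look like the following. The names `AmountSum`, `SumAmounts` and `gate` are made up for illustration, and the sketch assumes the table is only read, never modified, while the loop runs. Each worker accumulates into its own subtotal without locking, and the lock is taken only once per worker in `localFinally`:

```csharp
using System;
using System.Data;
using System.Threading.Tasks;

public static class AmountSum
{
    // Hypothetical helper: sums "Amount" using per-thread subtotals.
    public static decimal SumAmounts(DataTable dt)
    {
        object gate = new object();
        decimal total = 0m;

        Parallel.For<decimal>(
            0, dt.Rows.Count,
            // localInit: each worker starts with its own subtotal of 0.
            () => 0m,
            // body: no lock needed, subtotal is private to this worker.
            (i, loopState, subtotal) =>
                subtotal + Convert.ToDecimal(dt.Rows[i]["Amount"]),
            // localFinally: runs once per worker, so the lock is taken rarely.
            subtotal =>
            {
                lock (gate) { total += subtotal; }
            });

        return total;
    }

    public static void Main()
    {
        // Made-up data: amounts 1..1000.
        var dt = new DataTable();
        dt.Columns.Add("Amount", typeof(decimal));
        for (int n = 1; n <= 1000; n++) dt.Rows.Add((decimal)n);
        Console.WriteLine(SumAmounts(dt));
    }
}
```

The lock in `localFinally` is still required, because several workers can finish at the same time; it is just contended far less often than a lock inside the loop body.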
    

Regarding your thread-safety question: you're assuming that after the “context switch” (I use scare quotes because on multicore CPUs no context switch is needed for this issue to occur), thread 2 will read the current value of total again. In fact, it has already read the old value, 1, which is now held in a register. So the result in step 3 becomes 1 + 2 = 3, and thread 3's update is lost.
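
To make the interleaving concrete, here is a small single-threaded replay of the flow from the question. It is only an illustration: the local variable `thread2Copy` stands in for the register, and the class and method names are made up.

```csharp
using System;

public static class LostUpdateDemo
{
    // Replays the three-thread interleaving step by step on one thread.
    public static decimal Replay()
    {
        decimal total = 0m;

        // 1) Thread 1 runs to completion: reads 0, adds 1, writes 1.
        total = total + 1m;

        // 2) Thread 2 reads total (1) into its own copy, then is interrupted.
        decimal thread2Copy = total;

        //    Thread 3 runs to completion: reads 1, adds 3, writes 4.
        total = total + 3m;

        // 3) Thread 2 resumes with its stale copy: 1 + 2 = 3 overwrites the 4.
        total = thread2Copy + 2m;

        return total; // 3, not the expected 6
    }

    public static void Main()
    {
        Console.WriteLine(Replay());
    }
}
```

A volatile field would not change this outcome, since the stale value was read before the other write happened. In addition, C# does not allow a `decimal` field to be declared `volatile` in the first place.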

svick
  • Thanks svick, shouldn't volatile read the most current value? (I'm sorry, I don't understand how you got to 1+3=3. Can you please elaborate?) – Royi Namir Jan 29 '13 at 13:23
  • Sorry, that was a typo on my part, fixed now. – svick Jan 29 '13 at 13:26
  • Yes, it will read the most current value, but it will read it *before* the switch. At that time, the most current value was 1. – svick Jan 29 '13 at 13:27
  • Is there a difference between `total+=a` and `total=total+a` in a multithreaded environment? – Royi Namir Jan 29 '13 at 13:33
  • No, there isn't any difference between those two. In both cases, it's something like: 1. read `total` into a register; 2. add `a` to the register; 3. write the result back to `total`. If some other thread changes the value of `total` between steps 1 and 3, you have a race condition. – svick Jan 29 '13 at 13:37