
Introduction

A user reported to me this morning that he was having an issue with inconsistent results (namely, column values sometimes coming out null when they should not be) from some parallel execution code that we provide as part of an internal framework. This code has worked fine in the past and has not been tampered with lately, but it got me thinking about the following snippet:

Code Sample

lock (ResultTable)
{
    newRow = ResultTable.NewRow();
}

newRow["Key"] = currentKey;
foreach (KeyValuePair<string, object> output in outputs)
{
    object resultValue = output.Value;
    newRow[output.Key] = resultValue != null ? resultValue : DBNull.Value;
}

lock (ResultTable)
{
    ResultTable.Rows.Add(newRow);
}

(No guarantees that this compiles; it has been hand-edited to mask proprietary information.)

Explanation

We have this cascading type of locking code in other places in our system, and it works fine, but this is the first instance of cascading locking code that I have come across that interacts with ADO.NET. As we all know, instance members of framework objects are usually not thread safe (which is the case in this situation), but the cascading locking should ensure that we are not reading from and writing to ResultTable.Rows concurrently. We are safe, right?

Hypothesis

Well, the cascading lock code does not ensure that we are not reading from or writing to ResultTable.Rows at the same time that we are assigning values to columns in the new row. What if ADO.NET uses some kind of buffer for assigning column values that is not thread safe, even when different object types are involved (DataTable vs. DataRow)?

Has anyone run into anything like this before? I thought I would ask here at StackOverflow before beating my head against this for hours on end :)

Conclusion

Well, the consensus suggestion of changing the cascading lock to a full lock has resolved the issue. That is not the result that I expected, but the full-lock version has not produced the issue after many, many, many tests.
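
For reference, the full-lock variant amounts to wrapping the entire sequence in a single lock, along these lines (it reuses the same variables as the snippet above, and the same caveat applies: hand-edited, no guarantee it compiles):

lock (ResultTable)
{
    // One lock covers row creation, column assignment, and the add.
    newRow = ResultTable.NewRow();

    newRow["Key"] = currentKey;
    foreach (KeyValuePair<string, object> output in outputs)
    {
        object resultValue = output.Value;
        newRow[output.Key] = resultValue != null ? resultValue : DBNull.Value;
    }

    ResultTable.Rows.Add(newRow);
}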

The lesson: be wary of cascading locks used on APIs that you do not control. Who knows what may be going on under the covers!

  • Allen...I could use a little more information. Are you using DataSet objects in a disconnected manner? What is the data store, and how are the updates scheduled? ADO.NET is a many-headed beast :) – Rusty May 19 '10 at 20:37
  • Very good question! Yes, this is a DataTable used in a disconnected manner. There is no data store in this case (all in memory). Threads are dispatched via an instance of an internal thread pool that we use all over the place. In this particular scenario, the thread pool is throttled at 8 threads via a Semaphore. Anyway, I hope that answers your question sufficiently! – Allen E. Scharfenberg May 19 '10 at 20:52
  • Most excellent. I am working with some disconnected DataSets at this very moment...I'll try and get in a few tests. Are you running 3.5 or 4.0? – Rusty May 19 '10 at 22:28
  • Sorry Allen...Have you been able to reproduce the user's problems? Do you trust the user? – Rusty May 19 '10 at 22:42
  • Hey, Rusty, thanks for giving this a try! We are running .NET 3.5 currently. In terms of the user, I have inspected the user's configuration myself, so, in this case, I do trust that the user's issue is legitimate. It may, of course, not be related to the code snippet in question. I am having the user run a test with a test build of our internal framework that locks around that whole snippet instead of using cascading locks. I will let you know what we find out! =D – Allen E. Scharfenberg May 20 '10 at 13:13

4 Answers


Allen,

I could not find any specific problems with your approach, though my testing was not exhaustive. Here are some ideas that we stick with (all of our applications are thread-centric):

Whenever possible:

[1] Make all data access completely atomic, as data sharing in multi-threaded applications is an excellent place for all kinds of unforeseen thread interactions.

[2] Avoid locking on a type. If the type is not known to be thread safe, write a wrapper (see the sketch after this list).

[3] Include structures that allow for the fast identification of threads that are accessing a shared resource. If system performance allows, log this information above the debug level and below usual operation log levels.

[4] Any code, including System.* et al., not explicitly documented internally as Thread Safe Tested is not Thread Safe. Hearsay and the verbal word of others do not count. Test it and write it down.
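
As a rough illustration of [2], a wrapper along these lines keeps the synchronization object private instead of locking on the shared instance. The class and member names are made up for the example, and the "Key" column mirrors the snippet in the question:

using System.Collections.Generic;
using System.Data;

// Hypothetical wrapper: all access to the underlying DataTable goes through
// methods that take a private lock, so callers never lock on the table itself.
public sealed class SynchronizedResultTable
{
    private readonly object _sync = new object();
    private readonly DataTable _table;

    public SynchronizedResultTable(DataTable table)
    {
        _table = table;
    }

    public void AddRow(object key, IEnumerable<KeyValuePair<string, object>> outputs)
    {
        lock (_sync)
        {
            // Row creation, column assignment, and the add all happen under one lock.
            DataRow newRow = _table.NewRow();
            newRow["Key"] = key;
            foreach (KeyValuePair<string, object> output in outputs)
            {
                newRow[output.Key] = output.Value ?? DBNull.Value;
            }
            _table.Rows.Add(newRow);
        }
    }
}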

Hope this is of some value.

Rusty

I read an article once that said they found that the internals use a common row for insert operations in the DataTable. Multiple threads creating new records at the same time will overlay data on that common row and corrupt each other, causing the problem. The fix is to lock the table when adding rows so that only one thread can add a new row at a time.

Tony

This bit of .NET may have changed in the past seven (!) years but, to answer the question, the hypothesis of column value buffering is incorrect as of .NET 4.7.1. From a look at the source in corefx/DataRow.cs, the issue is a race condition around the _tempRecord field, which stores the row's position in the data table. This field can potentially be modified by any write triggering a call to BeginEditInternal(), which includes value updates. When two writes collide, one can end up following the value of _tempRecord set by the other and therefore update a different row than expected. This is consistent with Microsoft's documentation stating that *any* write must be synchronized (emphasis added). Tony's earlier answer describes a subset of this behaviour.

As an example, I recently broke code that followed the locking approach shown in the code sample above while making performance improvements. The code had been stable and had run without issue for 1.5 years but, somewhere above 2000 new rows per second, at least one out of a few tens of thousands of writes consistently ended up on the wrong row.

One possible fix is to lock on every write but group the writes to limit the performance impact by minimizing the number of lock acquisitions. Another is to give each thread its own table to update and later merge the results. In my case, the performance-critical section had been a candidate to move off DataTable for some time, so it was recoded with more scalable data structures.
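
As a rough sketch of the second option (per-thread tables merged at the end), something along these lines is possible on .NET 4 or later, where Parallel.ForEach is available; workItems, item.Key, and the column assignments are placeholders rather than anything from the original code:

// Requires System.Data and System.Threading.Tasks.
// Each worker fills a private DataTable cloned from the shared schema,
// then the per-thread tables are merged into the main table one at a time.
DataTable mainTable = ResultTable;
object mergeLock = new object();

Parallel.ForEach(
    workItems,                               // placeholder work source
    () => mainTable.Clone(),                 // thread-local table, schema only
    (item, loopState, localTable) =>
    {
        DataRow row = localTable.NewRow();
        row["Key"] = item.Key;               // placeholder column assignments
        localTable.Rows.Add(row);
        return localTable;
    },
    localTable =>
    {
        lock (mergeLock)
        {
            mainTable.Merge(localTable);     // only one thread merges at a time
        }
    });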

Todd West

Your code looks fine to me too, but I suggest that you use ResultTable.Rows.SyncRoot for locking before adding the newly created row, so that the rest of the ResultTable object is free to be accessed by other threads.

lock (ResultTable.Rows.SyncRoot)
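
In the context of the question's snippet, the add at the end would then look roughly like this (a sketch assuming the same ResultTable and newRow as above):

lock (ResultTable.Rows.SyncRoot)
{
    ResultTable.Rows.Add(newRow);
}
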
Manthos