1

I am using a thread party data model which uses it's custom data model. Hierarchy of the data model is as below:
Model
---Tables(type of Table)
-----Rows(type of Row)
-------Cells( type of Cell)

Table has property Rows as like DataTable and I have to access this property in more than tasks. Now I need a row from the table which has a column value to the specified value.

To do this, I have created a method which has lock statement to make it accessible from only one thread once.

public static Row GetRowWithColumnValue(Model model, string tableKey, string indexColumnKey, string indexColumnValue)
{
    Row simObj = null;
    lock (syncRoot)
    {
        SimWrapperFromValueFactory wrapperSimSystem = new SimWrapperFromValueFactory(model, tableKey, indexColumnKey);
        simObj = wrapperSimSystem.GetWrapper(indexColumnValue);
    }
    return simObj;
}

To create the lookup for one of the column in Table, I have create a method which always try to create a copy of the rows to avoid collection modified exception:

Private Function GetTableRows(table As Table) As List(Of Row)
    Dim rowsList As New List(Of Row)(table.Rows)  'Case 1
    'rowsList.AddRange(table.Rows) 'Case 2
    ' Case 3
    'For i As Integer = 0 To table.Rows.Count - 1
    'rowsList.Add(table.Rows.ElementAt(i))
    'Next
    Return rowsList
End Function

but other threads can modify the table(e.g. add, remove rows or update column value in any rows). I am getting below "Collection modified exception":

at System.ThrowHelper.ThrowInvalidOperationException(ExceptionResource resource)
   at System.Collections.Generic.List`1.Enumerator.MoveNextRare()
   at System.Collections.Generic.List`1.InsertRange(Int32 index, IEnumerable`1 collection)

I cannot modify this third party library to concurrent collections and this same Data Model shared between multiple project.

Question: I hunting for the solution that let me allow multiple readers on this collection either it modified in another threads.. Is it possible to Get a copy of the collection without getting exception??

Referenced below SO threads but did not find exact solution:
Lock vs. ToArray for thread safe foreach access of List collection
Can ToArray() throw an exception?
Is returning an IEnumerable<> thread-safe?

Community
  • 1
  • 1
Niranjan Singh
  • 18,017
  • 2
  • 42
  • 75
  • If I understand you correctly you could create a new List with the rows and use that one. `var rows = new List(GetTableRows());` – Domysee Mar 25 '16 at 10:51
  • 1
    You cannot "lock" a collection which has no built-in support for it (via `SyncRoot` or similar). Additionally, even if you succeed in taking a snapshot of the data (by retrying a full copy until eventually you get one without an exception), how can you be sure that another thread hasn't produced a corruption by modifying row/cell values which do not have the concurrency checking that the table itself does. This is a tricky situation. – Kirill Shlenskiy Mar 25 '16 at 10:52
  • @Domysee: I have already tested it with 3 implementations and i am currently doing as you have suggested but it also use GetEnumerator method internally which causing problem. – Niranjan Singh Mar 25 '16 at 11:11
  • @NiranjanKala, what exactly is `table.Rows`? is it a `List(Of Row)`? If so, I recommend using the `CopyTo` method as it bypasses the enumerator and only accesses 2 fields: `_items` and `_count`, each accessed once, so the probability of a race is much lower than anything involving `GetEnumerator`. If you want to be thorough you can also break out reflection and check the `_version` field before and after the operation to ensure that the list was not modified while you were copying. If this sounds sensible to you, let me know and I'll concoct this into an answer. – Kirill Shlenskiy Mar 25 '16 at 11:29
  • Keep in mind of course that this will only solve the problem of races at the table level (i.e. protect you from another thread adding, removing or swapping rows). If rows and cells aren't immutable and can be mutated by another thread, you are probably still out of luck as you'll never get a snapshot data consistency guarantee. – Kirill Shlenskiy Mar 25 '16 at 11:31
  • @KirillShlenskiy: it implements IEnumerable as `public abstract class TableRowCollection : IEnumerable` and Rows is the property of type TableRowCollection. I am trying to convert it List(Of Row) but it does not inherit ICollection so List class call GetEnumerator method. If somehow it is possible to avoid this iteration then possible to read it different threads. if row add or remove then it does not create problem with data. data is not shared between another threads i am taking care of that – Niranjan Singh Mar 25 '16 at 12:12
  • @NiranjanKala, based on the exception your `TableRowCollection` is backed by a `List`, but not having access to it means using reflection to get that out too. Additionally, I've just noticed your sentence re "other threads can modify the table(e.g. add, remove rows or **update column value in any rows**)" - this is game over. You'll never get a reliable snapshot of the data. – Kirill Shlenskiy Mar 25 '16 at 12:23
  • I am concerned about about the identity cell value and that will not be modified in any thread. My concern is all about the invalid operation exception.. – Niranjan Singh Mar 25 '16 at 13:48

1 Answers1

1

The simplest solution is to retry on exception, like this:

private List<Row> CopyVolatileList(IEnumerable<Row> original)
{
    while (true)
    {
        try
        {
            List<Row> copy = new List<Row>();

            foreach (Row row in original) {
                copy.Add(row);
            }

            // Validate.
            if (copy.Count != 0 && copy[copy.Count - 1] == null) // Assuming Row is a reference type.
            {
                // At least one element was removed from the list while were copying.
                continue;
            }

            return copy;
        }
        catch (InvalidOperationException)
        {
            // Check ex.Message?
        }

        // Keep trying.
    }
}

Eventually you'll get a run where the exception isn't thrown and the data integrity validation passes.

Alternatively, you can dive deep (and I mean very, very deep).

DISCLAIMER: Never ever use this in production. Unless you're desperate and really have no other option.

So we've established that you're working with a custom collection (TableRowCollection) which ultimately uses List<Row>.Enumerator to iterate through the rows. This strongly suggests that your collection is backed by a List<Row>.

First things first, you need to get a reference to that list. Your collection will not expose it publicly, so you'll need to fiddle a bit. You will need to use Reflection to find and get the value of the backing list. I recommend looking at your TableRowCollection in the debugger. It will show you non-public members and you will know what to reflect.

If you can't find your List<Row>, then take a closer look at TableRowCollection.GetEnumerator() - specifically GetEnumerator().GetType(). If that returns List<Row>.Enumerator, then bingo: we can get the backing list out of it, like so:

List<Row> list;

using (IEnumerator<Row> enumerator = table.GetEnumerator())
{
    list = (List<Row>)typeof(List<Row>.Enumerator)
        .GetField("list", BindingFlags.Instance | BindingFlags.NonPublic)
        .GetValue(enumerator);
}

If the above methods of getting your List<Row> have failed, there is no need to read further. You might as well give up.

In case you've succeeded, now that you have the backing List<Row>, we'll have to look at Reference Source for List<T>.

What we see is 3 fields being used:

private T[] _items;
private int _size; // Accessible via "Count".
private int _version;

Our goal is to copy the items whose indexes are between zero and _size - 1 from the _items array into a new array, and to do so in between _version changes.

Observations re thread safety: List<T> does not use locks, none of the fields are marked as volatile and _version is incremented via ++, not Interlocked.Increment. Long story short this means that it is impossible to read all 3 field values and confidently say that we're looking at stable data. We'll have to read the field values repeatedly in order to be somewhat confident that we're looking at a reasonable snapshot (we will never be 100% confident, but you might choose to settle for "good enough").

using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;
using System.Threading;

private Row[] CopyVolatileList(List<Row> original)
{
    while (true)
    {
        // Get _items and _size values which are safe to use in tandem.
        int version = GetVersion(original); // _version.
        Row[] items = GetItems(original); // _items.
        int count = original.Count; // _size.

        if (items.Length < count)
        {
            // Definitely a torn read. Copy will fail.
            continue;
        }

        // Copy.
        Row[] copy = new Row[count];

        Array.Copy(items, 0, copy, 0, count);

        // Stabilization window.
        Thread.Sleep(1);

        // Validate.
        if (version == GetVersion(original)) {
            return copy;
        }

        // Keep trying.
    }
}

static Func<List<Row>, int> GetVersion = CompilePrivateFieldAccessor<List<Row>, int>("_version");
static Func<List<Row>, Row[]> GetItems = CompilePrivateFieldAccessor<List<Row>, Row[]>("_items");

static Func<TObject, TField> CompilePrivateFieldAccessor<TObject, TField>(string fieldName)
{
    ParameterExpression param = Expression.Parameter(typeof(TObject), "o");
    MemberExpression fieldAccess = Expression.PropertyOrField(param, fieldName);

    return Expression
        .Lambda<Func<TObject, TField>>(fieldAccess, param)
        .Compile();
}

Note re stabilization window: the bigger it is, the more confidence you have that you're not dealing with a torn read (because the list is in process of modifying all 3 fields). I've settled on the smallest value I couldn't fail in my tests where I called CopyVolatileList in a tight loop on one thread, and used another thread to add items to the list, remove them or clear the list at random intervals between 0 and 20ms.

If you remove the stabilization window, you will occasionally get a copy with uninitialized elements at the end of the array because the other thread has removed a row while you were copying - that's why it's needed.

You should obviously validate the copy once it's built, to the best of your ability (at least check for uninitialized elements at the end of the array in case the stabilization window fails).

Good luck.

Kirill Shlenskiy
  • 9,367
  • 27
  • 39