4

I am currently using quite heavily some List and I am looping very frequently via foreach over these lists.
Originally List was immuteable afer the startup. Now I have a requirement to amend the List during runtime from one thread only (a kind of listener). I need to remove from the List in object A and add to the list of object B. A and B are instances of the same class.
Unfortunaly there is no Synchronized List. What would you suggest me to do in this case? in my case speed is more important than synchronisation, thus I am currently working with copies of the lists for add/remove to avoid that the enumerators fail.
Do you have any other recommended way to deal with this?

class X {
    List<T> Related {get; set;}
}

In several places and in different threads I am then using

 foreach var x in X.Related

Now I need to basically perform in yet another thread

a.Related.Remove(t);
b.Related.Add(t);

To avoid potential exceptions, I am currently doing this:

List<T> aNew=new List<T> (a.Related);
aNew.Remove(t);
a.Related=aNew;
List<T>bNew=new List<T>(b.Related){t};
b.Related=bNew;

Is this correct to avoid exceptions?

weismat
  • 7,195
  • 3
  • 43
  • 58
  • Related: http://stackoverflow.com/questions/186527/how-to-synchronize-the-access-to-a-listt-used-in-asp-net – FishBasketGordo Jul 28 '11 at 13:21
  • 4
    I don't think you're avoiding any problems with the code posted at the bottom of the question. You might avoid *exceptions* as you say, but you will certainly have problems. What if two threads do this? Both will grab a copy of the same list, they'll each remove their entry from their list, then assign their list back. Last one wins, the other one lost his change. – Lasse V. Karlsen Jul 28 '11 at 13:25
  • Lasse, you are right, but at this stage I am guarenteed to have only one thread which will perform the changes. – weismat Jul 28 '11 at 13:45
  • How big is the list? If small, stick to the assign of fresh list (aka copy) and only lock the assignment (if even needed). BUT (big one) do NOT mutate the list, just create copies with the changes. LINQ is probably a good candidate for this. – leppie Jul 28 '11 at 14:57
  • Btw: 'Unfortunaly there is no Synchronized List.' is a lie! You have `ArrayList.Synchronized(IList)` which do that exactly (except that you loose the generics along the way). – leppie Jul 28 '11 at 15:01
  • Leppie, I agree with your statement except for the accusation of a lie. Your statement is a a lie except that you are showing why my statement is no lie. – weismat Jul 28 '11 at 15:12

3 Answers3

3

From this MSDN post: http://msdn.microsoft.com/en-us/library/6sh2ey19.aspx

"...the only way to ensure thread safety is to lock the collection during the entire enumeration. "

Steve Wellens
  • 20,506
  • 2
  • 28
  • 69
2

Consider using for loops and iterate over your collection in reverse. This way you do not have the "enumerators fail", and as you are going backwards over your collection it is consistent from the POV of the loop.

It's hard to discuss the threading aspects as there is limited detail.

Update

If your collections are small, and you only have 3-4 "potential" concurrent users, I would suggest using a plain locking strategy as suggested by @Jalal although you would need to iterate backwards, e.g.

private readonly object _syncObj = new object();

lock (_syncObj)
{
    for (int i = list.Count - 1; i >= 0; i--)
    {
        //remove from the list and add to the second one.
    }
}

You need to protect all accesses to your lists with these lock blocks.

Your current implementation uses the COW (Copy-On-Write) strategy, which can be effective in some scenarios, but your particular implementation suffers from the fact that two or more threads take a copy, make their changes, but then could potentially overwrite the results of other threads.

Update

Further to your question comment, if you are guaranteed to only have one thread updating the collections, then your use of COW is valid, as there is no chance of multiple threads making updates and updates being lost by overwriting by multiple threads. It's a good use of the COW strategy to achieve lock free synchronization.

If you bring other threads in to update the collections, my previous locking comments stand.

My only concern would be that the other "reader" threads may have cached values for the addresses of the original lists, and may not see the new addresses when they are updated. In this case make the list variables volatile.

Update

If you do go for the lock-free strategy there is still one more pitfall, there will still be a gap between setting a.Related and b.Related, in which case your reader threads could be iterating over out-of-date collections e.g. item a could have been removed from list1 but not yet added to list2 - item a will be in neither lists. You could also swap the issue around and add to list2 before removing from list1, in which case item a would be in both lists - duplicates.

If consistency is important you should use locking.

Jalal Said
  • 15,906
  • 7
  • 45
  • 68
Tim Lloyd
  • 37,954
  • 10
  • 100
  • 130
  • I have added my current approach. – weismat Jul 28 '11 at 13:23
  • @weismat 1. Are your collections large, or takes time to iterate over them and apply operations? 2. Are there a lot more reads than writes? – Tim Lloyd Jul 28 '11 at 13:31
  • @weismat 3. How many concurrent users of the collections are there? – Tim Lloyd Jul 28 '11 at 13:38
  • The collections are small, but I have 3-4 threads which use them potentially. The operations are quick, but the access is happening very frequently. – weismat Jul 28 '11 at 13:43
  • Thanks for all your updates - I am aware that I might miss one update - but I do not care. Normally I am not fearing locks and I think that people tend to overavoid locks, but if you have around 50-100 Mio additional calls of a lock during a day then you care, for missing mazybe 5-10 changes to the lists a day. – weismat Jul 28 '11 at 15:27
  • @weismat Agreed, locks are usually a good solution, but for specific situations they can affect performance considerably. – Tim Lloyd Jul 28 '11 at 15:34
  • @weismat @chibacity: No! that is not true at all. for instance at this line `myOldList = theNewCopiedAndThenModifiedList` what happen if a reader thread **at the same time** try reading a value at some index in the collection? Guesses what! The value return maybe a combination of the two variables by reading the half of the first value from the first collection and the other half from the new collection... – Jalal Said Jul 28 '11 at 16:20
  • @Jalal What do you mean by a combination of the two variables? BTW The question explicitly discusses iteration, not accessing by index. I'm guessing you're the downvoter... – Tim Lloyd Jul 28 '11 at 16:24
  • a bitwise combination for instance when using a `List`. another problem here suppose the reader want to check the list count. and at the same time the list changed but after the check like: thread 1 checks the count and if it > 0 to proceed, the check operation is evaluated, but before continuing the thread context switch pause the current thread by switching to other thread. the other thread changed the collection and make it empty. now the thread context switching re-switch to the first thread and it continue but it was rely on the fact the the collection is not empty... – Jalal Said Jul 28 '11 at 16:32
  • @Jalal I'm going to try and guess what you mean by "a combination of the two variables" and reading your answer. You seem to think that the object at the index the reader thread is trying to access could be in an inconsistent state as it is being replaced by the writer thread therefore is could be half read and half written. No, it doesn't work like this. The reader thread has a reference to the original list whilst it is accessing it, overwriting the variable that points to the original list does not actually result in the original list itself being overwritten, that's still there. – Tim Lloyd Jul 28 '11 at 16:33
  • @Jalal Again the question specifically relates to the readers iterating over the lists with foreach loops. I'm sticking to the specifics of the quesition. – Tim Lloyd Jul 28 '11 at 16:34
  • I see, then suppose that while the `readers iterating over the lists with foreach loops` the writer thread changes that collection! – Jalal Said Jul 28 '11 at 16:39
  • @Jalal You're missing the whole point of COW - the writer thread is not mutating the list that the reader is iterating over. Furthermore, when the reader begins iterating over the collection, it takes a reference to the collection at the start, overwriting this reference has no effect whilst iteration is taking place as the enumerator is resolved at the beginning. You can change the list reference all day long, but whilst iteration in taking place it will be with the enumerator from the **original** list. – Tim Lloyd Jul 28 '11 at 16:42
  • Yes but he at the writer method could update some field for a variable within the collection, so reading that field of that variable that returns from the iteration is not consistent... Also note that my attention here because of @weismat stated at a comment that `I have only one thread which will potentially change the list, but several workers which are reading only`, so the other threads are reading not only iterating through the list. – Jalal Said Jul 28 '11 at 16:58
  • @Jalal You are missing the point here. The actual lists that the readers are iterating over are **not** being changed. When a reader starts iterating over a list pointed to by variable x, `GetEnumerator` is called on the list which x point to, lets say list1. The reader thread will therefore get an enumerator that points to list1 and it will begin iterating over list1. If at some point a writer copies list1, then makes a new list2, and sets variable x to list2, this has no affect on the currently iterating reader thread. It is using the enumerator that points to list1. – Tim Lloyd Jul 28 '11 at 17:05
  • It is true that the list is copied, but the list itself containing another items, that item are not copied. it will only copied when using immutable types. but when using reference types that is a different story. both lists will have the same item `X` and there field or property `Y`, so if the writer thread change that variable "Y", while the reader thread reading it. definitely that is a problem. – Jalal Said Jul 28 '11 at 17:13
  • @Jalal The original question is about iterating, adding and removing items from lists. It is not about mutating individual items within the collection. I and the OP are sticking to the original question. You seem very keen to find fault in my answer by bringing in unrelated topics? Your downvote says it all I guess. – Tim Lloyd Jul 28 '11 at 17:16
  • No, believe me that is not the case, `You seem very keen to find fault in my answer by bringing in unrelated topic`, why should I. I am only encourage you to update your answer to indicates some of the facts. that this strategy is only if he want to just innumerate through the collection and nothing more, any other usage will fail.., of course in any ordinary application the scenario is not just enumerating in the readers, after all they have to read something :) – Jalal Said Jul 28 '11 at 17:23
  • @Jalal The question is not only about enumerating collections, it is about mutating them as well. I do not cover mutating individual objects as this is not what the question is about. The fact that immutable is stated so many times might be a clue. I have spent most of the comments explaining to you basics of how references work in a round-about-fashion. Should I explain this in my answer too :( – Tim Lloyd Jul 28 '11 at 17:31
  • It was such a nicely conversation and I think we both learn something from each other. – Jalal Said Jul 28 '11 at 17:35
0

You should lock before you handle the lists since you are in multithreading mode, the lock operation itself does not affect the speed here, the lock operation is executed in nanoseconds about 10 ns depending on the machine. So:

private readonly object _listLocker = new object();

lock (_listLocker)
{
    for (int itemIndex = 0; itemIndex < list.Count; itemIndex++)
    {
        //remove from the first list and add to the second one.
    }
}

If you are using framework 4.0 I encorage you to use ConcurrentBag instead of list.

Edit: code snippet:

List<T> aNew=new List<T> (a.Related);

This will work if only all interaction with the collection "including add remove replace items" managed this way. Also you have to use System.Threading.Interlocked.CompareExchange and System.Threading.Interlocked.Exchange methods to replace the existing collection with the new modified. if that is not the case then you are doing nothing by coping

This will not work. for instance consider a thread trying to get an item from the collection, at the same time another thread replace the collection. this could leave the item retrieved in a not constant data. also consider while you are coping the collection, another thread want to insert item to the collection at the same time while you are coping?
This will throw exception indicates that the collection modified.

Another thing is that you are coping the whole collection to a new list to handle it. certainly this will harm the performance, and I think using synchronization mechanism such as lock reduce the performance pallet, and it is the much appropriated thing to do while to handle multithreading scenarios.

Jalal Said
  • 15,906
  • 7
  • 45
  • 68
  • 1
    Although the lock operation is quick, it depends on how much work is done inside the lock as to whether it affects performance. If a lot of work is being done, then there will be lock contention and other threads may get descheduled. – Tim Lloyd Jul 28 '11 at 13:29
  • How should I exactly use ConcurrentBag? I see an add, but not a remove. – weismat Jul 28 '11 at 13:42
  • @weismat: use `bag.TryTake` see [msdn](http://msdn.microsoft.com/en-us/library/dd381774.aspx). – Jalal Said Jul 28 '11 at 14:02
  • I have only one thread which will potentially change the list, but several workers which are reading only. – weismat Jul 28 '11 at 14:11
  • TryTake will take any object from the bag...The parameter is out so this is where the object will be copied to. – weismat Jul 28 '11 at 14:12
  • @weismat: then you should either use `lock` or the `ConcurrentBag`, because assume while some innocent reader thread try to retreave some item, but the collection just modified by the other "writer" thread!, it definitely cause troubles. but when you use either of `lock` or the `concurrent collctions` you are safe from such a scenarios. – Jalal Said Jul 28 '11 at 14:17
  • @weismat: The `TryTake` method will try to retrieve an item. if it fails it will return false, otherwise it will return true. it is under the hood uses some synchronization mechanism with lock free code. that is it, it will not block and it does not consume cpu resources.. – Jalal Said Jul 28 '11 at 14:24