
I was thrilled to see the new System.Collections.Concurrent namespace in .NET 4.0, quite nice! I've seen ConcurrentDictionary, ConcurrentQueue, ConcurrentStack, ConcurrentBag and BlockingCollection.

One thing that seems to be mysteriously missing is a ConcurrentList<T>. Do I have to write that myself (or get it off the web :) )?

Am I missing something obvious here?

Daniel A. White
Alan

12 Answers


I gave it a try a while back (also: on GitHub). My implementation had some problems, which I won't get into here. Let me tell you, more importantly, what I learned.

Firstly, there's no way you're going to get a full implementation of IList<T> that is lockless and thread-safe. In particular, random insertions and removals are not going to work, unless you also forget about O(1) random access (i.e., unless you "cheat" and just use some sort of linked list and let the indexing suck).

What I thought might be worthwhile was a thread-safe, limited subset of IList<T>: in particular, one that would allow an Add and provide random read-only access by index (but no Insert, RemoveAt, etc., and also no random write access).

This was the goal of my ConcurrentList<T> implementation. But when I tested its performance in multithreaded scenarios, I found that simply synchronizing adds to a List<T> was faster. Basically, adding to a List<T> is lightning fast already; the computational steps involved are minuscule (increment an index and assign to an element in an array; that's really it). You would need a ton of concurrent writes to see any sort of lock contention on this; and even then, the average performance of each write would still beat out the more expensive, albeit lockless, implementation in ConcurrentList<T>.

In the relatively rare event that the list's internal array needs to resize itself, you do pay a small cost. So ultimately I concluded that this was the one niche scenario where an add-only ConcurrentList<T> collection type would make sense: when you want guaranteed low overhead of adding an element on every single call (so, as opposed to an amortized performance goal).

It's simply not nearly as useful a class as you would think.
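For reference, the "just synchronize adds to a List<T>" baseline described above can be sketched as follows. This is a minimal illustration with a hypothetical class name, not the actual ConcurrentList<T> implementation from the answer:

```csharp
using System.Collections.Generic;

// Sketch of the simple monitor-based baseline that outperformed the
// lockless version in the tests described above.
public class SynchronizedAddList<T>
{
    private readonly List<T> _list = new List<T>();
    private readonly object _sync = new object();

    public void Add(T item)
    {
        // Adding to a List<T> is so cheap that this lock is rarely contended.
        lock (_sync) { _list.Add(item); }
    }

    public T this[int index]
    {
        get { lock (_sync) { return _list[index]; } }
    }

    public int Count
    {
        get { lock (_sync) { return _list.Count; } }
    }
}
```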

Dan Tao
  • And if you need something similar to `List` that uses old-skool, monitor-based synchronisation, there's `SynchronizedCollection` hidden away in the BCL: http://msdn.microsoft.com/en-us/library/ms668265.aspx – LukeH Jul 06 '11 at 19:43
  • One small addition: use the Capacity constructor parameter to avoid (as much as possible) the resizing scenario. – H H Jul 06 '11 at 21:42
  • One addition: you can avoid the resizing problem by making a linked list of "chunks". Of course, this adds a lot of complexity, but allows for relatively close to O(1) lookups and additions, and can also make it so you don't have to lock the entire structure upon an `Insert` or `RemoveAt`. But again, it's massively more complex to implement and will never match the performance of List. – Earlz Dec 11 '12 at 02:55
  • The biggest scenario where a `ConcurrentList` would be a win would be when there isn't a whole lot of activity adding to the list, but there are many concurrent readers. One could reduce the readers' overhead to a single memory-barrier (and eliminate even that if readers weren't concerned about slightly-stale data). – supercat Feb 19 '13 at 19:33
  • @supercat You could synchronize for that scenario using a ReaderWriterLock http://msdn.microsoft.com/en-us/library/system.threading.readerwriterlock(v=vs.110).aspx – Kevin Doyon Dec 11 '13 at 15:23
  • @Kevin: It's pretty trivial to construct a `ConcurrentList` in such a fashion that readers are guaranteed to see consistent state without needing any locking whatsoever, with relatively slight added overhead. When the list expands from e.g. size 32 to 64, keep the size-32 array and create a new size-64 array. When adding each of the next 32 items, put it in slots 32-63 of the new array and copy an old item from the size-32 array to the new one. Until the 64th item is added, readers will look in the size-32 array for items 0-31 and in the size-64 array for items 32-63. – supercat Dec 11 '13 at 16:23
  • Once the 64th item is added, the size-32 array will still work for fetching items 0-31, but readers will no longer need to use it. They can use the size-64 array for all items 0-63, and a size-128 array for items 64-127. The overhead of selecting which one of two arrays to use, plus a memory barrier if desired, would be less than the overhead of even the most efficient reader-writer lock imaginable. Writes should probably use locks (lock-free would be possible, especially if one didn't mind creating a new object instance with every insertion), but the lock should be cheap. – supercat Dec 11 '13 at 16:25
  • @supercat Re: `ConcurrentList would be a win would be when there isn't a whole lot of activity adding to the list, but there are many concurrent readers` and `especially if one didn't mind creating a new object instance with every insertion`. Exactly for use cases like this, which I've had to deal with increasingly often recently, the `CopyOnWriteSwapper` in my answer is a generic solution that performs locklessly fast for all reads and iterations, and allows the use of arbitrary data structure, i.e. you can keep your O(1) index lookup. – Evgeniy Berezovsky Jul 21 '15 at 06:09
  • @EugeneBeresovsky: The `CopyOnWriteSwapper` implementations I've seen generally have O(N^2) behavior for building. The `ConcurrentList` I'd have liked to have seen would have retained amortized constant `Add` time. Building it might not have been as fast as building a synchronized list, but reads would avoid synchronization overhead. – supercat Jul 21 '15 at 13:49
  • @supercat The one I provide [below](http://stackoverflow.com/a/30769838/709537) is generic in that sense. The cost in my example `new List(existing)` would be O(n). Not as bad as O(n^2), but bad enough if the list is huge, or you modify it often. If you need lockless read access AND something like O(log n) performance for additions, I guess it has to be [Immutable](https://msdn.microsoft.com/en-us/library/dn385366%28v=vs.110%29.aspx). – Evgeniy Berezovsky Jul 21 '15 at 22:45
  • @EugeneBeresovsky: IMHO, .NET would have benefited greatly from "add-only" list and dictionary types, since such types can easily be written to allow lock-free read access and O(1) write access; further, dictionaries of such type can cheaply *guarantee* that items will be enumerated in the order they were added (if an enumeration is started just as items are added, behavior will be consistent with the items having been added either before or after the enumeration). If there's no need to actually remove items from a collection, types that don't need to support removal could... – supercat Jul 22 '15 at 15:15
  • ...easily provide useful guarantees (always-consistent state for reading, and in-order enumeration) which cannot be readily satisfied in collections that allow deletions. – supercat Jul 22 '15 at 15:16
  • @supercat for what it's worth, ConcurrentDictionary is now lock free for reads, performs essentially the same as a normal dictionary for lookups and allows safe enumeration while being modified. – Mike Marynowski Sep 06 '19 at 21:17
  • @MikeMarynowski: I don't think `ConcurrentDictionary` offers any particular guarantees about enumeration order. Otherwise, though, it seems like a fine class. Add-only lists and dictionaries should each include a member not present in current interfaces; `AddAndGetIndex` would add an item and return its insertion order, and for dictionaries there should also be a `GetItemIndex` to yield an insertion order by key. An immutable wrapper on an add-only dictionary with such a function could simply wrap the original dictionary along with a count of how many items were in it when wrapped... – supercat Sep 07 '19 at 17:38
  • ...and treat any items added after that as though they don't exist. – supercat Sep 07 '19 at 17:38
  • @supercat That is correct, though I think what is there is sufficient for most uses of a concurrent dictionary, even in read only scenarios. Just curious, what usage scenario would require you to guarantee an enumeration order on a dictionary? If you want an add-only dictionary with "snapshot" enumeration behavior and 100% dictionary read performance I made an implementation of that here: http://www.singulink.com/CodeIndex/post/fastest-thread-safe-lock-free-dictionary – Mike Marynowski Sep 13 '19 at 15:08
  • @MikeMarynowski: A typical usage scenario would be a live status report of unique items observed thus far. One could use a concurrent dictionary to establish which items are unique and then add them to a concurrent list, but it should be possible to have one data structure serve both purposes. – supercat Sep 13 '19 at 15:27

What would you use a ConcurrentList for?

The concept of a Random Access container in a threaded world isn't as useful as it may appear. The statement

  if (i < MyConcurrentList.Count)  
      x = MyConcurrentList[i]; 

as a whole would still not be thread-safe.

Instead of creating a ConcurrentList, try to build solutions with what's there. The most common classes are the ConcurrentBag and especially the BlockingCollection.
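For example, a producer/consumer scenario that might otherwise tempt you toward a ConcurrentList can usually be expressed with BlockingCollection directly. A minimal sketch (the item values are just placeholders):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        var results = new BlockingCollection<string>();

        var producer = Task.Run(() =>
        {
            foreach (var item in new[] { "a", "b", "c" })
                results.Add(item);
            results.CompleteAdding();   // signal that no more items will arrive
        });

        // Blocks until items are available; ends when CompleteAdding is called.
        foreach (var item in results.GetConsumingEnumerable())
            Console.WriteLine(item);

        producer.Wait();
    }
}
```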

H H
  • Good point. Still what I am doing is a little more mundane. I am just trying to assign the ConcurrentBag into an IList. I could switch my property to an IEnumerable, but then I can't .Add stuff to it. – Alan Jul 06 '11 at 19:12
  • @Henk - I don't follow your argument. Why wouldn't you have the same type of issue with if (!dict.ContainsKey(someKey)) dict[someKey] = someValue? Would there be thread-safety issues with that code as well (e.g. using ConcurrentDictionary)? – dcp Jul 06 '11 at 19:13
  • Java has the concept of a ConcurrentList: http://download.oracle.com/javase/6/docs/api/java/util/concurrent/package-summary.html –  Jul 06 '11 at 19:15
  • @Alan: There's no way to implement that without locking the list. Since you can already use `Monitor` to do that anyway, there's no reason for a concurrent list. – Billy ONeal Jul 06 '11 at 19:15
  • @dcp: You would use TryGetValue() mostly. – H H Jul 06 '11 at 19:16
  • @0A0D: I don't see one. There's a copy on write version but that's not what most people would expect a "concurrent collection" to do. – Billy ONeal Jul 06 '11 at 19:16
  • @Henk - Ok, but why do they still have the ConcurrentDictionary.ContainsKey method? – dcp Jul 06 '11 at 19:17
  • @Billy the point of a ConcurrentList would be to alleviate us all from the burden of locking/unlocking it. Same reason all of the other concurrent collections exist. – Alan Jul 06 '11 at 19:19
  • @dcp - yes, this is inherently non-thread-safe. ConcurrentDictionary has special methods which do this in one atomic operation, like AddOrUpdate, GetOrAdd, TryUpdate, etc. They still have ContainsKey because sometimes you just want to know if the key is there without modifying the dictionary (think HashSet). – Zarat Jul 06 '11 at 19:19
  • @Zarat - yes, but that's my point. It seems weird to have non thread-safe methods in a class whose name starts with the word "Concurrent". But I see what you're saying, thanks for the clarifications and explanation. – dcp Jul 06 '11 at 19:20
  • @dcp - ContainsKey is threadsafe by itself, your example (not ContainsKey!) just has a race condition because you do a second call depending on the first decision, which may at that point already be out of date. – Zarat Jul 06 '11 at 19:24
  • Henk, I don't agree. I think there is a simple scenario where it could be very useful: a worker thread writes to it while the UI thread reads it and updates the interface accordingly. If you want to add items in a sorted fashion, that requires random-access writes. You could also use a stack and a view of the data, but then you'd have to maintain 2 collections :-(. – Eric Ouellet Dec 21 '11 at 17:59

With all due respect to the great answers provided already, there are times when I simply want a thread-safe IList. Nothing advanced or fancy. Performance is important in many cases, but at times it just isn't a concern. Yes, there are always going to be challenges without methods like "TryGetValue" etc., but in most cases I just want something I can enumerate without needing to worry about putting locks around everything. And yes, somebody can probably find some "bug" in my implementation that might lead to a deadlock or something (I suppose), but let's be honest: when it comes to multithreading, if you don't write your code correctly, it is going to deadlock anyway. With that in mind, I decided to make a simple ConcurrentList implementation that provides these basic needs.

And for what it's worth: I did a basic test of adding 10,000,000 items to a regular List and to ConcurrentList, and the results were:

List finished in: 7793 milliseconds. Concurrent finished in: 8064 milliseconds.

public class ConcurrentList<T> : IList<T>, IDisposable
{
    #region Fields
    private readonly List<T> _list;
    private readonly ReaderWriterLockSlim _lock;
    #endregion

    #region Constructors
    public ConcurrentList()
    {
        this._lock = new ReaderWriterLockSlim(LockRecursionPolicy.NoRecursion);
        this._list = new List<T>();
    }

    public ConcurrentList(int capacity)
    {
        this._lock = new ReaderWriterLockSlim(LockRecursionPolicy.NoRecursion);
        this._list = new List<T>(capacity);
    }

    public ConcurrentList(IEnumerable<T> items)
    {
        this._lock = new ReaderWriterLockSlim(LockRecursionPolicy.NoRecursion);
        this._list = new List<T>(items);
    }
    #endregion

    #region Methods
    public void Add(T item)
    {
        this._lock.EnterWriteLock();
        try
        {
            this._list.Add(item);
        }
        finally
        {
            this._lock.ExitWriteLock();
        }
    }

    public void Insert(int index, T item)
    {
        this._lock.EnterWriteLock();
        try
        {
            this._list.Insert(index, item);
        }
        finally
        {
            this._lock.ExitWriteLock();
        }
    }

    public bool Remove(T item)
    {
        this._lock.EnterWriteLock();
        try
        {
            return this._list.Remove(item);
        }
        finally
        {
            this._lock.ExitWriteLock();
        }
    }

    public void RemoveAt(int index)
    {
        this._lock.EnterWriteLock();
        try
        {
            this._list.RemoveAt(index);
        }
        finally
        {
            this._lock.ExitWriteLock();
        }
    }

    public int IndexOf(T item)
    {
        this._lock.EnterReadLock();
        try
        {
            return this._list.IndexOf(item);
        }
        finally
        {
            this._lock.ExitReadLock();
        }
    }

    public void Clear()
    {
        this._lock.EnterWriteLock();
        try
        {
            this._list.Clear();
        }
        finally
        {
            this._lock.ExitWriteLock();
        }
    }

    public bool Contains(T item)
    {
        this._lock.EnterReadLock();
        try
        {
            return this._list.Contains(item);
        }
        finally
        {
            this._lock.ExitReadLock();
        }
    }

    public void CopyTo(T[] array, int arrayIndex)
    {
        this._lock.EnterReadLock();
        try
        {
            this._list.CopyTo(array, arrayIndex);
        }
        finally
        {
            this._lock.ExitReadLock();
        }
    }

    public IEnumerator<T> GetEnumerator()
    {
        return new ConcurrentEnumerator<T>(this._list, this._lock);
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return new ConcurrentEnumerator<T>(this._list, this._lock);
    }

    ~ConcurrentList()
    {
        this.Dispose(false);
    }

    public void Dispose()
    {
        this.Dispose(true);
    }

    private void Dispose(bool disposing)
    {
        if (disposing)
        {
            GC.SuppressFinalize(this);
            this._lock.Dispose();   // only dispose managed state on the explicit path
        }
    }
    #endregion

    #region Properties
    public T this[int index]
    {
        get
        {
            this._lock.EnterReadLock();
            try
            {
                return this._list[index];
            }
            finally
            {
                this._lock.ExitReadLock();
            }
        }
        set
        {
            this._lock.EnterWriteLock();
            try
            {
                this._list[index] = value;
            }
            finally
            {
                this._lock.ExitWriteLock();
            }
        }
    }

    public int Count
    {
        get
        {
            this._lock.EnterReadLock();
            try
            {
                return this._list.Count;
            }
            finally
            {
                this._lock.ExitReadLock();
            }
        }
    }

    public bool IsReadOnly
    {
        get { return false; }
    }
    #endregion
}

public class ConcurrentEnumerator<T> : IEnumerator<T>
{
    #region Fields
    private readonly IEnumerator<T> _inner;
    private readonly ReaderWriterLockSlim _lock;
    #endregion

    #region Constructor
    public ConcurrentEnumerator(IEnumerable<T> inner, ReaderWriterLockSlim @lock)
    {
        this._lock = @lock;
        this._lock.EnterReadLock();
        this._inner = inner.GetEnumerator();
    }
    #endregion

    #region Methods
    public bool MoveNext()
    {
        return _inner.MoveNext();
    }

    public void Reset()
    {
        _inner.Reset();
    }

    public void Dispose()
    {
        this._lock.ExitReadLock();
    }
    #endregion

    #region Properties
    public T Current
    {
        get { return _inner.Current; }
    }

    object IEnumerator.Current
    {
        get { return _inner.Current; }
    }
    #endregion
}
Brian Booth
  • OK, old answer but still: `RemoveAt(int index)` is never thread-safe, `Insert(int index, T item)` is only safe for index==0, the return of `IndexOf()` is immediately outdated, etc. Don't even start on `this[int]`. – H H Jul 13 '15 at 13:49
  • And you don't need and don't want a ~Finalizer(). – H H Jul 13 '15 at 13:49
  • You say you have given up on preventing the possibility of deadlock - and a single `ReaderWriterLockSlim` can be made to deadlock [easily](http://stackoverflow.com/a/26578074/709537) by using `EnterUpgradeableReadLock()` concurrently. You don't however use it, you don't make the lock accessible to the outside, and you don't e.g. call a method that enters a write lock while holding a read lock, so using your class does not make deadlocks any more likely. – Evgeniy Berezovsky Jul 15 '15 at 00:19
  • The non-concurrent interface is not appropriate for concurrent access. E.g. the following is not atomic: `var l = new ConcurrentList(); /* ... */ l[0] += "asdf";`. In general, any read-write combo can lead you into deep trouble when done concurrently. That is why concurrent data structures generally provide methods for those, like `ConcurrentDictionary`'s `GetOrAdd` etc. N.B. Your constant (and redundant, because the members are already marked as such by the underscore) repetition of `this.` clutters. – Evgeniy Berezovsky Jul 15 '15 at 00:20
  • Thanks Eugene. I am a heavy user of .NET Reflector which puts "this." on all non-static fields. As such, I have grown to prefer the same. Regarding this non-concurrent interface not being appropriate: You are absolutely right that trying to perform multiple actions against my implementation can become unreliable. But the requirement here is simply that single actions (add, or remove, or clear, or enumeration) can be done without corrupting the collection. It basically removes the need to put lock statements around everything. – Brian Booth Sep 30 '15 at 13:22

The reason why there is no ConcurrentList is because it fundamentally cannot be written. The reason why is that several important operations in IList rely on indices, and that just plain won't work. For example:

int catIndex = list.IndexOf("cat");
list.Insert(catIndex, "dog");

The effect that the author is going after is to insert "dog" before "cat", but in a multithreaded environment, anything can happen to the list between those two lines of code. For example, another thread might do list.RemoveAt(0), shifting the entire list to the left, but crucially, catIndex will not change. The impact here is that the Insert operation will actually put the "dog" after the cat, not before it.

The several implementations that you see offered as "answers" to this question are well-meaning, but as the above shows, they don't offer reliable results. If you really want list-like semantics in a multithreaded environment, you can't get there by putting locks inside the list implementation methods. You have to ensure that any index you use lives entirely inside the context of the lock. The upshot is that you can use a List in a multithreaded environment with the right locking, but the list itself cannot be made to exist in that world.
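In other words, the fix for the cat/dog example is to hold a single lock across both the index lookup and the insert, so the index never escapes the lock's context. A sketch (the class and member names are illustrative):

```csharp
using System.Collections.Generic;

public class AnimalRegistry
{
    private readonly List<string> _list = new List<string>();
    private readonly object _sync = new object();

    public void InsertDogBeforeCat()
    {
        lock (_sync)
        {
            // The index is only meaningful while we still hold the lock;
            // no other thread can shift the list under us here.
            int catIndex = _list.IndexOf("cat");
            if (catIndex >= 0)
                _list.Insert(catIndex, "dog");
        }
    }
}
```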

If you think you need a concurrent list, there are really just two possibilities:

  1. What you really need is a ConcurrentBag
  2. You need to create your own collection, perhaps implemented with a List and your own concurrency control.

If you have a ConcurrentBag and are in a position where you need to pass it as an IList, then you have a problem, because the method you're calling has specified that it might try to do something like I did above with the cat & dog. In most worlds, that means the method you're calling is simply not built to work in a multithreaded environment. That means you either refactor it so that it is or, if you can't, you're going to have to handle it very carefully. You'll almost certainly be required to create your own collection with its own locks, and call the offending method within a lock.

Steve Benz

ConcurrentList (as a resizeable array, not a linked list) is not easy to write with nonblocking operations. Its API doesn't translate well to a "concurrent" version.

Stephen Cleary

In cases where reads greatly outnumber writes, or (however frequent) writes are non-concurrent, a copy-on-write approach may be appropriate.

The implementation shown below is

  • lockless
  • blazingly fast for concurrent reads, even while concurrent modifications are ongoing - no matter how long they take
  • because "snapshots" are immutable, lockless atomicity is possible, i.e. var snap = _list; snap[snap.Count - 1]; will never throw (well, except for an empty list of course), and you also get thread-safe enumeration with snapshot semantics for free... how I LOVE immutability!
  • implemented generically, applicable to any data structure and any type of modification
  • dead simple, i.e. easy to test, debug, verify by reading the code
  • usable in .NET 3.5

For copy-on-write to work, you have to keep your data structures effectively immutable, i.e. no one is allowed to change them after you made them available to other threads. When you want to modify, you

  1. clone the structure
  2. make modifications on the clone
  3. atomically swap in the reference to the modified clone

Code

static class CopyOnWriteSwapper
{
    public static void Swap<T>(ref T obj, Func<T, T> cloner, Action<T> op)
        where T : class
    {
        while (true)
        {
            var objBefore = Volatile.Read(ref obj);
            var newObj = cloner(objBefore);
            op(newObj);
            if (Interlocked.CompareExchange(ref obj, newObj, objBefore) == objBefore)
                return;
        }
    }
}

Usage

CopyOnWriteSwapper.Swap(ref _myList,
    orig => new List<string>(orig),
    clone => clone.Add("asdf"));

If you need more performance, it will help to ungenerify the method, e.g. create one method for every type of modification (Add, Remove, ...) you want, and hard code the function pointers cloner and op.

N.B. #1 It is your responsibility to make sure that no one modifies the (supposedly) immutable data structure. There's nothing we can do in a generic implementation to prevent that, but when specializing to List<T>, you could guard against modification using List.AsReadOnly().

N.B. #2 Be careful about the values in the list. The copy-on-write approach above guards list membership only; if you put not strings but some other mutable objects in the list, you have to take care of their thread safety yourself (e.g. locking). But that is orthogonal to this solution, and e.g. locking of the mutable values can easily be used without issues. You just need to be aware of it.

N.B. #3 If your data structure is huge and you modify it frequently, the copy-all-on-write approach might be prohibitive both in terms of memory consumption and the CPU cost of copying involved. In that case, you might want to use MS's Immutable Collections instead.
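With the immutable collections, the compare-and-swap loop above collapses to a one-liner, and "copies" share structure, so each modification costs roughly O(log n) rather than O(n). A sketch assuming the System.Collections.Immutable package is referenced (the holder class name is illustrative):

```csharp
using System.Collections.Immutable;

class ImmutableListHolder
{
    private ImmutableList<string> _list = ImmutableList<string>.Empty;

    public void Add(string item)
    {
        // Same CAS retry loop as CopyOnWriteSwapper, but the "clone" is cheap
        // because ImmutableList<T> shares nodes between versions.
        ImmutableInterlocked.Update(ref _list, (l, i) => l.Add(i), item);
    }

    // Lockless read: callers get an immutable snapshot.
    public ImmutableList<string> Snapshot
    {
        get { return _list; }
    }
}
```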

Evgeniy Berezovsky

System.Collections.Generic.List<T> is already thread-safe for multiple readers. Trying to make it thread-safe for multiple writers wouldn't make sense. (For the reasons Henk and Stephen already mentioned.)

Billy ONeal
  • You cannot see a scenario where I might have 5 threads adding to a list? This way you could see the list accumulate records even before they all terminate. – Alan Jul 06 '11 at 19:21
  • @Alan - that would be a ConcurrentQueue, ConcurrentStack or ConcurrentBag. To make sense of a ConcurrentList you should provide a use-case where the available classes are not enough. I don't see why I would want indexed access when the elements at the indices can randomly change through concurrent removes. And for a "locked" read you can already take snapshots of the existing concurrent classes and put them in a list. – Zarat Jul 06 '11 at 19:33
  • You're right -- I don't want indexed access. I generally use IList as a proxy for an IEnumerable to which I can .Add(T) new elements. That's where the question comes from, really. – Alan Jul 07 '11 at 17:50
  • @Alan: Then you want a queue, not a list. – Billy ONeal Jul 08 '11 at 11:44
  • I think you are wrong. Being safe for multiple readers does not mean that you can also write at the same time. Writing also means deleting, and you will get an error if you delete while iterating over it. – Eric Ouellet Dec 21 '11 at 17:52
  • @Eric: I never said you could write at the same time. You can either have one writer. Or you can have multiple readers. Not both. – Billy ONeal Dec 21 '11 at 18:56
  • @Eugene: I never said that it was a "concurrent list". There is no such thing as "thread safe". Thread safety requires understanding the actual thread safety characteristics of a structure. Not every data structure is safe for multiple readers and no writers because many data structures have global shared state they don't tell you about. – Billy ONeal Jul 15 '15 at 00:12
  • @Eugene: Which is exactly what my answer says. I understand that the question asked for it; but what the question asks for fundamentally doesn't make any sense. – Billy ONeal Jul 15 '15 at 00:46

Some people have highlighted some good points (and some of my thoughts):

  • It may look insane to disable the random accessor (indexer), but to me it seems fine. You only have to accept that many methods on multithreaded collections can fail, such as the indexer and Delete. You could also define a failure (fallback) action for the write accessor, like "fail" or simply "add at the end".
  • It is not because it is a multithreaded collection that it will always be used in a multithreaded context. It could also be used by only one writer and one reader.
  • Another way to use the indexer safely would be to wrap actions in a lock on the collection using its root (if made public).
  • For many people, making a root lock visible goes against "good practice". I'm not 100% sure about this point, because if it is hidden you take away a lot of flexibility from the user. We always have to remember that multithreaded programming is not for everybody; we can't prevent every kind of wrong usage.
  • Microsoft will have to do some work and define new standards to introduce proper usage of multithreaded collections. First, IEnumerator should not have a MoveNext but rather a GetNext that returns true or false and has an out parameter of type T (that way the iteration would no longer be blocking). Also, Microsoft already uses "using" internally in foreach, but sometimes uses the IEnumerator directly without wrapping it in "using" (a bug in collection view, and probably in more places), even though wrapping usage of IEnumerator is a practice Microsoft itself recommends. This bug removes good potential for a safe iterator: one that locks the collection in its constructor and unlocks it in its Dispose method, giving a blocking foreach.

That is not an answer; these are only comments that don't really fit in a specific place.

... My conclusion: Microsoft has to make some deep changes to foreach to make multithreaded collections easier to use. It also has to follow its own rules for IEnumerator usage. Until then, we can easily write a MultiThreadList that uses a blocking iterator, but it won't follow IList. Instead, you'd have to define your own "IListPersonal" interface that could fail on Insert, Remove and the random accessor (indexer) without throwing. But who will want to use it if it isn't standard?

Eric Ouellet
  • One could easily write a `ConcurrentOrderedBag` which would include a read-only implementation of `IList`, but would also offer a fully-thread-safe `int Add(T value)` method. I don't see why any `ForEach` changes would be needed. Although Microsoft doesn't explicitly say so, their practice suggests that it's perfectly acceptable for `IEnumerator` to enumerate the collection contents that existed when it was created; the collection-modified exception is only required if the enumerator would be unable to guarantee glitch-free operation. – supercat Dec 11 '13 at 16:48
  • Iterating through an MT collection, the way it is designed, could lead, as you said, to an exception... which one, I don't know. Would you trap all exceptions? In my book an exception is an exception and should not occur in the normal execution of code. Otherwise, to prevent exceptions, you have to either lock the collection, get a copy (in a safe manner, i.e. under a lock), or implement a very complex mechanism in the collection to prevent exceptions occurring due to concurrency. My thought was that it would be nice to add an IEnumeratorMT that would lock the collection while a foreach occurs, and add related code... – Eric Ouellet Dec 11 '13 at 21:08
  • The other thing that could occur is that when you get an iterator, you can lock the collection, and when your iterator is GC-collected you can unlock the collection. According to Microsoft they already check whether the IEnumerable is also an IDisposable and, if so, call Dispose at the end of a foreach. The main problem is that they also use the IEnumerable elsewhere without doing so, so you can't rely on that. Having a new, clear MT interface for IEnumerable enabling locking would solve the problem, at least part of it. (It would not prevent people from not calling it.) – Eric Ouellet Dec 11 '13 at 21:14
  • It is very bad form for a public `GetEnumerator` method to leave a collection locked after it returns; such designs can easily lead to deadlock. The `IEnumerable` provides no indication of whether an enumeration can be expected to complete even if a collection is modified; the best one can do is write one's own methods so that they will do so, and have methods which accept `IEnumerable` document the fact that will only be thread-safe if the `IEnumerable` supports thread-safe enumeration. – supercat Dec 11 '13 at 22:29
  • What would have been most helpful would have been if `IEnumerable` had included a "Snapshot" method with return type `IEnumerable`. Immutable collections could return themselves; a bounded collection could if nothing else copy itself to a `List` or `T[]` and call `GetEnumerator` on that. Some unbounded collections could implement `Snapshot`, and those which couldn't would be able to throw an exception without trying to fill up a list with their contents. – supercat Dec 11 '13 at 22:48
  • I mostly agree. Snapshot would be very nice. And locking in "GetEnumerator" could easily lead to what you said, because people don't expect that behavior. I don't claim to have the perfect solution, but an iterator without a lock is dangerous and the implementation should prevent it for sure. The problem with snapshots is performance for big collections. Ideally, I think an MT collection should be implemented in more than one way according to its intended (main) usage: with snapshots (like you said), and also with mechanics enabling modification while iterating (which slows down insert/delete). – Eric Ouellet Dec 12 '13 at 14:37
  • Thanks, supercat, for your comments. I appreciate them. – Eric Ouellet Dec 12 '13 at 14:37
  • Not all collections could implement `Snapshot` quickly, and some couldn't implement it at all, but many collections could implement it far more quickly than could outside code whose only exposure to the type was via public interfaces. For example, a snapshot of an `AppendOnlyList` could encapsulate a reference to the list and the number of items it presently contains; even though the list wouldn't be completely immutable, a list which reports itself as containing N items could guarantee that the first N items are immutable. – supercat Dec 12 '13 at 15:44
  • Right! I like your idea of AppendOnlyList, it would have solved some of my previous needs. – Eric Ouellet Dec 12 '13 at 16:26
  • I think AppendOnlyList and AddOnlyDictionary would have been very nice types, since they can be cheaply implemented in such a way as to allow any number of readers *simultaneous* with a single writer, allow the construction of immutable snapshots, and (in the case of AddOnlyDictionary), promise to enumerate items in the order they were added (guarantees not made by `Dictionary`, `ConcurrentDictionary`, nor `ConcurrentBag`). – supercat Dec 12 '13 at 16:38
  • Totally agreed. I would have used them a few times in the course of my projects. I will consider writing one the next time I need it. Thanks. – Eric Ouellet Dec 17 '13 at 19:51
1

I implemented one similar to Brian's. Mine is different:

  • I manage the array directly.
  • I don't enter the locks within the try block.
  • I use yield return for producing an enumerator.
  • I support lock recursion, which allows reads from the list during iteration.
  • I use upgradable read locks where possible.
  • I provide DoSync and GetSync methods for compound interactions that require exclusive access to the list.

The code:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public class ConcurrentList<T> : IList<T>, IDisposable
{
    private ReaderWriterLockSlim _lock = new ReaderWriterLockSlim(LockRecursionPolicy.SupportsRecursion);
    private int _count = 0;

    public int Count
    {
        get
        { 
            _lock.EnterReadLock();
            try
            {           
                return _count;
            }
            finally
            {
                _lock.ExitReadLock();
            }
        }
    }

    public int InternalArrayLength
    { 
        get
        { 
            _lock.EnterReadLock();
            try
            {           
                return _arr.Length;
            }
            finally
            {
                _lock.ExitReadLock();
            }
        }
    }

    private T[] _arr;

    public ConcurrentList(int initialCapacity)
    {
        _arr = new T[initialCapacity];
    }

    public ConcurrentList():this(4)
    { }

    public ConcurrentList(IEnumerable<T> items)
    {
        _arr = items.ToArray();
        _count = _arr.Length;
    }

    public void Add(T item)
    {
        _lock.EnterWriteLock();
        try
        {       
            var newCount = _count + 1;          
            EnsureCapacity(newCount);           
            _arr[_count] = item;
            _count = newCount;                  
        }
        finally
        {
            _lock.ExitWriteLock();
        }       
    }

    public void AddRange(IEnumerable<T> items)
    {
        if (items == null)
            throw new ArgumentNullException("items");

        _lock.EnterWriteLock();

        try
        {           
            var arr = items as T[] ?? items.ToArray();          
            var newCount = _count + arr.Length;
            EnsureCapacity(newCount);           
            Array.Copy(arr, 0, _arr, _count, arr.Length);       
            _count = newCount;
        }
        finally
        {
            _lock.ExitWriteLock();          
        }
    }

    private void EnsureCapacity(int capacity)
    {   
        if (_arr.Length >= capacity)
            return;

        int doubled;
        checked
        {
            try
            {           
                doubled = _arr.Length * 2;
            }
            catch (OverflowException)
            {
                doubled = int.MaxValue;
            }
        }

        var newLength = Math.Max(doubled, capacity);            
        Array.Resize(ref _arr, newLength);
    }

    public bool Remove(T item)
    {
        _lock.EnterUpgradeableReadLock();

        try
        {           
            var i = IndexOfInternal(item);

            if (i == -1)
                return false;

            _lock.EnterWriteLock();
            try
            {   
                RemoveAtInternal(i);
                return true;
            }
            finally
            {               
                _lock.ExitWriteLock();
            }
        }
        finally
        {           
            _lock.ExitUpgradeableReadLock();
        }
    }

    public IEnumerator<T> GetEnumerator()
    {
        // N.B. this is an iterator method: the read lock is not taken until the
        // first MoveNext, and it is released only when enumeration finishes or
        // the enumerator is disposed (foreach disposes it automatically).
        _lock.EnterReadLock();

        try
        {    
            for (int i = 0; i < _count; i++)
                // deadlocking potential mitigated by lock recursion enforcement
                yield return _arr[i]; 
        }
        finally
        {           
            _lock.ExitReadLock();
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.GetEnumerator();
    }

    public int IndexOf(T item)
    {
        _lock.EnterReadLock();
        try
        {   
            return IndexOfInternal(item);
        }
        finally
        {
            _lock.ExitReadLock();
        }
    }

    private int IndexOfInternal(T item)
    {
        // EqualityComparer<T>.Default copes with null elements,
        // unlike x.Equals(item), which would throw for a null x
        return Array.FindIndex(_arr, 0, _count, x => EqualityComparer<T>.Default.Equals(x, item));
    }

    public void Insert(int index, T item)
    {
        _lock.EnterUpgradeableReadLock();

        try
        {                       
            if (index > _count)
                throw new ArgumentOutOfRangeException("index"); 

            _lock.EnterWriteLock();
            try
            {       
                var newCount = _count + 1;
                EnsureCapacity(newCount);

                // shift everything right by one, starting at index
                Array.Copy(_arr, index, _arr, index + 1, _count - index);

                // insert
                _arr[index] = item;     
                _count = newCount;
            }
            finally
            {           
                _lock.ExitWriteLock();
            }
        }
        finally
        {
            _lock.ExitUpgradeableReadLock();            
        }


    }

    public void RemoveAt(int index)
    {   
        _lock.EnterUpgradeableReadLock();
        try
        {   
            if (index >= _count)
                throw new ArgumentOutOfRangeException("index");

            _lock.EnterWriteLock();
            try
            {           
                RemoveAtInternal(index);
            }
            finally
            {
                _lock.ExitWriteLock();
            }
        }
        finally
        {
            _lock.ExitUpgradeableReadLock();            
        }
    }

    private void RemoveAtInternal(int index)
    {           
        Array.Copy(_arr, index + 1, _arr, index, _count - index - 1);
        _count--;

        // release last element
        Array.Clear(_arr, _count, 1);
    }

    public void Clear()
    {
        _lock.EnterWriteLock();
        try
        {        
            Array.Clear(_arr, 0, _count);
            _count = 0;
        }
        finally
        {           
            _lock.ExitWriteLock();
        }   
    }

    public bool Contains(T item)
    {
        _lock.EnterReadLock();
        try
        {   
            return IndexOfInternal(item) != -1;
        }
        finally
        {           
            _lock.ExitReadLock();
        }
    }

    public void CopyTo(T[] array, int arrayIndex)
    {       
        _lock.EnterReadLock();
        try
        {           
            if(_count > array.Length - arrayIndex)
                throw new ArgumentException("Destination array was not long enough.");

            Array.Copy(_arr, 0, array, arrayIndex, _count);
        }
        finally
        {
            _lock.ExitReadLock();           
        }
    }

    public bool IsReadOnly
    {   
        get { return false; }
    }

    public T this[int index]
    {
        get
        {
            _lock.EnterReadLock();
            try
            {           
                if (index >= _count)
                    throw new ArgumentOutOfRangeException("index");

                return _arr[index]; 
            }
            finally
            {
                _lock.ExitReadLock();               
            }           
        }
        set
        {
            _lock.EnterUpgradeableReadLock();
            try
            {

                if (index >= _count)
                    throw new ArgumentOutOfRangeException("index");

                _lock.EnterWriteLock();
                try
                {                       
                    _arr[index] = value;
                }
                finally
                {
                    _lock.ExitWriteLock();              
                }
            }
            finally
            {
                _lock.ExitUpgradeableReadLock();
            }

        }
    }

    public void DoSync(Action<ConcurrentList<T>> action)
    {
        GetSync(l =>
        {
            action(l);
            return 0;
        });
    }

    public TResult GetSync<TResult>(Func<ConcurrentList<T>,TResult> func)
    {
        _lock.EnterWriteLock();
        try
        {           
            return func(this);
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }

    public void Dispose()
    {   
        _lock.Dispose();
    }
}
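
To illustrate why the DoSync/GetSync pair exists, here is a hedged usage sketch (the variable names are illustrative, not from the answer): compound operations such as "read the count, then act on it" are only atomic if both steps run under a single lock scope, which GetSync and DoSync provide. The lock supports recursion, so the nested Count and RemoveAt calls re-enter it safely.

var list = new ConcurrentList<int>();
list.AddRange(new[] { 1, 2, 3 });

// Atomically read the last element; calling Count and the indexer as two
// separate statements outside GetSync could race with a concurrent Remove.
int last = list.GetSync(l => l[l.Count - 1]);

// Atomically remove the last element, if any.
list.DoSync(l =>
{
    if (l.Count > 0)
        l.RemoveAt(l.Count - 1);
});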
Ronnie Overby
  • 45,287
  • 73
  • 267
  • 346
  • What happens if two threads get into the beginning of the `try` block in `Remove` or the indexer setter at the same time? – James May 20 '15 at 16:07
  • @James that doesn't seem possible. Read remarks at https://msdn.microsoft.com/en-us/library/system.threading.readerwriterlockslim.enterupgradeablereadlock%28v=vs.110%29.aspx. Running this code, you'll never enter that lock a 2nd time: https://gist.github.com/ronnieoverby/59b715c3676127a113c3 – Ronnie Overby May 20 '15 at 21:06
  • @Ronnie Overby: Interesting. Given that, I suspect this would perform much better if you removed the upgradeable read lock from the members where the only work done between entering the upgradeable read lock and the write lock is a range check: the overhead of taking any kind of lock is so much greater than the check itself that just doing the check inside the write lock would likely perform better. – James May 20 '15 at 22:38
  • This class also doesn't seem very useful, as the offset-based functions (most of them) can't really ever be used safely unless there is some external locking scheme anyway because the collection could change between when you decide where to put or get something from and when you actually get it. – James May 20 '15 at 22:39
  • @James So don't use it. – Ronnie Overby May 21 '15 at 11:15
  • 1
    I wanted to go on record saying that I recognize that the usefulness of `IList` semantics in concurrent scenarios is limited at best. I wrote this code probably before I came to that realization. My experience is the same as the writer of the accepted answer: I gave it a try with what I knew about synchronization and IList and I learned something by doing that. – Ronnie Overby May 21 '15 at 17:17
1

The data structures used in sequentially executing code are different from those in (well-written) concurrently executing code. The reason is that sequential code implies an implicit order. Concurrent code, however, does not imply any order; better yet, it implies the lack of any defined order!

Because of this, data structures with an implied order (like List) are not very useful for solving concurrent problems. A list implies order, but it does not clearly define what that order is. As a result, the execution order of the code manipulating the list will determine (to some degree) the implicit order of the list, which is in direct conflict with an efficient concurrent solution.

Remember: concurrency is a data problem, not a code problem! You cannot implement the code first (or rewrite existing sequential code) and get a well-designed concurrent solution. You need to design the data structures first, keeping in mind that implicit ordering doesn't exist in a concurrent system.
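
A small sketch of the point above (hypothetical code, not from the answer): even when every individual Add is perfectly synchronized, the order the list ends up with reflects thread scheduling, not program intent.

using System.Collections.Generic;
using System.Threading.Tasks;

var list = new List<int>();
var gate = new object();

// Every Add is individually safe, yet the interleaving is arbitrary.
Parallel.For(0, 100, i =>
{
    lock (gate)
        list.Add(i);
});

// list.Count is reliably 100, but the elements will generally not appear
// in the order 0..99: the "order" the list implies was never defined.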

Blueprint41
  • 11
  • 1
  • 2
0

A lock-free copy-on-write approach works great if you're not dealing with too many items. Here's a class I wrote:

using System.Collections.Generic;

public class CopyAndWriteList<T>
{
    public static List<T> Clear(List<T> list)
    {
        // no need to copy the contents just to clear them;
        // a fresh empty list is an equivalent result
        return new List<T>();
    }

    public static List<T> Add(List<T> list, T item)
    {
        var a = new List<T>(list);
        a.Add(item);
        return a;
    }

    public static List<T> RemoveAt(List<T> list, int index)
    {
        var a = new List<T>(list);
        a.RemoveAt(index);
        return a;
    }

    public static List<T> Remove(List<T> list, T item)
    {
        var a = new List<T>(list);
        a.Remove(item);
        return a;
    }

}

example usage: orders_BUY = CopyAndWriteList.Clear(orders_BUY);
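
One caveat worth hedging: the reference swap in the usage line above is only safe if a single thread ever publishes the new list. With multiple writers, a compare-and-swap loop is needed so that no writer's copy is lost. A sketch (the `Order` type, the `AddOrder` method, and the field name are illustrative; assumes the helpers above and `System.Threading`):

// Inside some containing class. The loop retries whenever another writer
// replaced the list between our copy and our swap attempt.
private static List<Order> _ordersBuy = new List<Order>();

public static void AddOrder(Order item)
{
    while (true)
    {
        var snapshot = Volatile.Read(ref _ordersBuy);
        var updated = CopyAndWriteList<Order>.Add(snapshot, item);
        if (Interlocked.CompareExchange(ref _ordersBuy, updated, snapshot) == snapshot)
            return; // our swap won; readers now see the new list
    }
}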

Rob The Quant
  • 347
  • 3
  • 15
  • Instead of locking, it creates a copy of the list, modifies the copy, and sets the reference to the new list. So any other threads that are iterating will not run into any problems. – Rob The Quant Apr 03 '19 at 04:27
  • [My answer](https://stackoverflow.com/a/30769838/709537) has a generic solution, and in N.B #1 I mention that you can (and I think should) return [`List.AsReadOnly(modifiedList)`](https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.list-1.asreadonly?redirectedfrom=MSDN&view=netcore-3.1#System_Collections_Generic_List_1_AsReadOnly) in non-generic solutions like yours. That makes it really thread-safe. Of course you will need to change your interface from using `List` to using e.g. `IList`. – Evgeniy Berezovsky Sep 30 '20 at 00:03
-2

I'm surprised no one has mentioned using LinkedList as a base for writing a specialised class.

Often we don't need the full APIs of the various collection classes. If you write mostly functional, side-effect-free code, using immutable classes as far as possible, then you'll actually NOT want to mutate the collection, favouring various snapshot implementations instead.

LinkedList solves some difficult problems around creating snapshot copies/clones of large collections. I also use it to create "thread-safe" enumerators for the collection. I can cheat, because I know I'm not changing the collection in any way other than appending: I keep track of the list size and only lock on changes to the list size. Then my enumerator code simply enumerates from 0 to n for any thread that wants a "snapshot" of the append-only collection, and that snapshot is guaranteed to represent the collection at a single moment in time, regardless of what other threads are appending to the head of the collection.
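
The append-only trick described above can be sketched roughly like this (a hedged reconstruction, not the author's AppendOnly code from the link below): readers freeze the count at the moment they start, so they enumerate a stable prefix while writers keep appending. Already-enumerated nodes are never mutated, which is what makes the unlocked read side work.

using System.Collections.Generic;

public class AppendOnlySketch<T>
{
    private readonly LinkedList<T> _list = new LinkedList<T>();
    private readonly object _writeLock = new object();
    private volatile int _count;   // published after each append

    public void Append(T item)
    {
        lock (_writeLock)          // only size changes are locked
        {
            _list.AddLast(item);   // existing nodes are never touched
            _count++;              // volatile write publishes the new node
        }
    }

    public IEnumerable<T> Snapshot()
    {
        int n = _count;            // volatile read freezes the snapshot size
        var node = _list.First;
        for (int i = 0; i < n; i++)
        {
            yield return node.Value;
            node = node.Next;
        }
    }
}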

I'm pretty certain that most requirements are often extremely simple: you need only 2 or 3 methods. Writing a truly generic library is awfully difficult, but solving your own code's needs can sometimes be easy with a trick or two.

Long live LinkedList and good functional programming.

Cheers, ... love ya all! Al

p.s. sample hack AppendOnly class here : https://github.com/goblinfactory/AppendOnly

snowcode
  • 1,033
  • 10
  • 24
  • Do you have any benchmarks that compare your [`AppendOnlyList`](https://github.com/goblinfactory/AppendOnly) class with what seems like its closest immutable competitor, the [`ImmutableQueue`](https://learn.microsoft.com/en-us/dotnet/api/system.collections.immutable.immutablequeue-1) class? How much faster is your class, and how much less allocatey it is? We need to see the numbers! – Theodor Zoulias Feb 17 '21 at 01:03
  • They solve different problems. In my case I needed an immutable append-only class, and as per my comment, I was easily able to achieve my design goals by writing a few lines of code on top of a linked list, achieving thread safety, concurrency, and immutability. Go check it out; it's just a sample to get you inspired to write something yourself. My comment wasn't about "hey, my code is the best"; it was: hey, why hasn't anyone mentioned LinkedList in any of the discussions? – snowcode Feb 17 '21 at 23:35
  • Your [`AppendOnlyList`](https://github.com/goblinfactory/AppendOnly/blob/master/src/Lib/AppendOnlyList.cs) class supports adding (`AddItem`) and enumerating (`Items`). These two operations are also supported by the `ImmutableQueue` class (`Enqueue`, `GetEnumerator`), so why do you say that these two classes solve different problems? They don't. So the question becomes: which is faster and less allocatey? Do you want me to do the benchmarks? – Theodor Zoulias Feb 17 '21 at 23:41
  • In terms of "allocatey" since it's just a linkedList under the covers, you're really asking "how efficient (less allocatey) is LinkedList". Under the covers I'd wager the ImmutableQueue is also using a LinkedList considering the API. But since my particular use case needs a mutable appendOnly list, I can't use ImmutableQueue and had no reason to spend time comparing them. You're welcome to do it, I'll buy you a coffee if it's faster, or "less allocatey" :D cheers, A – snowcode Feb 17 '21 at 23:44
  • How is it mutable? AFAICS the `AppendOnlyList` has no API that mutates an instance of the class. The `AddItem` method returns a new `AppendOnlyList`, just like the `ImmutableQueue.Enqueue` returns a new `ImmutableQueue`. – Theodor Zoulias Feb 17 '21 at 23:48
  • @TheodorZoulias I'd love a benchmark, that would be totally awesome :D p.s. make sure you include "Length" counts :D (wink) – snowcode Feb 17 '21 at 23:49
  • Snowcode TBH I am pretty confident that the `ImmutableQueue` is at least as fast as your class, and not more allocatey. And because of this I have no incentive to write benchmarks. Honestly I think that it's a work that you should do yourself, in order to prove that your idea is worthy. – Theodor Zoulias Feb 17 '21 at 23:58
  • I am not trying to prove it, I just shared my code as an example of a custom implementation based on LinkedList that's absolutely trivial and encourage folk to consider many other applications of the very versatile and very under-utilised data structure. Yes, ImmutableQueue is also a good base class also to use as a starting block, I never said it wasn't, look at them both. – snowcode Feb 18 '21 at 13:49
  • Snowcode, currently if I have a need for a data structure that supports adding items, preserves their insertion order, and is immutable, I am not going to start searching around about whether the `LinkedList` or the `ArrayList` or the `SortedDictionary` or the `BitArray` is a good starting point for creating a *custom* class having these features. With that mindset I'd never get anything done. Instead I'll pick the data structure that offers these features out of the box, and I'll get on with my project. I don't know if anyone is persuaded by your arguments, but I'm certainly not. – Theodor Zoulias Feb 18 '21 at 14:05
  • I hear you, and you're absolutely right. I think we are talking at cross purposes. The OP cannot use an `ImmutableQueue` as a replacement for `ConcurrentList` for all the reasons the good folk have mentioned above, e.g. insertAt, removeAt etc., which prompted a lot of interesting discussions around both using existing stuff and writing new things. That's why I "mused" over the (interesting to me) question: why has no one mentioned LinkedList? If even only to debunk its use in different situations. Your comments on ImmutableQueue are perfectly valid. – snowcode Feb 18 '21 at 15:39
  • 1
    *"Why has no one mentioned LinkedList?"* This is an impossible question to answer, but if I had to guess I would say it's because nobody likes to use this class in general; most people have never used it, so it doesn't pop up as a solution in their mind when they have a problem to solve. Another possibility is that anyone who tried to use this class to do anything practical was disappointed by its abysmal performance (compared to array-backed structures like the `List`), so they decided not to waste their time again with this class. But these are just my wild guesses, of course. – Theodor Zoulias Feb 18 '21 at 16:31
  • 1
    :D @TheodorZoulias drat... now you've got me curious and I'm definitely going to have to do that perf comparison. nice thread, I'll post an update if/when I get around to it. – snowcode Feb 18 '21 at 17:24