13

Consider a simple Registry class accessed by multiple threads:

public class Registry
{
    protected readonly Dictionary<int, string> _items = new Dictionary<int, string>();
    protected readonly object _lock = new object();

    public void Register(int id, string val)
    {
        lock(_lock)
        {
           _items.Add(id, val);
        }
    }

    public IEnumerable<int> Ids
    {
        get
        {
            lock (_lock)
            {
                return _items.Keys;
            }
        }
    }
}

and typical usage:

var ids1 = _registry.Ids;//execution deferred until line below
var ids2 = ids1.Select(p => p).ToArray();

This class is not thread safe as it's possible to receive System.InvalidOperationException

Collection was modified; enumeration operation may not execute.

when ids2 is assigned if another thread calls Register as the execution of _items.Keys is not performed under the lock!

This can be rectified by modifying Ids to return an IList:

public IList<int> Ids
    {
        get
        {
            lock (_lock)
            {
                return _items.Keys.ToList();
            }
        }
    }

but then you lose a lot of the 'goodness' of deferred execution, for example

var ids = _registry.Ids.First();  //much slower!

So,
1) In this particular case are there any thread-safe options that involve IEnumerable
2) What are some best practices when working with IEnumerable and locks ?

wal
  • 17,409
  • 8
  • 74
  • 109
  • 3
    The problem is not the return value of your Property IDs, its that you refer to Dictionary.Keys which is a shared instance (each call on the same dictionary instance returns the same key collection instance). Either make a copy like your second implementation or use the ConcurrentDictionary class – Polity Feb 01 '12 at 13:15
  • 4
    Just use `ConcurrentDictionary` instead and dump the locking altogether: http://msdn.microsoft.com/en-us/library/dd287191.aspx – LukeH Feb 01 '12 at 13:15
  • 1
    Note that you're returing the key collection of the dictionary in your first snippet, it's not a streaming sequence. It's not that much unlike your List version in that respect. – Anthony Pegram Feb 01 '12 at 13:17
  • And, by the way, your second version isn't threadsafe either. There's no lock in your `Register` method, so nothing stopping an item being added to the list in the middle of the `ToList` iteration inside your `Ids` property. – LukeH Feb 01 '12 at 13:18
  • @LukeH re: no lock on `Register`, duly noted, this is just a spike. updated code – wal Feb 01 '12 at 13:20

3 Answers3

9

When your Ids property is accessed then the dictionary cannot be updated, however there is nothing to stop the Dictionary from being updated at the same time as LINQ deferred execution of the IEnumerator<int> it got from Ids.

Calling .ToArray() or .ToList() inside the Ids property and inside a lock will eliminate the threading issue here so long as the update of the dictionary is also locked. Without locking both update of the dictionary and ToArray(), it is still possible to cause a race condition as internally .ToArray() and .ToList() operate on IEnumerable.

In order to resolve this you need to either take the performance hit of ToArray inside a lock, plus lock your dictionary update, or you can create a custom IEnumerator<int> that itself is thread safe. Only through control of iteration (and locking at that point), or through locking around an array copy can you achieve this.

Some examples can be found below:

wal
  • 17,409
  • 8
  • 74
  • 109
Dr. Andrew Burnett-Thompson
  • 20,980
  • 8
  • 88
  • 178
5

Just use ConcurrentDictionary<TKey, TValue>.

Note that ConcurrentDictionary<TKey, TValue>.GetEnumerator is thread-safe:

The enumerator returned from the dictionary is safe to use concurrently with reads and writes to the dictionary

jason
  • 236,483
  • 35
  • 423
  • 525
  • This is *the* answer, assuming the OP has .NET4 at their disposal. – LukeH Feb 01 '12 at 13:54
  • 1
    That's because it probably creates a copy of the collection? OP already knows that this is a fail-safe solution. – vgru Feb 01 '12 at 13:58
  • i dont actually, limited to .net3.5, i have looked at the source code for `ConcurrentDictionary.Keys` and it returns a new List for each call (safely obviously) so I guess I will do the same under the lock. Whats not great about that is that it is slower but this may be the trade-off for thread-safety – wal Feb 01 '12 at 13:59
  • +1 fair enough, I've just read it too. I don't think I like it, though; it's like plain `GetEnumerator` with locking, but without the "was collection modified?" check. – vgru Feb 01 '12 at 14:01
  • 1
    @Groo: (Sorry, accidentally deleted my comment.) `GetEnumerator` does not copy the collection. Per the documentation: "[`GetEnumerator`] does not represent a moment-in-time snapshot of the dictionary. The contents exposed through the enumerator may contain modifications made to the dictionary after GetEnumerator was called." – jason Feb 01 '12 at 14:03
  • @wal: Then get the Reactive extensions. They backported `ConcurrentDictionary`. – jason Feb 01 '12 at 14:12
  • I wonder if this method might be prone to deadlocks when enumerator is not disposed as @LukeH mentioned below. It certainly looks so, I will check inside reflector. – vgru Feb 01 '12 at 15:30
1

If you use yield return inside the property, then compiler will ensure that the lock is taken on first call to MoveNext(), and released when the enumerator is disposed.

I would avoid using this, however, because poorly implemented caller code might forget to call Dispose, creating a deadlock.

public IEnumerable<int> Ids     
{
    get      
    {
        lock (_lock)             
        {
            // compiler wraps this into a disposable class where 
            // Monitor.Enter is called inside `MoveNext`, 
            // and Monitor.Exit is called inside `Dispose`

            foreach (var item in _items.Keys)
               yield return item;
        }         
    }     
} 
vgru
  • 49,838
  • 16
  • 120
  • 201
  • 2
    I dont believe it does. i've just tested this with my spike and it fails also. the yield return is only hit when assigning `ids2` in the example above, ie, outside the lock – wal Feb 01 '12 at 13:32
  • @Groo no, it will release the lock as soon as it exits the method. the method itself just returns a new enumerable. – Polity Feb 01 '12 at 13:47
  • @Groo: The compiler *does not* release the lock between each yield. It just doesn't take the lock until the first `MoveNext` call on the enumerator object. Which itself leads to another problem with this code: A malicious and/or stupid caller could just do `Ids.GetEnumerator().MoveNext()` and that lock is then held forever; the object becomes unusable. – LukeH Feb 01 '12 at 13:48
  • 1
    @Polity: This particular method is an iterator, so the "method" doesn't actually exit until the calling code has finished enumerating the sequence. (And, similarly, the "method" doesn't actually start executing until the first call to the enumerator's `MoveNext` method.) – LukeH Feb 01 '12 at 13:50
  • @Polity: it *is* different than simply returning the reference, because it *locks* during iteration. Like LukeH wrote, I also thought the it keeps the lock during iteration (that makes sense), but I saw [this answer](http://stackoverflow.com/a/2847675/69809) and misinterpreted the results (I've glanced over the code and didn't notice that there is no threading involved). Yes, this is thread safe for concurrent access, but prone to deadlocks for badly implemented callers. – vgru Feb 01 '12 at 13:56
  • @wal: how did it fail? I've just tested the first snippet and it seems to be working without problems. – vgru Feb 01 '12 at 14:12
  • @Groo 'collection was modified' ie _items.Keys is modified when calling `Register` whilst ids2 is assigned. Can you or @LukeH elaborate on what Luke said, ie >It just doesn't take the lock until the first MoveNext call on the enumerator object. How does it magically 'look up' to take that lock ? – wal Feb 01 '12 at 14:17
  • @wal: it means that the lock will be held during iteration, but it will only be acquired when the iteration starts (lazy evaluation). There shouldn't be any thread-safety issues with that, but the releasing of the lock is done when enumerator is disposed, so one needs to take care of that if you are iterating explicitly (using `MoveNext`/`Current`). If you use LINQ methods, they all dispose it properly, so it should work. I will add a test example which I just ran, and there were no problems. (note that `Register` method still needs a lock, as in your last edit). – vgru Feb 01 '12 at 14:22
  • @Groo yes i understand that but *how* does it know to lock(_object) again given the execution is past this point? – wal Feb 01 '12 at 14:28
  • @wal: there is a lot going on behind the scenes with `yield return`. You can google it, there are several good articles. The point is that compiler generates a hidden class inside `Registry`, which implements `IEnumerable` and returns this hidden class' instance immediately. When you start iterating using LINQ or a `foreach` method, your code will first access the `MoveNext()` method of the returned hidden object. Inside **that** method, there is a call to `Monitor.Enter`, which acquires the lock. Once external code is done iterating, it needs to call `Dispose()`, where lock is released. – vgru Feb 01 '12 at 14:38
  • @wal: the hidden class instance also holds the reference to your `Registry` class and can therefore acquire the same `_lock` instance used throughout `Registry`. – vgru Feb 01 '12 at 14:42
  • @Groo confirmed your solution works ok, i hadnt updated the locking on the `Register` method – wal Feb 02 '12 at 04:19