3

I'm currently implementing a thread-safe dictionary in C# which uses immutable AVL trees as buckets internally. The idea is to provide fast read access without a lock because in my application context, we add entries to this dictionary only at startup and afterwards, values are mostly read (but there still are a few number of writes).

I've structured my TryGetValue and GetOrAdd methods in the following way:

public sealed class FastReadThreadSafeDictionary<TKey, TValue> where TKey : IEquatable<TKey>
{
    private readonly object _bucketContainerLock = new object();
    private ImmutableBucketContainer<TKey, TValue> _bucketContainer;

    public bool TryGetValue(TKey key, out TValue value)
    {
        var bucketContainer = _bucketContainer;
        return bucketContainer.TryFind(key.GetHashCode(), key, out value);
    }

    public bool GetOrAdd(TKey key, Func<TValue> createValue, out TValue value)
    {
        createValue.MustNotBeNull(nameof(createValue));
        var hashCode = key.GetHashCode();
        lock (_bucketContainerLock)
        {
            ImmutableBucketContainer<TKey, TValue> newBucketContainer;
            if (_bucketContainer.GetOrAdd(hashCode, key, createValue, out value, out newBucketContainer) == false)
                return false;

            _bucketContainer = newBucketContainer;
            return true;
        }
    }

    // Other members omitted for sake of brevity
}

As you can see, I don't use a lock in TryGetValue because reference assignment in .NET runtimes is an atomic operation by design. By copying the reference of the field _bucketContainer to a local variable, I'm sure I can safely access the instance because it is immutable. In GetOrAdd, I use a lock to access the private _bucketContainer so I can ensure that a value is not created twice (i.e. if two or more threads are trying to add a value, only one can actually create a new ImmutableBucketContainer with the added value because of the lock).

I use Microsoft Chess for testing concurrency and in one of my tests, MCUT (Microsoft Concurrency Unit Testing) reports a data race in GetOrAdd when I exchange the new bucket container with the old one:

[DataRaceTestMethod]
public void ReadWhileAdd()
{
    var testTarget = new FastReadThreadSafeDictionary<int, object>();
    var writeThread = new Thread(() =>
                                 {
                                     for (var i = 5; i < 10; i++)
                                     {
                                         testTarget.GetOrAdd(i, () => new object());
                                         Thread.Sleep(0);
                                     }
                                 });
    var readThread = new Thread(() =>
                                {
                                    object value;
                                    testTarget.TryGetValue(5, out value);
                                    Thread.Sleep(0);
                                    testTarget.TryGetValue(7, out value);
                                    Thread.Sleep(10);
                                    testTarget.TryGetValue(9, out value);
                                });
    readThread.Start();
    writeThread.Start();
    readThread.Join();
    writeThread.Join();
}

MCUT reports the following message:

23> Test result: DataRace 23> ReadWhileAdd() (Context=, TestType=MChess): [DataRace]Found data race at GetOrAdd:FastReadThreadSafeDictionary.cs(68)

which is the assignment _bucketContainer = newBucketContainer; in GetOrAdd.

My actual question is: why is the assignment _bucketContainer = newBucketContainer a race condition? Threads currently executing TryGetValue always make a copy of the _bucketContainer field and thus shouldn't be bothered with the update (except that the searched value might be added to the _bucketContainer just after the copy takes place, but this doesn't matter with the data race). And in GetOrAdd, there is an explicit lock to prevent concurrent access. Is this a bug in Chess or am I missing something very obvious?

Community
  • 1
  • 1
feO2x
  • 5,358
  • 2
  • 37
  • 46
  • 1
    You didn't use a volatile read and you might need a memory barrier between constructing the new state and assigning it to the field. Not sure about the .net 2.0 memory model, but I think both of these are necessary in the ECMA memory model. – CodesInChaos Sep 22 '16 at 08:11
  • @CodesInChaos I've added a `Volatile.Read` call to `TryGetValue` and it makes the test pass (thanks!). Still, I don't get why this is a problem because `Volatile.Read` just ensures that the value is read from memory and not from a CPU register that might cache it. As the bucket container itself is immutable, why would this result in a race condition? `TryGetValue` might just use an old version of it in some cases, but overall, the performance should be quite better than with `Volatile.Read`. – feO2x Sep 22 '16 at 08:17
  • 1
    I also don't understand why you'd get a race condition in this situation (given the C# language guarantee that reference reads and writes are atomic). You could get a stale value (for example, if the reference had been updated in a register but not yet copied back to main memory) but that's not really a race condition in the context that you are using the field. – Matthew Watson Sep 22 '16 at 08:34
  • 1
    @MatthewWatson The issue isn't about atomicity, but about reordering. The ECMA memory model allows a lot of reordering, the .net 2.0 model mirrors x86 and allows less, but there is still reordering. In weak models you need one memory fence between creating the object and publishing it and another between reading it and using it (the latter is implicitly part of the volatile read, I believe). Otherwise the reader could see an incompletely constructed version of the object despite the pointer to it only getting published after complete construction. – CodesInChaos Sep 22 '16 at 08:53
  • 1
    @CodesInChaos In the OP's code a reference to the new instance of `ImmutableBucketContainer` is returned from `_bucketContainer.GetOrAdd()`. Are you saying that this reference could subsequently be assigned to `_bucketContainer` such that another thread which is (atomically) copying `_bucketContainer` could end up with a reference to an object for which the constructor hadn't completely run? – Matthew Watson Sep 22 '16 at 09:19
  • @MatthewWatson I think that CPUs might try to reorder the `_bucketContainer.GetOrAdd` call and the `_bucketContainer = newBucketContainer;` assignment - however, are they even allowed to do that, as `newBucketContainer` is set as an `out` parameter of `GetOrAdd`? I highly doubt that as, in the end, this would mean that they reorder instructions residing in different methods. – feO2x Sep 22 '16 at 09:32
  • does it still find the race if you lock(_bucketContainerLock) around var bucketContainer = _bucketContainer; return bucketContainer.TryFind(key.GetHashCode(), key, out value); – Jason Hernandez Oct 14 '16 at 21:23
  • @JasonHernandez It will not race if I put a lock in `TryGetValue` - however, that's exactly what I'm trying to avoid. I want non-locking read access to this dictionary, that's why `ImmutableBucketContainer` is immutable (as the name says): I'm guaranteed to have an unchanging instance when I read the value from the field to the variable. I actually solved it by using a volatile read as proposed by @CodesInChaos. – feO2x Oct 16 '16 at 05:01

1 Answers1

0

As mentioned by @CodesInChaos in the comments of the question, I missed a volatile read in TryGetValue. The method now looks like this:

public bool TryGetValue(TypeKey typeKey, out TValue value)
{
    var bucketContainer = Volatile.Read(ref _bucketContainer);
    return bucketContainer.TryFind(typeKey, out value);
}

This volatile read is necessary because different threads accessing this dictionary might cache data and reorder instructions independently from each other, which might lead to a data race. Additionally, the CPU architecture that is running the code also matters, e.g. x86 and x64 processors perform volatile reads by default, while this might not be true for other architectures like ARM or Itanium. That's why the read access has to be synchronized with other threads using a Memory Barrier, which is performed internally in Volatile.Read (note that lock statements also use memory barriers internally). Joseph Albahari wrote a comprehensive tutorial on this here: http://www.albahari.com/threading/part4.aspx

feO2x
  • 5,358
  • 2
  • 37
  • 46