The ConcurrentDictionary<TKey,TValue> collection is surprisingly difficult to master. The pitfalls waiting to trap the unwary are numerous and subtle. Here are some of them:
- Being under the impression that the ConcurrentDictionary<TKey,TValue> blesses everything it contains with thread-safety. That's not true. If TValue is a mutable class, and its instances are allowed to be mutated by multiple threads, they can be corrupted just as easily as if they weren't stored in the dictionary.
- Using the ConcurrentDictionary<TKey,TValue> with patterns familiar from the Dictionary<TKey,TValue>. Race conditions can trivially emerge. For example, if (dict.ContainsKey(x)) list = dict[x] is wrong. In a multithreaded environment it is entirely possible that the key x will be removed between the dict.ContainsKey(x) and the list = dict[x], resulting in a KeyNotFoundException. The ConcurrentDictionary<TKey,TValue> is equipped with special atomic APIs (TryGetValue, GetOrAdd, AddOrUpdate, TryRemove and so on) that should be used instead of the chatty check-then-act pattern.
- Using Count == 0 for checking if the dictionary is empty. The Count property is very cheap for a Dictionary<TKey,TValue>, and very expensive for a ConcurrentDictionary<TKey,TValue>. The correct property to use is IsEmpty.
- Assuming that the AddOrUpdate method can be safely used for updating a mutable TValue object. This is not a correct assumption. The "Update" in the name of the method means "update the dictionary, by replacing an existing value with a new value". It doesn't mean "modify an existing value".
- Assuming that enumerating a ConcurrentDictionary<TKey,TValue> will yield the entries that were stored in the dictionary at the point in time that the enumeration started. That's not true. The enumerator does not maintain a snapshot of the dictionary. The behavior of the enumerator is not documented precisely. It's not even guaranteed that a single enumeration of a ConcurrentDictionary<TKey,TValue> will yield unique keys. If you want an enumeration with snapshot semantics, you must first take a snapshot explicitly with the (expensive) ToArray method, and then enumerate the snapshot. You might even consider switching to an ImmutableDictionary<TKey,TValue>, which is exceptionally good at providing these semantics.
- Assuming that calling extension methods on the ConcurrentDictionary<TKey,TValue>'s interfaces is safe. This is not the case. For example, the ToArray method is safe because it is a method of the class itself. The ToList is not safe because it is a LINQ extension method on the IEnumerable<KeyValuePair<TKey,TValue>> interface. This method internally first calls the Count property of the ICollection<KeyValuePair<TKey,TValue>> interface, and then the CopyTo of the same interface. In a multithreaded environment the Count obtained by the first operation might not be compatible with the second operation, resulting in either an ArgumentException, or a list that contains empty elements at the end.
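As a rough illustration of the check-then-act and AddOrUpdate points in the list above, here is a minimal sketch. It assumes the values are stored as ImmutableList<string> (from System.Collections.Immutable, which may need a package reference on older frameworks), so that an "update" always replaces the stored value instead of mutating it. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Immutable;

// Hypothetical helper class, for illustration only.
public static class AtomicApiSketch
{
    // Replaces "if (dict.ContainsKey(x)) list = dict[x]" with a single atomic lookup.
    // Returns an empty list when the key is absent, instead of throwing.
    public static ImmutableList<string> Lookup(
        ConcurrentDictionary<string, ImmutableList<string>> dict, string key)
    {
        return dict.TryGetValue(key, out var list) ? list : ImmutableList<string>.Empty;
    }

    // Appends an item by swapping in a new immutable list.
    // The list that is already stored in the dictionary is never modified.
    public static void Append(
        ConcurrentDictionary<string, ImmutableList<string>> dict, string key, string item)
    {
        dict.AddOrUpdate(
            key,
            _ => ImmutableList.Create(item),       // key absent: store a new one-item list
            (_, existing) => existing.Add(item));  // key present: store a replacement list
    }
}
```

Both delegates passed to AddOrUpdate are free of side effects here, which matters because the dictionary may invoke them more than once under contention. A caller would use it like AtomicApiSketch.Append(dict, "fruits", "apple") followed by AtomicApiSketch.Lookup(dict, "fruits").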
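The IsEmpty and snapshot points from the list above can be sketched along these lines. The class and method names are made up for the example, and a ConcurrentDictionary<string,int> is assumed only to keep it short.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class SnapshotSketch
{
    // Emptiness check that uses IsEmpty instead of Count == 0.
    public static bool HasNoEntries(ConcurrentDictionary<string, int> dict)
    {
        return dict.IsEmpty;
    }

    // Takes an explicit snapshot and builds the list from it,
    // instead of calling the LINQ ToList directly on the dictionary.
    public static List<KeyValuePair<string, int>> SnapshotToList(
        ConcurrentDictionary<string, int> dict)
    {
        // The variable is typed as ConcurrentDictionary, so this binds to the
        // class's own ToArray method. If it were typed as IEnumerable<...>,
        // the call would bind to the LINQ extension method instead.
        KeyValuePair<string, int>[] snapshot = dict.ToArray();
        return new List<KeyValuePair<string, int>>(snapshot);
    }
}
```

The returned list is detached from the dictionary, so later additions or removals do not affect it, and it can be enumerated as many times as needed.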
In conclusion, migrating from a Dictionary<TKey,TValue> to a ConcurrentDictionary<TKey,TValue> is not trivial. In many scenarios, sticking with the Dictionary<TKey,TValue> and adding synchronization around it might be an easier (and safer) path to thread-safety. IMHO the ConcurrentDictionary<TKey,TValue> should be considered more as a performance optimization over a synchronized Dictionary<TKey,TValue> than as the tool of choice when a dictionary is needed in a multithreaded scenario.
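For comparison, a synchronized Dictionary<TKey,TValue> could look something like the sketch below. It is only one possible shape: the wrapper class, its members and the single-lock design are assumptions made for the example.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper, for illustration only.
// Every access to the dictionary AND to the lists it contains goes through one lock.
public sealed class SynchronizedListMap
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, List<string>> _map =
        new Dictionary<string, List<string>>();

    // Check-then-act is safe here because the whole sequence runs under the lock.
    public void Add(string key, string item)
    {
        lock (_gate)
        {
            if (!_map.TryGetValue(key, out var list))
            {
                list = new List<string>();
                _map[key] = list;
            }
            list.Add(item); // mutating the value is also protected by the lock
        }
    }

    // Returns a copy, so callers never touch the shared list outside the lock.
    public string[] Snapshot(string key)
    {
        lock (_gate)
        {
            return _map.TryGetValue(key, out var list) ? list.ToArray() : new string[0];
        }
    }

    public int Count
    {
        get { lock (_gate) { return _map.Count; } }
    }
}
```

The trade-off is that all threads contend on a single lock, which is where a ConcurrentDictionary<TKey,TValue> can pay off if profiling shows that the lock is a bottleneck.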