TL;DR: Is it possible for a single enumeration of a ConcurrentDictionary to emit the same key twice? Does the current implementation of the ConcurrentDictionary class (.NET 5) allow this possibility?
I have a ConcurrentDictionary<string, decimal> that is mutated by multiple threads concurrently, and I want to periodically copy it to a normal Dictionary<string, decimal> and pass it to the presentation layer for updating the UI. There are two ways to copy it, with and without snapshot semantics:
var concurrent = new ConcurrentDictionary<string, decimal>();
var copy1 = new Dictionary<string, decimal>(concurrent.ToArray()); // Snapshot
var copy2 = new Dictionary<string, decimal>(concurrent); // On-the-go
I am pretty sure that the first approach is safe, because the ToArray method returns a consistent view of the ConcurrentDictionary:
Returns a new array containing a snapshot of key and value pairs copied from the ConcurrentDictionary<TKey,TValue>.
But I would prefer to use the second approach, because it generates less contention.
I am worried though about the possibility of getting an ArgumentException: An item with the same key has already been added.
The documentation doesn't seem to exclude this possibility:
The enumerator returned from the dictionary ... does not represent a moment-in-time snapshot of the dictionary. The contents exposed through the enumerator may contain modifications made to the dictionary after GetEnumerator was called.
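To make the exception mentioned above concrete, here is a minimal sketch of how the Dictionary constructor reacts when the sequence it receives contains the same key twice. The class and method names (DuplicateKeyDemo, ThrowsOnConstruction) are hypothetical and exist only for illustration. The duplicate is supplied by hand here, so this says nothing about whether a ConcurrentDictionary enumeration can actually produce one.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateKeyDemo
{
    // Hypothetical helper: returns true if constructing a Dictionary from
    // the given pairs throws ArgumentException (e.g. due to a repeated key).
    public static bool ThrowsOnConstruction(
        IEnumerable<KeyValuePair<string, decimal>> pairs)
    {
        try
        {
            // The constructor adds each pair in turn; a repeated key makes it throw.
            _ = new Dictionary<string, decimal>(pairs);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

Passing two pairs that share the key "X" should make the helper return true, while a sequence of distinct keys should make it return false.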
Here is the scenario that makes me worried:
- Thread A starts enumerating the ConcurrentDictionary, and the key X is emitted by the enumerator. Then the thread is temporarily suspended by the OS.
- Thread B removes the key X.
- Thread C adds a new entry with the key X.
- Thread A resumes enumerating the ConcurrentDictionary, the enumerator observes the newly added X entry, and emits it.
- The constructor of the Dictionary class attempts to insert the key X twice into the newly constructed Dictionary, and throws an exception.
I tried to reproduce this scenario, without success. But this is not 100% reassuring, because the conditions that could cause this situation to emerge could be subtle. Maybe the values I added didn't have the "right" hashcodes, or didn't generate the "right" number of hashcode collisions. I tried to find an answer by studying the source code of the class, but unfortunately it's too complicated for me to understand.
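For reference, a reproduction attempt along these lines could be sketched as follows. This is only an illustrative stress test, not the exact code I ran: the names (DuplicateKeyStressTest, CountEnumerationsWithDuplicates), the key format and the key count are all assumptions. One task keeps removing and re-adding random keys while the calling thread enumerates the dictionary in a loop and counts the enumerations in which a key appeared twice. A result of zero would not prove that duplicates are impossible, only that this particular run did not hit them.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class DuplicateKeyStressTest
{
    // Hypothetical helper: returns the number of enumerations in which
    // some key was observed more than once during the given duration.
    public static int CountEnumerationsWithDuplicates(TimeSpan duration, int keyCount = 1000)
    {
        var concurrent = new ConcurrentDictionary<string, decimal>();
        for (int i = 0; i < keyCount; i++) concurrent[$"Key{i}"] = i;

        // The token is cancelled automatically once the duration elapses.
        using var cts = new CancellationTokenSource(duration);
        CancellationToken token = cts.Token;

        // Mutator: removes a random key and immediately adds it back.
        Task mutator = Task.Run(() =>
        {
            var random = new Random(0);
            while (!token.IsCancellationRequested)
            {
                string key = $"Key{random.Next(keyCount)}";
                if (concurrent.TryRemove(key, out decimal value))
                    concurrent.TryAdd(key, value + 1);
            }
        });

        // Enumerator: walks the live dictionary (no snapshot) and tracks seen keys.
        int enumerationsWithDuplicates = 0;
        var seen = new HashSet<string>();
        while (!token.IsCancellationRequested)
        {
            seen.Clear();
            foreach (KeyValuePair<string, decimal> pair in concurrent)
            {
                if (!seen.Add(pair.Key))
                {
                    enumerationsWithDuplicates++;
                    break;
                }
            }
        }

        mutator.Wait();
        return enumerationsWithDuplicates;
    }
}
```

Calling DuplicateKeyStressTest.CountEnumerationsWithDuplicates(TimeSpan.FromSeconds(10)) runs the experiment for roughly ten seconds. Varying the key count or the key format changes the hashcode distribution, which is exactly the part I could not be sure I had covered.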
My question is: is it safe, based on the current implementation (.NET 5), to create fast copies of my ConcurrentDictionary by enumerating it directly, or should I code defensively and take a snapshot every time I copy it?
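One way to code defensively without taking a snapshot might look like the sketch below. It enumerates the ConcurrentDictionary directly, but fills the copy through the indexer instead of the constructor, so a repeated key would overwrite the earlier value rather than throw. The names (ConcurrentDictionaryCopy, CopyTolerant) are hypothetical, and the copy is still not a moment-in-time view of the source.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class ConcurrentDictionaryCopy
{
    // Hypothetical helper: copies by direct enumeration (no snapshot), using
    // the indexer so that a repeated key overwrites instead of throwing.
    public static Dictionary<string, decimal> CopyTolerant(
        ConcurrentDictionary<string, decimal> source)
    {
        var copy = new Dictionary<string, decimal>();
        foreach (KeyValuePair<string, decimal> pair in source)
        {
            copy[pair.Key] = pair.Value; // the last value seen for a key wins
        }
        return copy;
    }
}
```

This sidesteps the ArgumentException, but it does not answer the question of whether the enumeration itself can emit a key twice.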
Clarification: I would agree with anyone who says that relying on an API's undocumented implementation details is unwise. But alas, that is what this question is all about. It's an educational question, asked out of curiosity. I do not intend to use the acquired knowledge in production code, I promise.