10

I want to use a ConcurrentDictionary in my app, but first I need to make sure I understand correctly how it works. In my app, I'll have one or more threads that write to, or delete from, the dictionary. And, I'll have one or more threads that read from the dictionary. Potentially, all at the same time.

Am I correct that the implementation of ConcurrentDictionary takes care of all the required locking for this to happen, and I don't need to provide my own locking? In other words, if one thread is writing to, or deleting from, the dictionary, a reading thread (or another write thread) will be blocked until the update or delete is finished?

Thanks very much.

Randy Minder
  • 47,200
  • 49
  • 204
  • 358
  • 1
    What does the documentation for `ConcurrentDictionary` say on the subject? I would imagine the summary of the object would discuss whether or not it is thread safe, and a degree of how safe it is. – Servy Aug 15 '12 at 13:58

2 Answers2

8

The current implementation uses a mixture of striped locks (the technique I suggested in an answer to someone yesterday at https://stackoverflow.com/a/11950835/400547) and thinking very very hard about the situations in which an operation cannot possibly cause problems for or have problems cause by, a concurrent operation (there's quite a lot of these, but you have to be very sure if you make use of them).

As such if you have several operations happening on the concurrent dictionary at once, each of the following is possible:

  1. No threads even lock, but everything happens correctly.
  2. Some threads lock, but they lock on separate things, and there is no lock contention.
  3. One or two threads have lock contention with each other, and are slowed down, but the effect upon performance is less than if there were a single lock.
  4. One or two threads need to lock the entire thing for a while (generally for internal resizing) which blocks all the threads that could possibly be blocked in case 3 above, though some can keep going (those that read).

None of this involves dirty reads, which is a matter only vaguely related to locking (my own form of concurrent dictionary uses no locks at all, and it doesn't have dirty reads either).

This thread-safety doesn't apply to batches done by your code (if you read a value and then write a value, the value read may have changed before you finished the write), but note that some common cases which would require a couple of calls on Dictionary are catered for by single methods on ConcurrentDictionary (GetOrAdd and AddOrUpdate do things that would be two calls with a Dictionary so they can be done atomically - though note that the Func involved in some overloads may be called more than once).

Due to this, there's no added danger with ConcurrentDictionary, so you should pick as follows:

If you're going to have to lock over some batches of operations that don't match what ConcurrentDictionary offers like e.g.:

lock(lockObj)
{
  var test = dict[key1];
  var test2 = dict[key2];
  if(test < test2 && test2 < dict[key3] && SomeOtherBooleanProducer())
    dict[key4] = SomeFactoryCall(key4);
}

Then you would have to lock on ConcurrentDictionary, and while there may be a way to combine that with what it offers in the way of support for concurrency, there probably won't, so just use Dictionary with a lock.

Otherwise it comes down to how much concurrent hits there will probably be. If you're mostly only going to have one thread hitting the dictionary, but you need to guard against the possibility of concurrent access, then you should definitely go for Dictionary with a lock. If you're going to have periods where half a dozen or more threads are hitting the dictionary, then you should definitely go for ConcurrentDictionary (if they're likely to be hitting the same small number of keys then take a look at my version because that's the one situation where I have better performance).

Just where the middle point between "few" and "many" threads lies, is hard to say. I'd say that if there are more than two threads on a regular basis then go with ConcurrentDictionary. If nothing else, demands from concurrency tend to increase throughout the lifetime of a project more often than they decrease.

Edit: To answer about the particular case you give, of one writer and one reader, there won't be any blocking at all, as that is safe for roughly the same reason why multiple readers and one writer is safe on Hashtable, though ConcurrentDictionary goes beyond that in several ways.

Community
  • 1
  • 1
Jon Hanna
  • 110,372
  • 10
  • 146
  • 251
6

In other words, if one thread is writing to, or deleting from, the dictionary, a reading thread (or another write thread) will be blocked until the update or delete is finished?

I don't believe it will block - it will just be safe. There won't be any corruption - you'll just have a race in terms of whether the read sees the write.

From a FAQ about the lock-free-ness of the concurrent collections:

ConcurrentDictionary<TKey,TValue> uses fine-grained locking when adding to or updating data in the dictionary, but it is entirely lock-free for read operations. In this way, it’s optimized for scenarios where reading from the dictionary is the most frequent operation.

Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • Thank you. It sounds like it's possible to have 'dirty reads' (using SQL terms). For my app, this is ok. I guess if I wanted to completely avoid this possibility, I wouldn't use a ConcurrentDictionary, I would use the non-concurrent one and implement my own locking. – Randy Minder Aug 15 '12 at 13:03
  • 1
    @RandyMinder: No, I don't think dirty reads are possible - even with locking you'd still have the data race, wouldn't you? Either the read happens before the write, or it happens after. All the lock-free-ness means is that a write which *starts* before a read could still not be seen before the read. A write which *ends* before the read starts should always be visible, I believe. – Jon Skeet Aug 15 '12 at 13:05
  • 1
    I could test this easily enough to find out. But, for my purposes, it's a non-issue. The way ConcurrentDictionary is implemented will work for me. Do you have plans to be a CodeMash 2013 next year (in Sandusky, OH)? – Randy Minder Aug 15 '12 at 13:08
  • @RandyMinder: Hoping to be, yes. We'll see :) – Jon Skeet Aug 15 '12 at 13:10
  • 4
    @RandyMinder `ConcurrentDictionary` is only guaranteed safe for *its* operations. you can use `ConcurrentDictionary` in ways that are not thread-safe for your code. e.g. `foreach(var pair in concurrentDictionary) DoSomethingWithPair(pair);` may or may not be thread-safe in your app. – Peter Ritchie Aug 15 '12 at 13:14
  • @PeterRitchie You could say that all of it's operations are atomic, which is much more precise than saying that they're "thread safe", because [thread safe doesn't really tell you much](http://blogs.msdn.com/b/ericlippert/archive/2009/10/19/what-is-this-thing-you-call-thread-safe.aspx). – Servy Aug 15 '12 at 14:01
  • @Servy I never said they were "thread-safe", I just reused Jon's "safe". – Peter Ritchie Aug 15 '12 at 14:03
  • @PeterRitchie I was agreeing with what you were trying to say, and suggesting a more appropriate term which more accurately describes the point you were making because the term 'safe' is just too vague. – Servy Aug 15 '12 at 14:06
  • @Servy I agree that "thread-safe" is contextual and in the case of `ConcurrentDictionary` only applies to `ConcurrentDictionary`. Which is why I purposely didn't use "thread safe". I trust Jon used "safe" for specific reasons and re-used it not to question it's vagueness. I don't think I'd even go so far as to say `ConcurrentDictionary` operations are "atomic" either. Jon's link says `ConcurrentDictionary` uses fine-grained locking for add/update. I don't know what they do, so I can't say that means "atomic"--they're not documented as "atomic". – Peter Ritchie Aug 15 '12 at 14:18
  • @PeterRitchie They're logically atomic, which is what Jon explained. You can be sure that, from an external point of view, if you call a number of operations "at the same time" you can be sure that all externally visible effects of one of them will happen first, then all externally visible effects of another one will happen next, and so on. If, internally, some of the internal, intermediate operations happen to overlap it will be done so such that there is no way to know that externally. The implementation allows you to treat each operation as if it were fully atomic. – Servy Aug 15 '12 at 14:23
  • Your belief is correct, in that it definitely won't block in that specific situation, though it does block in some others. – Jon Hanna Aug 15 '12 at 14:36