151

I couldn't find enough information on ConcurrentDictionary types, so I thought I'd ask about it here.

Currently, I use a Dictionary to hold all users that is accessed constantly by multiple threads (from a thread pool, so no exact amount of threads), and it has synchronized access.

I recently found out that there was a set of thread-safe collections in .NET 4.0, and it seems to be very pleasing. I was wondering, what would be the 'more efficient and easier to manage' option, as I have the option between having a normal Dictionary with synchronized access, or have a ConcurrentDictionary which is already thread-safe.

Reference to .NET 4.0's ConcurrentDictionary

Luke Girvin
TheAJ
  • 3
    You may want to see [this code project article on the subject](http://www.codeproject.com/Articles/548406/Dictionary-plus-Locking-versus-ConcurrentDictionar) – nawfal Nov 08 '13 at 07:18

7 Answers

165

A thread-safe collection vs. a non-threadsafe-collection can be looked upon in a different way.

Consider a store with no clerk, except at checkout. You have a ton of problems if people don't act responsibly. For instance, if a customer takes a can from a pyramid of cans while a clerk is still building the pyramid, all hell would break loose. Or, what if two customers reach for the same item at the same time, who wins? Will there be a fight? This is a non-threadsafe collection. There are plenty of ways to avoid problems, but they all require some kind of locking, or rather explicit access control in some way or another.

On the other hand, consider a store with a clerk at a desk, and you can only shop through him. You get in line, and ask him for an item, he brings it back to you, and you go out of the line. If you need multiple items, you can only pick up as many items on each roundtrip as you can remember, but you need to be careful to avoid hogging the clerk, this will anger the other customers in line behind you.

Now consider this. In the store with one clerk, what if you get all the way to the front of the line, and ask the clerk "Do you have any toilet paper", and he says "Yes", and then you go "Ok, I'll get back to you when I know how much I need", then by the time you're back at the front of the line, the store can of course be sold out. This scenario is not prevented by a threadsafe collection.

A threadsafe collection guarantees that its internal data structures are valid at all times, even if accessed from multiple threads.

A non-threadsafe collection does not come with any such guarantees. For instance, if you add something to a binary tree on one thread, while another thread is busy rebalancing the tree, there's no guarantee the item will be added, or even that the tree is still valid afterwards, it might be corrupt beyond hope.

A threadsafe collection does not, however, guarantee that sequential operations on the thread all work on the same "snapshot" of its internal data structure, which means that if you have code like this:

if (tree.Count > 0)
    Debug.WriteLine(tree.First().ToString());

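you might get a NullReferenceException because, in between tree.Count and tree.First(), another thread has cleared out the remaining nodes in the tree, which means First() will return null. (If First() is LINQ's Enumerable.First, it throws an InvalidOperationException on an empty sequence instead; either way, the Count check did not protect the call.)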

For this scenario, you need to see whether the collection in question offers a safe, single-call way to get what you want. If it does, rewrite the code above to use it; if it doesn't, you need to lock around both operations.
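As a rough illustration of the "you might need to lock" option, here is a minimal sketch that performs the check and the read under one lock. The names SharedTree and TryGetFirst are made up for this example, and SortedSet<int> merely stands in for the "tree" in the snippet above; it is not taken from the original code.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper, for illustration only.
public sealed class SharedTree
{
    private readonly object _gate = new object();
    private readonly SortedSet<int> _tree = new SortedSet<int>();

    public void Add(int value)
    {
        lock (_gate) { _tree.Add(value); }
    }

    public void Clear()
    {
        lock (_gate) { _tree.Clear(); }
    }

    // The Count check and the read happen under the same lock,
    // so no other thread can empty the set in between.
    public bool TryGetFirst(out int first)
    {
        lock (_gate)
        {
            if (_tree.Count > 0)
            {
                first = _tree.Min;
                return true;
            }
            first = default(int);
            return false;
        }
    }
}
```

Callers then write `if (tree.TryGetFirst(out value)) ...` instead of checking Count themselves, which keeps the compound operation in one place.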

Lasse V. Karlsen
  • 9
    Every time I read this article, even though it's all true, I feel that somehow we are going down the wrong path. – scope_creep Sep 08 '10 at 21:43
  • 23
    The main problem in a thread-enabled application is mutable data structures. If you can avoid that, you'll save yourself heaps of trouble. – Lasse V. Karlsen Sep 09 '10 at 07:20
  • Apparently dictionaries can have some really nasty threading side-effects: http://stackoverflow.com/questions/14838032/asp-net-hang-generic-dictionary-concurrency-issues-causes-gc-deadlock – jocull Feb 25 '13 at 21:08
  • 1
    @LasseV.Karlsen: Threading is generally of limited usefulness in the complete absence of shared mutable state, unless everything a thread does can be completely described by its return value. The key to making threading work is not to make everything immutable, but rather to have every shared object either have immutable state or immutable identity. If one thread changes the identity of an object, the only way another thread can find out about it is if its identity is encapsulated in the state of some other object whose identity is immutable. – supercat Apr 26 '13 at 19:28
  • 126
    This is a nice explanation of thread safety, however I can't help but feel like this didn't actually address the OP's question. What the OP asked (and what I subsequently came across this question looking for) was the difference between using a standard `Dictionary` and handling the locking yourself vs using the `ConcurrentDictionary` type that is built into .NET 4+. I'm actually a little bit baffled that this was accepted. – mclark1129 Apr 29 '13 at 18:35
  • 4
    Thread-safety is more than just using the right collection. Using the right collection is a start, not the only thing you have to deal with. I guess that is what the OP wanted to know. I can't guess why this was accepted, it's been a while since I wrote this. – Lasse V. Karlsen Apr 29 '13 at 19:16
  • 3
    I think what @LasseV.Karlsen is trying to say, is that...thread safety is easy to achieve, but you must supply a level of concurrency in how you handle that safety. Ask yourself this, "Can I do more operations at the same time while keeping the object thread-safe?" Consider this example: Two customers want to return different items. When you, as the manager, make the trip to go restock your store, can you not just restock both items in one trip and still handle both customers' requests, or are you going to make a trip each time a customer returns an item? – Alexandru Feb 06 '14 at 18:33
83

You still need to be very careful when using thread-safe collections because thread-safe doesn't mean you can ignore all threading issues. When a collection advertises itself as thread-safe, it usually means that it remains in a consistent state even when multiple threads are reading and writing simultaneously. But that does not mean that a single thread will see a "logical" sequence of results if it calls multiple methods.

For example, if you first check if a key exists and then later get the value that corresponds to the key, that key may no longer exist even with a ConcurrentDictionary version (because another thread could have removed the key). You still need to use locking in this case (or better: combine the two calls by using TryGetValue).
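To make the difference concrete, here is a small sketch contrasting the two approaches. The UserLookup class and its method names are hypothetical; only ContainsKey, the indexer and TryGetValue are real ConcurrentDictionary members.

```csharp
using System.Collections.Concurrent;

// Hypothetical helper, for illustration only.
public static class UserLookup
{
    // Racy: another thread may remove the key between ContainsKey and the
    // indexer, in which case the indexer throws KeyNotFoundException.
    public static string GetNameRacy(ConcurrentDictionary<int, string> users, int id)
    {
        if (users.ContainsKey(id))
            return users[id];
        return null;
    }

    // Safe: the existence check and the read are a single call.
    public static string GetNameSafe(ConcurrentDictionary<int, string> users, int id)
    {
        string name;
        return users.TryGetValue(id, out name) ? name : null;
    }
}
```

Both methods return the same result when no other thread interferes; only the second one stays correct when a concurrent remove happens at the wrong moment.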

So do use them, but don't think that it gives you a free pass to ignore all concurrency issues. You still need to be careful.

Mark Byers
  • 4
    So basically you are saying if you don't use the object correctly it won't work? – ChaosPandion Dec 22 '09 at 21:12
  • Why is that? The collection loses the change, somehow? – Bruno Brant Dec 22 '09 at 21:13
  • 4
    Basically you need to use TryAdd instead of the common Contains -> Add. – ChaosPandion Dec 22 '09 at 21:14
  • 3
    @Bruno: No. If you haven't locked, another thread may have removed the key between the two calls. – Mark Byers Dec 22 '09 at 21:15
  • 13
    Don't mean to sound curt here, but `TryGetValue` has always been part of `Dictionary` and has always been the recommended approach. The important "concurrent" methods that `ConcurrentDictionary` introduces are `AddOrUpdate` and `GetOrAdd`. So, good answer, but could have picked a better example. – Aaronaught Dec 27 '09 at 17:01
  • 4
    Right - unlike a database, which has transactions, most concurrent in-memory search structures lack the concept of *isolation*; they only guarantee that the internal data structure is correct. – Chang May 08 '12 at 06:13
  • @all I am only adding keys to a concurrent dictionary; there is no need to remove keys at all. So I am using `TryGetValue` and `TryAdd`. Could this cause any issues in the future? – kbvishnu Oct 23 '12 at 14:16
  • An example would have been nice. – T.Todua Jun 04 '20 at 09:12
  • "But that does not mean that a single thread will see a "logical" sequence of results if it calls multiple methods" +1 – Jim Aho Sep 12 '22 at 06:33
48

Internally, ConcurrentDictionary uses striped locking: a set of locks, each guarding a group of hash buckets, rather than one lock for the whole table. As long as you use only TryAdd/TryGetValue and similar methods that work on single entries, the dictionary behaves as an almost lock-free data structure (reads take no lock at all), with the corresponding performance benefit. OTOH, members that need a view of the whole dictionary, such as the Count property, Keys, Values and ToArray, acquire all the locks at once and can therefore be worse than a synchronized Dictionary, performance-wise.

I'd say, just use ConcurrentDictionary.

Stefan Dragnev
20

I think that the ConcurrentDictionary.GetOrAdd method is exactly what most multi-threaded scenarios need.
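A minimal sketch of how that might look for a user cache like the one in the question. The User and UserCache types are invented for this example; GetOrAdd with a value factory is the real API. One caveat worth knowing: under contention the factory delegate can run more than once for the same key, but only one of the results is stored and returned to all callers.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical types, for illustration only.
public sealed class User
{
    public int Id { get; }
    public User(int id) { Id = id; }
}

public static class UserCache
{
    private static readonly ConcurrentDictionary<int, User> Users =
        new ConcurrentDictionary<int, User>();

    // Returns the existing user for this id, or creates and stores one.
    // The factory only runs when the key is not found, so nothing is
    // allocated on the common "already present" path.
    public static User GetOrCreate(int id)
    {
        return Users.GetOrAdd(id, key => new User(key));
    }
}
```

Every thread calling `UserCache.GetOrCreate(42)` ends up with the same User instance, without any explicit lock in the calling code.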

Konstantin
  • 1
    I would use TryGet also, otherwise you waste the effort in initializing the second parameter to GetOrAdd every time the key already exists. – Jorrit Schippers Dec 10 '13 at 10:21
  • 9
    *GetOrAdd* has the overload *GetOrAdd(TKey, Func<TKey, TValue>)*, so the second parameter can be lazily initialized only when the key does not exist in the dictionary. Besides, *TryGetValue* is for fetching data, not modifying it. With separate calls to each of these methods, another thread may add or remove the key between the two calls. – sgnsajgon Oct 10 '14 at 20:28
  • This should be the marked answer to the question, even though Lasse's post is excellent. – Spivonious Oct 26 '16 at 16:03
14

Have you seen the Reactive Extensions for .NET 3.5 SP1? According to Jon Skeet, they have backported a bundle of the parallel extensions and concurrent data structures to .NET 3.5 SP1.

There is a set of samples for .NET 4 Beta 2, which describes in pretty good detail how to use the parallel extensions.

I've just spent the last week testing the ConcurrentDictionary using 32 threads to perform I/O. It seems to work as advertised, which would indicate a tremendous amount of testing has been put into it.

Edit: .NET 4 ConcurrentDictionary and patterns.

Microsoft have released a PDF called Patterns of Parallel Programming. It's really worth downloading, as it describes in really nice detail the right patterns to use for the .NET 4 concurrent extensions and the anti-patterns to avoid. Here it is.

Richard
scope_creep
6

Basically you want to go with the new ConcurrentDictionary. Right out of the box you have to write less code to make thread safe programs.

ChaosPandion
-2

We used a ConcurrentDictionary for a cached collection that is repopulated every hour and then read by multiple client threads, similar to the solution for the Is this example thread safe? question.

We found that changing it to a ReadOnlyDictionary improved overall performance.
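The answer does not say how the repopulation was done, so the following is only one possible sketch, assuming the whole cache is rebuilt and then swapped in as a new read-only snapshot. HourlyCache, Reload and TryGet are hypothetical names. It relies on the documented behaviour that a Dictionary supports multiple concurrent readers as long as nobody modifies it.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical cache, for illustration only.
public sealed class HourlyCache
{
    // volatile so readers always see the most recently published snapshot.
    private volatile ReadOnlyDictionary<string, string> _snapshot =
        new ReadOnlyDictionary<string, string>(new Dictionary<string, string>());

    // Called by the refresh job (for example once an hour).
    // Copies the data into a private dictionary and swaps the reference;
    // the old snapshot is never mutated, so in-flight readers stay safe.
    public void Reload(IDictionary<string, string> fresh)
    {
        _snapshot = new ReadOnlyDictionary<string, string>(
            new Dictionary<string, string>(fresh));
    }

    // Called by any number of reader threads; no lock is needed.
    public bool TryGet(string key, out string value)
    {
        return _snapshot.TryGetValue(key, out value);
    }
}
```

This only fits the read-mostly case described here; if individual entries must be added or removed between refreshes, ConcurrentDictionary remains the simpler choice.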

Community
Michael Freidgeim