1

I want to store a collection (hashset) as the value in a concurrent dictionary. The code I have works fine as such when testing in isolation.

The question I have though when I retrieve the collection value from the concurrent dictionary should I use a lock when modifying/reading the collection value to avoid another thread updating the collection after I have read the collection from the current dictionary.

The system is a web app with a high degree of concurrency and there will be many threads attempting to update the collection stored in the concurrent dictionary. So want to avoid to any sort of race conditions in terms of reading/modifying the collection value I store in the concurrent dictionary.

ConcurrentDictionary<Guid, HashSet<string>> dictionary = new ConcurrentDictionary<Guid, HashSet<string>>();

 userSession.Add(newSessionId);
   // userSession.Remove(newSessionId)

vs.

ConcurrentDictionary<Guid, HashSet<string>> dictionary = new ConcurrentDictionary<Guid, HashSet<string>>();

var userSessions = new HashSet<string>();
dictionary.TryGetValue(accountId,out userSessions);

lock(userSession)
{
   userSession.Add(newSessionId);
   // userSession.Remove(newSessionId)
}

user805703
  • 207
  • 6
  • 14
  • `HashSet` is not thread-safe. So yes, you need some form of locking (or see the duplicate for a way to use `ConcurrentDictionary` rather than `HashSet`). – mjwills Oct 15 '19 at 00:07
  • Its not quite a duplicate question as I'm using a thread safe collection for the concurrent dictionary combined with the hashset. In term of locking is it sufficient to lock on the hashset as i have done in the sample code. And in some cases I might need to do lots of modifications (read every value) so is it still ok to lock on the collection or is it better to lock on the statement? Its a pity there's no threadsafe implementation for some similar set like collection structure – user805703 Oct 15 '19 at 01:10
  • How often do you expect that two threads will access concurrently the same `HashSet`? If it's not too often, then you could protect the access to each `HashSet` by using a separate "locker" object for each `HashSet`. The simplest way to implement it could be to use each `HashSet` as its own locker. This approach could be lighter than using nested `ConcurrentDictionary`s. – Theodor Zoulias Oct 15 '19 at 01:18
  • the chances are it will be quite often that two threads (sometime much more) access the concurrent dictionary for the same account to retrieve the hashset. And the operations performed on the hashsets will vary ( from single ops to multiple operations) so thinking locking on the hashset like i have in the sample code might be enough? – user805703 Oct 15 '19 at 01:26
  • 1
    `In term of locking is it sufficient to lock on the hashset as i have done in the sample code` Yes it is (assuming you fix the typo in your code). Note that you will need the same `lock` around **all** usages of the hashset - not just that one. – mjwills Oct 15 '19 at 02:38
  • `Its not quite a duplicate question as I'm using a thread safe collection for the concurrent dictionary combined with the hashset.` The thread safe collection is a red herring. It matters not where the hashset 'came from' (the hashset doesn't magically change its thread-safety by being stored inside a concurrent dictionary). It isn't thread-safe - so you need a lock on it, as per the duplicates. – mjwills Oct 15 '19 at 02:39
  • BTW, for your concurrent dictionary usage, I'd suggest reading up on `GetOrAdd` rather than use `TryGetValue`. Your code as is (even if you fix the `userSession` typo) will not do anything useful if `TryGetValue` returns `false` (i.e. the `HashSet` isn't being added). – mjwills Oct 15 '19 at 02:42
  • ```Note that you will need the same lock around all usages of the hashset - not just that one``` by this do you mean if i lock on the particular hashset i need to use anywhere in the code will suffice? Or do you mean i need to create an explicit dedicated shared lock for accessing the hashset. Note there will be many accounts and each will have its own hashset, but many threads (users) can be updating this hashset in lots of places in the code (slight exaggeration...but i do have that use case) – user805703 Oct 15 '19 at 13:23
  • My concern is that the nature of your question suggests you are not 100% across the finer details of locking, particularly in regards to `HashSet`. As such, although this will seem annoying to you (and I apologise), I am not going to clarify further regarding your most recent comment and instead **strongly** suggest you use `ConcurrentDictionary` rather than `HashSet` (as per the duplicate). Then your comment becomes a moot point, you don't need to use `lock` and everything will 'just work'. – mjwills Oct 15 '19 at 20:33
  • thanks for the help. Not sure i like using a ConcurrentDictionary with a dummy value. But do agree your right its better not to roll my own solution for this. In fact think this library https://github.com/i3arnon/ConcurrentHashSet is exactly what I want so plan to use ```ConcurrentDictionary> ``` that way i'm hoping to avoid all threading concerns when a user uses the the underlying ```ConcurrentHashSet``` across the code base – user805703 Oct 16 '19 at 09:25
  • You can get away with a `ConcurrentHashSet` (or `ConcurrentDictionary` with dummy values) as long as your needs for atomicity are limited to CRUD operations. If you need to do anything more complex as an atomic operation, you have no other option than to lock manually. For example if you want to remove an item from the hashset, but only if a condition is satisfied, you can't ensure the atomicity of this operation with the aforementioned containers. The only guarantee they offer is that their internal state will not be corrupted. They know nothing about the rest of your program's state. – Theodor Zoulias Oct 17 '19 at 23:45

0 Answers0