
I wonder whether it is thread-safe to use a plain HashSet in Java with concurrent writes from more than one thread, if we can guarantee that each thread only ever add()s/remove()s a value unique to it.

For example, each thread only add()s/remove()s its own id to/from the HashSet; a minimal sketch of what I mean is below.
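Roughly, the shared structure and the per-thread calls would look like this (the class and method names are invented purely to illustrate the scenario):

```java
import java.util.HashSet;
import java.util.Set;

public class ThreadRegistry {

    // Plain, non-synchronized HashSet shared between threads.
    private final Set<Long> activeIds = new HashSet<>();

    // Each thread only ever touches its own id.
    public void register() {
        activeIds.add(Thread.currentThread().getId());
    }

    public void unregister() {
        activeIds.remove(Thread.currentThread().getId());
    }
}
```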

I have read that a HashSet is not thread-safe whenever there are concurrent writes, but I do not understand why that holds in my example. Each thread only add()s/remove()s a unique value (such as its own thread id), which will never be added or removed by another thread, so there are no dependencies between the threads' insertions and deletions. Why is that not thread-safe?

Finally, is there a more intelligent way to implement this than creating a ConcurrentHashMap whose keys map to empty, dummy values? (A sketch of the workaround I mean is below.)
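For reference, the workaround I am referring to, a ConcurrentHashMap whose values carry no information, would look roughly like this (again, the names are invented for illustration):

```java
import java.util.concurrent.ConcurrentHashMap;

public class ThreadRegistryWithMap {

    private final ConcurrentHashMap<Long, Boolean> activeIds = new ConcurrentHashMap<>();

    public void register() {
        activeIds.put(Thread.currentThread().getId(), Boolean.TRUE); // value is a dummy
    }

    public void unregister() {
        activeIds.remove(Thread.currentThread().getId());
    }
}
```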

EDIT: I have read this post, but my question is still not answered: I do not understand why a HashSet can be affected by concurrent writes of different values.

  • Maybe you can use the info [here](http://stackoverflow.com/questions/6992608/why-there-is-no-concurrenthashset-against-concurrenthashmap). – dquijada Apr 15 '16 at 08:28
  • Please read my question carefully. – dinosaur Apr 15 '16 at 08:34
  • Why? Because [the documentation](https://docs.oracle.com/javase/7/docs/api/java/util/HashSet.html) says so. It might be the case that concurrent writes *don't* cause problems, but there is no guarantee of that behaviour, so it should not be relied upon. – Andy Turner Apr 15 '16 at 08:36
  • This would be clearer if you posted the important parts of the code you are talking about. – Raedwald Apr 15 '16 at 08:39
  • If you do not understand why HashSet is not thread safe, you do not understand the basics of thread safety. Any useful answer to this question would amount to a tutorial. – Raedwald Apr 15 '16 at 08:42
  • I thought the example above explained it well. Any thread only ever executes `set.add(ownId)` or `set.remove(ownId)`. – dinosaur Apr 15 '16 at 08:42
  • See http://stackoverflow.com/questions/29249714/when-is-copyonwritearrayset-useful-to-achieve-thread-safe-hashset – Raedwald Apr 15 '16 at 08:43
  • What is wrong with using a ConcurrentHashMap? – assylias Apr 15 '16 at 08:51
  • I did not say that it is wrong at all. I just don't like the idea of using a `HashMap` of empty values instead of a `HashSet` (because `ConcurrentHashSet` is not provided) – dinosaur Apr 15 '16 at 08:56

1 Answer


HashSet, and the HashMap it is built on, use a hashing mechanism to determine which bucket a new entry goes into. The bucketing step itself is generally not disturbed by parallel access as long as the keys end up in different buckets; note, though, that distinct keys can still hash to the same bucket.
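As a rough illustration of that bucketing step (this sketch mirrors what the OpenJDK 8 HashMap does; the exact spreading function is an implementation detail, not part of any contract):

```java
// Rough illustration of how a table index is derived from a key's hashCode(),
// modelled on OpenJDK 8's HashMap. Not part of the public API contract.
public class BucketIndexSketch {

    static int bucketIndex(Object key, int tableLength) {
        int h = key.hashCode();
        h ^= (h >>> 16);              // fold the high bits into the low bits
        return (tableLength - 1) & h; // works because tableLength is a power of two
    }

    public static void main(String[] args) {
        // Two distinct keys can still land in the same bucket of a small table.
        System.out.println(bucketIndex("a", 16)); // 1
        System.out.println(bucketIndex("q", 16)); // also 1
    }
}
```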

The implementation keeps an eye on how full the table is, because performance takes a significant hit once buckets start getting long. When the number of entries crosses a threshold (the capacity times the load factor), a resize is triggered: a larger table is allocated and every entry is rehashed into its new bucket.

If another thread attempts to write or read while that resize is in progress, the structure can easily end up corrupted (entries lost, readers seeing an inconsistent table, or worse), because it is in the midst of being rearranged. That is exactly why the behaviour is unsafe even when every thread touches only its own value: the resize moves all of the entries, not just the one being added.
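To make that concrete, here is a small, nondeterministic demonstration one could run. Because the behaviour is undefined, it may lose entries, throw, or in the worst case hang, but on most runs it simply reports fewer elements than were added, even though every thread adds values unique to it:

```java
// Many threads add *distinct* values to a plain HashSet; updates are often lost
// during concurrent inserts/resizes. Results vary from run to run.
import java.util.HashSet;
import java.util.Set;

public class LostUpdatesDemo {
    public static void main(String[] args) throws InterruptedException {
        final Set<Integer> set = new HashSet<>();
        final int threads = 8;
        final int perThread = 100_000;

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int offset = t * perThread;          // each thread gets its own value range
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    set.add(offset + i);               // values never overlap between threads
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }

        System.out.println("expected " + (threads * perThread) + ", got " + set.size());
    }
}
```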

If normal synchronisation mechanisms seem too onerous, you may be better off guarding the set with a ReadWriteLock. See here for some usage hints; a minimal sketch follows.
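Something along these lines (the class and method names are invented for illustration):

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockedIdSet {

    private final Set<Long> ids = new HashSet<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    public void add(long id) {
        lock.writeLock().lock();   // exclusive: a resize can never be observed half-done
        try {
            ids.add(id);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public void remove(long id) {
        lock.writeLock().lock();
        try {
            ids.remove(id);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public boolean contains(long id) {
        lock.readLock().lock();    // shared: many readers may check concurrently
        try {
            return ids.contains(id);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Writers take the exclusive lock, so no other thread can observe the set mid-resize, while lookups only share the read lock and do not block each other.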
