Suppose we have multiple threads and we partition the possible key set between them (e.g. by key % number of threads), so that no two threads ever use the same key.

Can we safely use `HashMap<K, V>` instead of `ConcurrentHashMap<K, V>`?

IsaacLevon
  • I wouldn't count on it. For a more in-depth discussion, see https://dzone.com/articles/concurrency-and-hashmap, notably this remark: "do not over engineer when Concurreny is involved. Because none of the theories will stand the test of time in a concurrent environment. As a rule of thumb use Thread safe collections wherever concurrency is involved." – Bart Friederichs Nov 22 '18 at 13:01
  • You probably want a `ThreadLocal<HashMap<K, V>>` to have a per-thread map. But this *is* an interesting question nonetheless. – Petr Janeček Nov 22 '18 at 13:02
  • But how is this map going to be read if the threads' caches are never flushed? A reader would only see partial state, wouldn't it? – nyarian Nov 22 '18 at 13:08

3 Answers

No, for several reasons. Some (but not all) would be:

  1. HashMap rehashes your hash, so you won't even know what the actual key hash is.
  2. You'd have to look in HashMap's implementation to even know how many buckets there are.
  3. Resizing would have to copy state from the old array, and that wouldn't be safely published across threads.
  4. Other state, like the map's current size, wouldn't be safely updated.

If you constructed your map with a particular JVM implementation in mind, made sure that the map never resized, and knew you'd never care about that extra state, it might be possible, in the strictest of senses. But for any practical purposes, no.
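
For comparison, here is a minimal sketch of the question's partitioning scheme on top of a `ConcurrentHashMap`, which sidesteps the resizing and size-tracking problems listed above. The class name `PartitionedWriters`, the helper `fill`, and the thread and key counts are made up for illustration; they are not from the question.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PartitionedWriters {
    // Hypothetical helper: worker i writes only the keys where key % nThreads == i.
    static Map<Integer, Integer> fill(int nThreads, int nKeys) {
        Map<Integer, Integer> map = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int key = id; key < nKeys; key += nThreads) {
                    map.put(key, key * 2); // concurrent puts (and resizes) are safe here
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join(); // wait for every worker before reading the result
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> map = fill(4, 10_000);
        System.out.println("entries: " + map.size());
    }
}
```

Swapping in a plain `HashMap` here would compile, but the map would be resized while other threads are writing to it, which is exactly the case points 3 and 4 warn about.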

yshavit

The Javadoc is very clear about how you should use HashMap in multithreaded applications (I didn't even add the emphasis on "must" myself):

If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it *must* be synchronized externally.

It's not beating about the bush here. You can use it safely provided you synchronize appropriately (which basically means ensuring mutually exclusive access to the map across threads).

If you do anything other than what it directly tells you, you are on your own.
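
As a rough sketch of what "synchronized externally" can look like, the example below wraps a `HashMap` with `Collections.synchronizedMap`, so every call goes through one lock. The class name `SynchronizedHashMapDemo`, the helper `fill`, and the `"v" + key` values are assumptions for illustration only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SynchronizedHashMapDemo {
    // Hypothetical helper: several threads write disjoint keys into one locked HashMap.
    static Map<Integer, String> fill(int nThreads, int nKeys) {
        // Every method call on the wrapper is guarded by a single mutex.
        Map<Integer, String> map = Collections.synchronizedMap(new HashMap<>());
        Thread[] workers = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int key = id; key < nKeys; key += nThreads) {
                    map.put(key, "v" + key);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println("entries: " + fill(4, 1_000).size());
    }
}
```

Note that iterating over the wrapped map still requires holding its lock manually (`synchronized (map) { ... }`), and a single lock serializes all writers, so this trades throughput for safety.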

Andy Turner

Maybe it will work if you pre-populate the map with dummy values for every key before starting the threads. But:

  • it may depend on the implementation
  • you always run the risk of someone later modifying the code to read or write data that belongs to a different thread.
  • if a new key arrives that wasn't pre-populated, you risk undefined behavior.

A better option is to use a local map for each thread and merge the local maps at the end to collect the results.
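
A minimal sketch of that per-thread approach, assuming the work can be expressed as tasks that each return their own map. The class name `LocalMapsThenMerge`, the helper `compute`, and the `key * key` values are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LocalMapsThenMerge {
    // Hypothetical helper: each task fills a private HashMap; the caller merges them.
    static Map<Integer, Integer> compute(int nThreads, int nKeys) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<Map<Integer, Integer>>> parts = new ArrayList<>();
            for (int t = 0; t < nThreads; t++) {
                final int id = t;
                parts.add(pool.submit(() -> {
                    // Only this task ever touches 'local', so a plain HashMap is fine.
                    Map<Integer, Integer> local = new HashMap<>();
                    for (int key = id; key < nKeys; key += nThreads) {
                        local.put(key, key * key);
                    }
                    return local;
                }));
            }
            Map<Integer, Integer> merged = new HashMap<>();
            for (Future<Map<Integer, Integer>> part : parts) {
                merged.putAll(part.get()); // get() waits for the task and publishes its map
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("entries: " + compute(4, 1_000).size());
    }
}
```

The merge runs on a single thread after each task has finished, so no map is ever shared while it is being modified.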

0x26res