3

I'm reading the Java Doc about the WeakHashMap and I get the basic concept. Because of the GC thread(s) acting in the background, you can get 'unusual behavior', such as a ConcurrentModificationException when iterating and etc.

The thing I don't get is that if the default implementation is not synchronized and does not contain lock in any way, then how come there is no possibility of getting an inconsistent state. Say you have 2 threads. A GC thread deleting some key at a certain index and at same time and at the same index, a user thread is inserting in the array a key value pair.

To me, if there is no synchronization, then there is a high risk of getting a hash map that is inconsistent.

Even worse, doing something like this might actually be super dangerous because v might actually be null.

if (map.contains(k)) {
   V v = map.get(k)
}

Am I missing something?

Micha Wiedenmann
  • 19,979
  • 21
  • 92
  • 137
jeremie
  • 971
  • 9
  • 19
  • 1
    Possible duplicate of [WeakHashMap iteration and garbage collection](https://stackoverflow.com/questions/2861410/weakhashmap-iteration-and-garbage-collection) – tsolakp Jan 23 '18 at 21:42

4 Answers4

5

The inconsistent state issues you mention do not arise because the GC does not actively restructure WeakHashMaps. When the garbage collector frees the referent of a weak reference, the corresponding entry is not physically removed from the map; the entry merely becomes stale, with no key. At some later point, the entry may be physically removed during some other operation on the map, but the GC won't take on that responsibility.

You can see one Java version's implementation of this design on grepcode.

user2357112
  • 260,549
  • 28
  • 431
  • 505
  • @tsolakp: Whatever thread performs the operation that triggers the removal of the stale entry. For example, if thread A calls size() and size() purges stale entries to get an up-to-date size, then thread A will perform the purge. – user2357112 Jan 23 '18 at 21:56
  • Thanks. That is exactly what I wanted to hear. It is not that GC thread that removes the key but the actual calling thread that does it. Maybe you could update your answer with your last comment. – tsolakp Jan 23 '18 at 22:01
0

What you're describing is what the documentation explicitly states:

Because the garbage collector may discard keys at any time, a WeakHashMap may behave as though an unknown thread is silently removing entries.

The only mistake you're making is the assumption that you can protect the state by synchronizing. That doesn't work because the synchronization would not be mutual on the part of the GC. To quote the documentation:

In particular, even if you synchronize on a WeakHashMap instance and invoke none of its mutator methods, it is possible for the size method to return smaller values over time, for the isEmpty method to return false and then true, for the containsKey method to return true and later false for a given key, for the get method to return a value for a given key but later return null, for the put method to return null and the remove method to return false for a key that previously appeared to be in the map, and for successive examinations of the key set, the value collection, and the entry set to yield successively smaller numbers of elements.

shmosel
  • 49,289
  • 6
  • 73
  • 138
  • I think the question was as to how it is made sure we don't get into inconsistent state when GC thread removes "k1" key and the current thread gets value for "k1" at the same time, all this without synchronization. – tsolakp Jan 23 '18 at 21:50
  • Inconsistent in what sense? – shmosel Jan 23 '18 at 21:51
  • For example still getting value for "k1" after GC has removed it from the Map. – tsolakp Jan 23 '18 at 21:53
  • Why would that happen? – shmosel Jan 23 '18 at 22:02
  • Inconsistent in the sense that if you have conflicts in your map, the GC reclaims k1, a user insert k2, both keys are conflicting, then if there is no synchronization, k2 might very well never be inserted, or inserted but not reachable – jeremie Jan 23 '18 at 22:03
  • Maybe because that removed state wont be visible to calling thread because there was no synchronization involved. – tsolakp Jan 23 '18 at 22:03
  • Then you're really asking about the visibility implications of `WeakReference` in general. I don't know for sure, but I suspect that's handled internally by the GC. – shmosel Jan 23 '18 at 22:07
0

This class is intended primarily for use with key objects whose equals methods test for object identity using the == operator. Once such a key is discarded it can never be recreated, so it is impossible to do a lookup of that key in a WeakHashMap at some later time and be surprised that its entry has been removed.

So if one uses WeakHashMap for objects whose equals() is based on identity check, all is fine. The first case you mentioned ("A GC thread deleting some key at a certain index and at same time and at the same index, a user thread is inserting in the array a key value pair.") is impossible because as long as the user thread keeps a reference to the key object it cannot be discarded by GC.

And the same stands for the second example:

if (map.contains(k)) {
   V v = map.get(k)
}

You keep reference k so the corresponding object is reachable and cannot be discarded.

But

This class will work perfectly well with key objects whose equals methods are not based upon object identity, such as String instances. With such recreatable key objects, however, the automatic removal of WeakHashMap entries whose keys have been discarded may prove to be confusing.

Igor Melnichenko
  • 134
  • 2
  • 13
  • The `k` is not necessary the same reference as the one that was used as key in `put` method. – tsolakp Jan 23 '18 at 21:52
  • Then `map.contains(k)` will return `false` and all will be fine :) – Igor Melnichenko Jan 23 '18 at 21:55
  • For the first case, if you have conflicts in your hash map, then the GC might be trying to delete a key k1, a user inserting a key k2 all happening at the same index. – jeremie Jan 23 '18 at 21:57
  • 1
    We can have two keys `k1` and `k2` that are `k1 != k2` that still can produce `true` for `map.contains(k1)` and `map.contains(k2)` after only `map.put(k1, v)`. – tsolakp Jan 23 '18 at 21:59
  • @tsolakp ah, my bad, I should've added that my answer applies only for objects whose `equals()` is based on identity check. Edited the answer, thank you – Igor Melnichenko Jan 23 '18 at 22:04
0

Referring to

even if you synchronize on a WeakHashMap [...] it is possible for the size method to return smaller values over time

the javadoc sufficiently explains to me that there is a possibility for an inconsistent state and that it is completely independent from synchronization.

A few examples later, the given example is referred to, too:

for the containsKey method to return true and later false for a given key

So basically, one should never rely on the state of a WeakHashMap. but use it as atomic as possible. The given example should therefore be rephrased to

V v = map.get(k);
if(null != v) {
}

or

Optional.ofNullable(map.get(k)).ifPresent(() -> { } );
Izruo
  • 2,246
  • 1
  • 11
  • 23