3

I'm trying to do a thread safe getter of a map value, which also creates the value if it does not exist:

private final static Map<String, MyObject> myMap = new HashMap<String, MyObject>();

public MyObject myGetter(String key) {
    // Try to retrieve the object without synchronization
    MyObject myObject = myMap.get(key);
    if (myObject != null) {
        return myObject;
    }

    // The value does not exist, so create a new object.
    synchronized (myMap) {
        myObject = myMap.get(key);
        if (myObject == null) {
            myObject = new MyObject(key);
            myMap.put(key, myObject);
        }
        return myObject;
    }
}

Note that the myMap member variable is final, and it is not accessed anywhere else.

I do not want the MyObject to be created if not needed (some other patterns I have found suggest a pattern that may result in multiple creations for the same key).

I'm aware of the 'double check' anti-pattern, but I'm not sure if this code is applicable on that anti-pattern since it is not the member variable itself that is created here.

So, my question is if this still is a case of that anti-pattern, and if so: why?

Edit 1:

Judging from comments, one fix here would be to simply accept the 'performance impact' (a minor impact I guess), and just include the read in the synchronized block, like this:

private final static Map<String, MyObject> myMap = new HashMap<String, MyObject>();

public MyObject myGetter(String key) {
    synchronized (myMap) {
        MyObject myObject = myMap.get(key);
        if (myObject == null) {
            // The value does not exist, create a new object.
            myObject = new MyObject(key);
            myMap.put(key, myObject);
        }
        return myObject;
    }
}

Also, I'm on Java 6, so the ConcurrentHashMap.computeIfAbsent is not an option here.

Supun Wijerathne
  • 11,964
  • 10
  • 61
  • 87
User0
  • 564
  • 3
  • 10
  • 1
    You don't synchronize the read operation so you may observe a non null reference pointing to an inconsistent MyObject instance. Is there a reason why you don't simply use a ConcurrentHashMap::computeIfAbsent? – assylias Oct 18 '16 at 10:38
  • Not only is this an instance of the antipattern, the standard fix for it isn't enough to fix this one. If you take time to understand why the original idiom is broken, you'll have your "why" question answered as well. – Marko Topolnik Oct 18 '16 at 10:39
  • if myObject is not null the code will never reach the synchronized block, these lines are unnecessary ? `myObject = myMap.get(key); if (myObject == null) {` – Ömer Erden Oct 18 '16 at 10:41
  • @Bolzano the map may have been populated by another thread between the first and second null check. – assylias Oct 18 '16 at 10:42
  • @assylias Thanks, you are of course right. My bad. Yes, the reason I'm not using ConcurrentHashMap::computeIfAbsent is that I'm on Java 6 (don't ask why...). :) – User0 Oct 18 '16 at 10:42
  • @assylias oh, same key can be used by different threads you are right . – Ömer Erden Oct 18 '16 at 10:44
  • 1
    @User0 What you are doing looks like a multiton pattern - there are a few thread safe examples [here](http://stackoverflow.com/questions/11126866/thread-safe-multitons-in-java) - some of them use Java 8, some of them would work on Java 6. – assylias Oct 18 '16 at 11:01
  • For HashMap there is no way to avoid locking if you want thread safety. Only ConcurrentHashMap would work, but you are locking internally anyway and not saving anything. – Peter Lawrey Oct 18 '16 at 11:08
  • @assylias: it’s worse than having the possibility of encountering an inconsistent `MyObject` instance. The `HashMap` instance may be inconsistent as well during the `get` operation due to a concurrent `put` operation. One possible outcome is the infamous looping forever state while traversing linked `Entry` nodes. – Holger Oct 18 '16 at 13:35
  • @Holger IIRC that could only happen if multiple threads are mutating the map, not if there are multiple reads and one mutator – John Vint Oct 18 '16 at 14:05
  • @John Vint: I don’t know whether this practically observed behavior was limited to such scenarios, but the issue remains, there are no guarantees regarding the behavior of `Map.get` if there can be concurrent `put` operations. – Holger Oct 18 '16 at 14:20
  • Agreed, I was only commenting on the worst case scenario of an infinite loop. – John Vint Oct 18 '16 at 14:22
  • @John Vint: I wouldn’t call the infinite loop the worst case. The worst case is continuing the work with inconsistent data, silently producing wrong follow-up values, unnoticed for a long time. – Holger Oct 18 '16 at 14:24
  • @Holger I guess we can disagree on that then. – John Vint Oct 18 '16 at 14:27

2 Answers2

3

As the comments and your edit indicates, this is a form of the double-checked-locking anti-pattern.

The correct implementation is to use the ConcurrentHashMap. You listed in your comments that you are using Java 6 and cannot then use computeIfAbsent. That's fine, you should still use the ConcurrentHashMap#putIfAbsent.

You can do a double-check like operation

private final static ConcurrentMap<String, MyObject> myMap =
       new ConcurrentHashMap<String, MyObject>();

public MyObject myGetter(String key) {
   MyObject ref = myMap.get(key);
   if(ref == null) {
      ref = new MyObject(key);
      MyOject put = myMap.putIfAbsent(key, ref);
      if(put != null) {
         // some other thread won put first
         ref = put;
      }
   }
   return ref;
}

In this case you are only locking on mutation since CHM holds the property of non-blocking reads.

Otherwise, the edit you supplied is a thread-safe (albeit inefficient) implementation.

Gray
  • 115,027
  • 24
  • 293
  • 354
John Vint
  • 39,695
  • 7
  • 78
  • 108
  • 2
    The caveat being that you may create more than one instance of MyObject for a given key (which may not be an issue). I suggest a different lock-free approach to prevent that: http://stackoverflow.com/a/18149547/829571 – assylias Oct 18 '16 at 15:45
  • 1
    @assylias, I do like that. – John Vint Oct 18 '16 at 15:47
3

I do not want the MyObject to be created if not needed (some other patterns I have found suggest a pattern that may result in multiple creations for the same key).

This is a very common pattern but unfortunately your code suffers race conditions because of it. Yes, it is an example of double-check locking.

You code is doing approximately:

// unsynchronized read of the map
// if value exists then return

// synchronize on map
// add value to map
// leave synchronize block

The problem with this is that the unsynchronized read of a shared map must also be properly memory synchronized. If thread #1 alters the map there is no guarantee that thread #2 would see those updates. Even worse is the fact that you might get partial memory updates which might corrupt the map in thread #2's memory causing infinite loops or other undefined behavior.

As @JohnVint and others have recommended the ConcurrentHashMap is the right thing to do when trying to share a map between multiple threads. It takes care of the memory synchronization (and locking where necessary) for you in the most efficient manner.

Gray
  • 115,027
  • 24
  • 293
  • 354