3

I understand the overall concepts of multi-threading and synchronization but am new to writing thread-safe code. I currently have the following code snippet:

synchronized(compiledStylesheets) {
    if(compiledStylesheets.containsKey(xslt)) {
        exec = compiledStylesheets.get(xslt);
    } else {
        exec = compile(s, imports);
        compiledStylesheets.put(xslt, exec);
    }
}

where compiledStylesheets is a HashMap (private, final). I have a few questions.

The compile method can take a few hundred milliseconds to return. This seems like a long time to have the object locked, but I don't see an alternative. Also, it is unnecessary to use Collections.synchronizedMap in addition to the synchronized block, correct? This is the only code that hits this object other than initialization/instantiation.

Alternatively, I know of the existence of a ConcurrentHashMap but I don't know if that's overkill. The putIfAbsent() method will not be usable in this instance because it doesn't allow me to skip the compile() method call. I also don't know if it will solve the "modified after containsKey() but before put()" problem, or if that's even really a concern in this case.

Edit: Spelling

Alexey Malev
  • 6,408
  • 4
  • 34
  • 52
Derek
  • 1,196
  • 11
  • 32

5 Answers5

3

I think you are looking for a Multiton.

There's a very good Java one here that @assylas posted some time ago.

Community
  • 1
  • 1
OldCurmudgeon
  • 64,482
  • 16
  • 119
  • 213
  • The linked answer uses ConcurrentHashMap in two different ways, depending on whether JDK8 is being used. Both appear very clean, high-performing, and race-condition-free. – Peter G Apr 21 '14 at 18:53
  • Yes, this is the solution I came up with, but I used a `CountDownLatch` instead of a `FutureTask`. Something like this is the way to go. – erickson Apr 21 '14 at 19:39
  • @erickson - I like the elegance of the `FutureTask`. – OldCurmudgeon Apr 21 '14 at 20:03
  • Yes, `FutureTask` is good, I had just forgotten that they did what I needed. – erickson Apr 21 '14 at 20:29
3

For tasks of this nature, I highly recommend Guava caching support.

If you can't use that library, here is a compact implementation of a Multiton. Use of the FutureTask was a tip from assylias, here, via OldCurmudgeon.

public abstract class Cache<K, V>
{

  private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();

  public final V get(K key)
    throws InterruptedException, ExecutionException
  {
    Future<V> ref = cache.get(key);
    if (ref == null) {
      FutureTask<V> task = new FutureTask<>(new Factory(key));
      ref = cache.putIfAbsent(key, task);
      if (ref == null) {
        task.run();
        ref = task;
      }
    }
    return ref.get();
  }

  protected abstract V create(K key)
    throws Exception;

  private final class Factory
    implements Callable<V>
  {

    private final K key;

    Factory(K key)
    {
      this.key = key;
    }

    @Override
    public V call()
      throws Exception
    {
      return create(key);
    }

  }

}
Community
  • 1
  • 1
erickson
  • 265,237
  • 58
  • 395
  • 493
2

You can loosen the lock at the risk of an occasional doubly compiled stylesheet in race condition.

Object y;

// lock here if needed
y = map.get(x);
if(y == null) {
    y = compileNewY();

    // lock here if needed
    map.put(x, y); // this may happen twice, if put is t.s. one will be ignored
    y = map.get(x); // essential because other thread's y may have been put
}

This requires get and put to be atomic, which is true in the case of ConcurrentHashMap and you can achieve by wrapping individual calls to get and put with a lock in your class. (As I tried to explain with "lock here if needed" comments - the point being you only need to wrap individual calls, not have one big lock).

This is a standard thread safe pattern to use even with ConcurrentHashMap (and putIfAbsent) to minimize the cost of compiling twice. It still needs to be acceptable to compile twice sometimes, but it should be okay even if expensive.

By the way, you can solve that problem. Usually the above pattern isn't used with a heavy function like compileNewY but a lightweight constructor new Y(). e.g. do this:

class PrecompiledY {
    public volatile Y y;
    private final AtomicBoolean compiled = new AtomicBoolean(false);
    public void compile() {
        if(!compiled.getAndSet(true)) {
            y = compile();
        }
    }
}
// ...
ConcurrentMap<X, PrecompiledY> myMap; // alternatively use proper locking

py = map.get(x);
if(py == null) {
    py = new PrecompiledY(); // much cheaper than compiling

    map.put(x, y); // this may happen twice, if put is t.s. one will be ignored
    y = map.get(x); // essential because other thread's y may have been put
    y.compile(); // object that didn't get inserted never gets compiled
}

Also:

Alternatively, I know of the existence of a ConcurrentHashMap but I don't know if that's overkill.

Given that your code is heavily locking, ConcurrentHashMap is almost certainly far faster, so not overkill. (And much more likely to be bug-free. Concurrency bugs are not fun to fix.)

djechlin
  • 59,258
  • 35
  • 162
  • 290
  • I am unable to understand parts of this. Can you please clarify "lock here if needed" comments? When are locks needed? Also, does this code not allow compile by multiple threads? That would be resource-waste, no? – Miserable Variable Apr 21 '14 at 17:59
  • @MiserableVariable As long as `get` and `put` are threadsafe, this algorithm is threadsafe. It does not need any locking across accesses to `map`. – djechlin Apr 21 '14 at 18:05
  • @MiserableVariable see my edit for how to solve the problem of wasting a compile. – djechlin Apr 21 '14 at 18:09
  • 1
    This will break if the map is not a `ConcurrentMap` (or something that uses memory barriers in a similar way). – erickson Apr 21 '14 at 18:14
  • @erickson I tried to explain poorly in the answer and in the comments that the algo only assumes `put` and `get` are atomic, which can be achieved by wrapping them in locks or using a common `ConcurrentMap` implementation. Is there another problem? – djechlin Apr 21 '14 at 18:16
  • I am talking about the second example. It doesn't explain that `map` must be a `ConcurrentMap`. – erickson Apr 21 '14 at 18:56
1

Please see Erickson's comment below. Using double-checked locking with Hashmaps is not very smart

The compile method can take a few hundred milliseconds to return. This seems like a long time to have the object locked, but I don't see an alternative.

You can use double-checked locking, and note that you don't need any lock before get since you never remove anything from the map.

if(compiledStylesheets.containsKey(xslt)) {
    exec = compiledStylesheets.get(xslt);
} else {
    synchronized(compiledStylesheets) {
        if(compiledStylesheets.containsKey(xslt)) {
            // another thread might have created it while
            // this thread was waiting for lock
            exec = compiledStylesheets.get(xslt);
        } else {
            exec = compile(s, imports);
            compiledStylesheets.put(xslt, exec);
        }
    }
}

}

Also, it is unnecessary to use Collections.synchronizedMap in addition to the synchronized block, correct?

Correct

This is the only code that hits this object other than initialization/instantiation.

Community
  • 1
  • 1
Miserable Variable
  • 28,432
  • 15
  • 72
  • 133
  • I was hesitant to call containsKey twice but I suppose it's pretty fast on a HashMap... – Derek Apr 21 '14 at 17:53
  • 2
    This is bad. You can't extend the thread-safe double-checked locking idiom to a `HashMap`. Simply invoking `get()` on a `HashMap` that might be modified asynchronously can cause bugs when modifications due to rehashing become visible non-atomically or out of order. – erickson Apr 21 '14 at 18:04
  • It is probably better to call it twice than to lock it; this is a write-rarely,read-frequently scenario. – Miserable Variable Apr 21 '14 at 18:05
  • @erickson I had not considered rehashing. Now I am (again) wondering about the value of double-checked locking, having myself used it mostly for managing such maps. Do you know if the unlocked `get` can throw exception or return wrong results? Since the second `get` is guarded, I am wondering if it is safe (even if ugly) to ignore the error in the first call. – Miserable Variable Apr 21 '14 at 18:15
  • 1
    Whether the unguarded `get()` can throw exceptions is implementation-specific, and the conditions that lead to problems have varied over the years. In some versions, it was reported that an infinite loop could occur, so you wouldn't even get an exception to ignore. There are better solutions that are efficient and guaranteed to be correct by the Java memory model. – erickson Apr 21 '14 at 19:47
0

First of all, the code as you posted it is race-condition-free because containsKey() result will never change while compile() method is running.

Collections.synchronizedMap() is useless for your case as stated above because it wraps all map methods into a synchronized block using either this as a mutex or another object you provided (for two-argument version).

IMO using ConcurrentHashMap is also not an option because it stripes locks based on key hashCode() result; its concurrent iterators is also useless here.

If you really want compile() out of synchronized block, you may pre-calculate if before checking containsKey(). This may draw the overall performance back, but may be better than calling it in synchronized block. To make a decision, personally I would consider how often key "miss" is happening and so, which option is preferrable - keep the lock for longer times or calculate your stuff always.

Alexey Malev
  • 6,408
  • 4
  • 34
  • 52