37

Possible Duplicate:
What is the best way to increase number of locks in java

Suppose I want to lock based on an integer id value. In this case, there's a function that pulls a value from a cache and does a fairly expensive retrieve/store into the cache if the value isn't there.

The existing code isn't synchronized and could potentially trigger multiple retrieve/store operations:

//psuedocode
public Page getPage (Integer id){
   Page p = cache.get(id);
   if (p==null)
   {
      p=getFromDataBase(id);
      cache.store(p);
   }
}

What I'd like to do is synchronize the retrieve on the id, e.g.

   if (p==null)
   {
       synchronized (id)
       {
        ..retrieve, store
       }
   }

Unfortunately this won't work because 2 separate calls can have the same Integer id value but a different Integer object, so they won't share the lock, and no synchronization will happen.

Is there a simple way of insuring that you have the same Integer instance? For example, will this work:

 syncrhonized (Integer.valueOf(id.intValue())){

The javadoc for Integer.valueOf() seems to imply that you're likely to get the same instance, but that doesn't look like a guarantee:

Returns a Integer instance representing the specified int value. If a new Integer instance is not required, this method should generally be used in preference to the constructor Integer(int), as this method is likely to yield significantly better space and time performance by caching frequently requested values.

So, any suggestions on how to get an Integer instance that's guaranteed to be the same, other than the more elaborate solutions like keeping a WeakHashMap of Lock objects keyed to the int? (nothing wrong with that, it just seems like there must be an obvious one-liner than I'm missing).

Community
  • 1
  • 1
Steve B.
  • 55,454
  • 12
  • 93
  • 132
  • 4
    This "duplicate" question and Eddie's answer below are better than the "original" question! – arun Oct 01 '14 at 19:29
  • See also http://stackoverflow.com/questions/6616141/java-threads-locking-on-a-specific-object specifically http://stackoverflow.com/a/27806218/32453 – rogerdpack Jan 06 '15 at 19:59
  • What about using [AtomicInteger](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/AtomicInteger.html)? – GregT Jan 04 '18 at 12:14
  • FYI, Brian Goetz mentions that one should never lock on `Integer` in his Devoxx 2018 talk, [at 54:49](https://youtu.be/4r2Wg-TY7gU?t=3289). – Basil Bourque Jan 01 '19 at 05:37

9 Answers9

57

You really don't want to synchronize on an Integer, since you don't have control over what instances are the same and what instances are different. Java just doesn't provide such a facility (unless you're using Integers in a small range) that is dependable across different JVMs. If you really must synchronize on an Integer, then you need to keep a Map or Set of Integer so you can guarantee that you're getting the exact instance you want.

Better would be to create a new object, perhaps stored in a HashMap that is keyed by the Integer, to synchronize on. Something like this:

public Page getPage(Integer id) {
  Page p = cache.get(id);
  if (p == null) {
    synchronized (getCacheSyncObject(id)) {
      p = getFromDataBase(id);
      cache.store(p);
    }
  }
}

private ConcurrentMap<Integer, Integer> locks = new ConcurrentHashMap<Integer, Integer>();

private Object getCacheSyncObject(final Integer id) {
  locks.putIfAbsent(id, id);
  return locks.get(id);
}

To explain this code, it uses ConcurrentMap, which allows use of putIfAbsent. You could do this:

  locks.putIfAbsent(id, new Object());

but then you incur the (small) cost of creating an Object for each access. To avoid that, I just save the Integer itself in the Map. What does this achieve? Why is this any different from just using the Integer itself?

When you do a get() from a Map, the keys are compared with equals() (or at least the method used is the equivalent of using equals()). Two different Integer instances of the same value will be equal to each other. Thus, you can pass any number of different Integer instances of "new Integer(5)" as the parameter to getCacheSyncObject and you will always get back only the very first instance that was passed in that contained that value.

There are reasons why you may not want to synchronize on Integer ... you can get into deadlocks if multiple threads are synchronizing on Integer objects and are thus unwittingly using the same locks when they want to use different locks. You can fix this risk by using the

  locks.putIfAbsent(id, new Object());

version and thus incurring a (very) small cost to each access to the cache. Doing this, you guarantee that this class will be doing its synchronization on an object that no other class will be synchronizing on. Always a Good Thing.

Eddie
  • 53,828
  • 22
  • 125
  • 145
  • An implementer would have to expand this solution to synchronize the locks map and add some sort of reference count or release mechanism. – McDowell Mar 18 '09 at 20:35
  • Yes, if an item is freed from the page cache, then you want to release the lock that was used for that page cache item. – Eddie Mar 18 '09 at 20:37
  • Is there a race inside getCacheSyncObject? Does synchronized take effect on the result of the call or some other object to compute the value of the expression? There's an opening where it might end up synchronizing on 2 different objects. The putIfAbsent from another solution would probably work. – johnny Mar 19 '09 at 07:08
  • @johnny: Yes, I updated my answer to account for this. – Eddie Mar 19 '09 at 15:11
  • 1
    Doesn't this map grow unbounded? Would it make sense to remove the lock object entry from the map inside the end of the synchronized block inside a finally block? – emmby Mar 29 '13 at 23:33
  • If you remove the lock object in a synchronized block, then you lose the ability to be certain you are securely locking across multiple threads. This map only grows unbounded if you don't manage what enters the cache. That's a separate problem though. – Eddie Apr 08 '13 at 17:15
  • I recently implemented a solution like this on more complex object where I synchronize on their hash codes. (The application should not access a resource synchroniously which is mapped by different coordinates.) I use weak references in order to allow the map to be cleared from non-reqired entries and link the keys to themselfs as well. Works like a charm. – Rafael Winterhalter Jun 29 '13 at 20:40
  • I currently have a problem like this in my app, and it seems a good solution for me. Does a monitoring thread which deletes any "old" entry in the map seems good for you ? I'd have to put something like a timestamp to evaluate the "age" of the entry though... – Lebowski Jun 25 '14 at 12:40
  • There is one place that does not look very reliable. Namely after this line `if (p == null) {` another thread might run first and persist `Page` into a database first, so making another call to `cache.store(p);` might not make sense or even cause bugs. – Sergei Ledvanov Dec 29 '15 at 07:50
  • 1
    @SergeiLedvanov: Yes, that's a race that is in the OP's code. Making this code threadsafe would require changing getPage(). One option would be moving `Page p = cache.get(id);` and the `if` inside the `synchronized (getCacheSyncObject(id))` block. You always pay the cost of the synchronized ... but you'll be threadsafe. – Eddie Jan 05 '16 at 23:03
  • Shouldn't `locks` be a static field in this case? – Rajith Gun Hewage Jul 23 '18 at 22:20
  • Is there a reason `getCacheSyncObject` doesn't need to be synchronized? It seems like there could be a race condition, but I guess it wouldn't matter as long as keys were never removed, right? – Addison Jun 04 '19 at 18:43
5

Use a thread-safe map, such as ConcurrentHashMap. This will allow you to manipulate a map safely, but use a different lock to do the real computation. In this way you can have multiple computations running simultaneous with a single map.

Use ConcurrentMap.putIfAbsent, but instead of placing the actual value, use a Future with computationally-light construction instead. Possibly the FutureTask implementation. Run the computation and then get the result, which will thread-safely block until done.

Martin Schneider
  • 3,268
  • 4
  • 19
  • 29
Tom Hawtin - tackline
  • 145,806
  • 30
  • 211
  • 305
4

Integer.valueOf() only returns cached instances for a limited range. You haven't specified your range, but in general, this won't work.

However, I would strongly recommend you not take this approach, even if your values are in the correct range. Since these cached Integer instances are available to any code, you can't fully control the synchronization, which could lead to a deadlock. This is the same problem people have trying to lock on the result of String.intern().

The best lock is a private variable. Since only your code can reference it, you can guarantee that no deadlocks will occur.

By the way, using a WeakHashMap won't work either. If the instance serving as the key is unreferenced, it will be garbage collected. And if it is strongly referenced, you could use it directly.

erickson
  • 265,237
  • 58
  • 395
  • 493
3

Using synchronized on an Integer sounds really wrong by design.

If you need to synchronize each item individually only during retrieve/store you can create a Set and store there the currently locked items. In another words,

// this contains only those IDs that are currently locked, that is, this
// will contain only very few IDs most of the time
Set<Integer> activeIds = ...

Object retrieve(Integer id) {
    // acquire "lock" on item #id
    synchronized(activeIds) {
        while(activeIds.contains(id)) {
            try { 
                activeIds.wait();   
            } catch(InterruptedExcption e){...}
        }
        activeIds.add(id);
    }
    try {

        // do the retrieve here...

        return value;

    } finally {
        // release lock on item #id
        synchronized(activeIds) { 
            activeIds.remove(id); 
            activeIds.notifyAll(); 
        }   
    }   
}

The same goes to the store.

The bottom line is: there is no single line of code that solves this problem exactly the way you need.

Antonio
  • 2,406
  • 1
  • 16
  • 8
  • +1 I essentially like this approach. There's just the potential issue that I mention in my post below that under high contention, waiting threads for ALL IDs are woken up when any one ID becomes available. – Neil Coffey Mar 19 '09 at 00:26
  • This may not be a real problem if you don't have too many threads. But if that's the case you can use Object.notify() and use Object.wait(maxTime) just to make sure no caller gets stuck ever. – Antonio Mar 19 '09 at 02:27
  • 1
    Agree probably not too much of a problem under low contention. Unfortunately, I think the solution you mention could actually be worse...! – Neil Coffey Mar 19 '09 at 04:41
  • Keep in mind that "real while(){wait...} contention" happens only to acquire the lock, not during load/store. This means that if you have 100 threads trying to acquire the lock at the same time then you have contention anyway – Antonio Mar 19 '09 at 14:02
  • Also, the maxTime value is there simply to avoid potential problems. I see a different problem though: the lack of fairness to get the lock. This really may be a problem, not what you mention, IMHO. A general solution (for all scenarios) would require a complete rewrite I admit. – Antonio Mar 19 '09 at 14:05
1

How about a ConcurrentHashMap with the Integer objects as keys?

starblue
  • 55,348
  • 14
  • 97
  • 151
  • Yes, as I mentioned, I could use some sort of map where the objects would be private locks. I was more interested in finding out if there was some way in java of guaranteeing a specific Integer instance. – Steve B. Mar 18 '09 at 20:29
1

You could have a look at this code for creating a mutex from an ID. The code was written for String IDs, but could easily be edited for Integer objects.

McDowell
  • 107,573
  • 31
  • 204
  • 267
1

See section 5.6 in Java Concurrency in Practice: "Building an efficient, scalable, result cache". It deals with the exact issue you are trying to solve. In particular, check out the memoizer pattern.

alt text
(source: umd.edu)

Glorfindel
  • 21,988
  • 13
  • 81
  • 109
Julien Chastang
  • 17,592
  • 12
  • 63
  • 89
1

As you can see from the variety of answers, there are various ways to skin this cat:

  • Goetz et al's approach of keeping a cache of FutureTasks works quite well in situations like this where you're "caching something anyway" so don't mind building up a map of FutureTask objects (and if you did mind the map growing, at least it's easy to make pruning it concurrent)
  • As a general answer to "how to lock on ID", the approach outlined by Antonio has the advantage that it's obvious when the map of locks is added to/removed from.

You may need to watch out for a potential issue with Antonio's implementation, namely that the notifyAll() will wake up threads waiting on all IDs when one of them becomes available, which may not scale very well under high contention. In principle, I think you can fix that by having a Condition object for each currently locked ID, which is then the thing that you await/signal. Of course, if in practice there's rarely more than one ID being waited on at any given time, then this isn't an issue.

Neil Coffey
  • 21,615
  • 7
  • 62
  • 83
1

Steve,

your proposed code has a bunch of problems with synchronization. (Antonio's does as well).

To summarize:

  1. You need to cache an expensive object.
  2. You need to make sure that while one thread is doing the retrieval, another thread does not also attempt to retrieve the same object.
  3. That for n-threads all attempting to get the object only 1 object is ever retrieved and returned.
  4. That for threads requesting different objects that they do not contend with each other.

pseudo code to make this happen (using a ConcurrentHashMap as the cache):

ConcurrentMap<Integer, java.util.concurrent.Future<Page>> cache = new ConcurrentHashMap<Integer, java.util.concurrent.Future<Page>>;

public Page getPage(Integer id) {
    Future<Page> myFuture = new Future<Page>();
    cache.putIfAbsent(id, myFuture);
    Future<Page> actualFuture = cache.get(id);
    if ( actualFuture == myFuture ) {
        // I am the first w00t!
        Page page = getFromDataBase(id);
        myFuture.set(page);
    }
    return actualFuture.get();
}

Note:

  1. java.util.concurrent.Future is an interface
  2. java.util.concurrent.Future does not actually have a set() but look at the existing classes that implement Future to understand how to implement your own Future (Or use FutureTask)
  3. Pushing the actual retrieval to a worker thread will almost certainly be a good idea.
Pat
  • 5,761
  • 5
  • 34
  • 50