19

I have a cache that I implemented using a simple HashMap, like:

HashMap<String,String> cache = new HashMap<String,String>();

This cache is read from most of the time. I have another method that reloads the cache; inside it I basically build a new map and then assign the reference. As I understand it, assignment of an object reference is atomic in Java.

public class MyClass {
    private HashMap<String, String> cache = null;

    public void init() {
        refreshCache();
    }

    // this method can be called occasionally to rebuild the cache
    public void refreshCache() {
        HashMap<String, String> newcache = new HashMap<String, String>();
        // code to fill up the new cache
        // and then finally
        cache = newcache; // swap in the new cache with a single (atomic) reference assignment
    }
}

I understand that if I do not declare cache as volatile, other threads may not see the change, but that is not time-critical for my use case; they can continue to work with the old cache for an extended time.

Do you see any threading issue? Consider that many threads are reading from the cache and the cache is only reloaded occasionally.

EDIT: My main confusion is whether I need to use AtomicReference here, since the assignment operation itself is atomic.

EDIT: I understand that to get the ordering right, I should mark cache as volatile. But if the refreshCache method is marked as synchronized, do I still have to make cache volatile, or will the synchronized block take care of ordering as well as visibility?
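To make the intent concrete, here is a minimal sketch of the variant I am asking about, keeping the field volatile and making the refresh synchronized (the get method is just for illustration):

import java.util.HashMap;
import java.util.Map;

public class MyClass {
    // volatile: reader threads see the newly assigned map promptly
    private volatile Map<String, String> cache = new HashMap<String, String>();

    public void init() {
        refreshCache();
    }

    public String get(String key) {
        // readers only dereference the field; they never mutate the map
        return cache.get(key);
    }

    // synchronized: only one writer rebuilds the cache at a time;
    // the volatile write still provides the happens-before edge for readers
    public synchronized void refreshCache() {
        Map<String, String> newcache = new HashMap<String, String>();
        // code to fill up the new cache
        cache = newcache; // atomic reference assignment
    }
}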

Shamik
  • For simple caching purposes, I typically use a *ConcurrentHashMap* and do not worry about locking. I usually don't care if a (referentially transparent) computation happens twice (say, because thread B starts to compute the same value that thread A is currently computing and hence hasn't put in the cache yet). In your case I'd also make *cache* *volatile*. But I don't see the point in "refreshing the cache": make your cache an LRU/MRU and simply add new cached values to it instead of "resetting" it (see the sketch after these comments). – SyntaxT3rr0r Mar 15 '11 at 05:32
  • At least refreshing the cache using a new map allows for use of a non-blocking cache map implementation – ThomasRS Apr 19 '12 at 11:31
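
A minimal sketch of the ConcurrentHashMap idea suggested in the comment above; the loadValue method is a placeholder for whatever computes the value, and the computation may indeed run more than once for the same key:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class LookupCache {
    private final ConcurrentMap<String, String> cache =
            new ConcurrentHashMap<String, String>();

    public String get(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = loadValue(key);                     // may run in several threads at once
            String previous = cache.putIfAbsent(key, value);
            if (previous != null) {
                value = previous;                       // another thread won the race; use its value
            }
        }
        return value;
    }

    private String loadValue(String key) {
        return "computed-" + key;                       // placeholder computation
    }
}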

4 Answers

29

It is not safe without a proper memory barrier.

One would think that the assignment of cache (cache = newcache) happens after the steps that populate the new map. However, other threads may observe a reordering of these statements, so the assignment may appear to happen before the map has been populated. Thus it is possible to grab the new cache before it is fully populated, or, even worse, to see a ConcurrentModificationException.

You need to enforce the happens-before relationship to prevent this reordering, and declaring the cache as volatile would achieve that.
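
Concretely, with the field from the question, the fix is a one-word change (sketch):

// the volatile write in refreshCache() now happens-before every
// subsequent read of the field by the reader threads
private volatile HashMap<String, String> cache = null;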

sjlee
  • This is great advice. I had this question in mind and thanks for clearing it up. If the ordering gets modified then it's a disaster for sure. – Shamik Mar 15 '11 at 15:19
  • What happens if I make refreshCache a synchronized method? – Shamik Mar 15 '11 at 19:26
  • You may well want to make it synchronized if you want to ensure only one thread gets to change the data at a given time. However, I'm assuming reader threads will not participate in the locking (they go through a different method to read the value?), in which case synchronization alone will not give you the protection. Making the cache field volatile is an efficient way to achieve the happens-before relationship. – sjlee Mar 15 '11 at 19:37
  • Yes, reader threads do not participate in the locking and they go through different methods to read the value. I had to synchronize it with respect to the other writer threads, but you are right, I still need volatile to protect the ordering. – Shamik Mar 15 '11 at 19:52
  • If you use volatile for the cache field, the reader threads do not need to be synchronized, which is a good thing in this case. – sjlee Mar 15 '11 at 20:06
  • Good answer, but I just want to add some required reading: http://www.cs.umd.edu/~pugh/java/memoryModel/ – searchengine27 Feb 18 '15 at 16:59
  • @searchengine27: thank you for the interesting link. It contains a lot of useful material, perhaps too much for many readers, so I want to highlight just the one page that explains why volatile is needed: http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html Since that FAQ is written by Jeremy Manson and Brian Goetz, there is no better reference. – Pino Sep 07 '16 at 08:38
4

You should mark the cache as volatile.

While you note that other threads may continue using a stale cache for "a long time", keep in mind that, without a synchronization edge, they are likely to continue using the stale cache forever. That's probably not the desired behavior.

In order of preference (mostly due to readability):

  • Update the field in a synchronized method
  • Use AtomicReference<Map<>> (see the sketch after this list)
  • Use volatile
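
A rough sketch of the AtomicReference option, assuming readers go through a get method and a single refresh replaces the whole map (names are illustrative):

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

public class AtomicCache {
    private final AtomicReference<Map<String, String>> cache =
            new AtomicReference<Map<String, String>>(Collections.<String, String>emptyMap());

    public String get(String key) {
        return cache.get().get(key);
    }

    public void refreshCache() {
        Map<String, String> newCache = new HashMap<String, String>();
        // fill up the new map here
        cache.set(newCache); // same memory effects as a volatile write
    }
}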

See also this question.

andersoj
0

What about the CopyOnWrite... collections:

java.util.concurrent.CopyOnWriteArrayList
java.util.concurrent.CopyOnWriteArraySet
and org.apache.mina.util.CopyOnWriteMap

They can be a good match in your case and they are thread-safe.
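
I am not sure of the exact API of the MINA class, but just to illustrate the copy-on-write idea applied to a map, here is a hand-rolled sketch: every write copies the whole map and swaps a volatile reference, so reads stay lock-free:

import java.util.HashMap;
import java.util.Map;

public class CopyOnWriteCache {
    private volatile Map<String, String> snapshot = new HashMap<String, String>();

    public String get(String key) {
        return snapshot.get(key);       // lock-free read of the current snapshot
    }

    public synchronized void put(String key, String value) {
        Map<String, String> copy = new HashMap<String, String>(snapshot);
        copy.put(key, value);
        snapshot = copy;                // volatile write publishes the new snapshot
    }
}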

ruruskyi
-1

Seems to be OK. Just be sure refreshCache is not called too frequently, or mark it as synchronized.

David Oliván