my ideal cache using guava

Question

Off and on for the past few weeks I've been trying to find my ideal cache implementation using guava's MapMaker. See my previous two questions here and here to follow my thought process.

Taking what I've learned, my next attempt is going to ditch soft values in favor of maximumSize and expireAfterAccess:

ConcurrentMap<String, MyObject> cache = new MapMaker()
        .maximumSize(MAXIMUM_SIZE)
        .expireAfterAccess(MINUTES_TO_EXPIRY, TimeUnit.MINUTES)
        .makeComputingMap(loadFunction);

where

Function<String, MyObject> loadFunction = new Function<String, MyObject>() {
   @Override
   public MyObject apply(String uidKey) {
      return getFromDataBase(uidKey);
   }
};

However, the one remaining issue I'm still grappling with is that this implementation will evict objects even if they are strongly reachable, once their time is up. This could result in multiple objects with the same UID floating around in the environment, which I don't want (I believe what I'm trying to achieve is known as canonicalization).

So as far as I can tell the only answer is to have an additional map which functions as an interner that I can check to see if a data object is still in memory:

ConcurrentMap<String, MyObject> interner = new MapMaker()
        .weakValues()
        .makeMap();

and the load function would be revised:

Function<String, MyObject> loadFunction = new Function<String, MyObject>() {
   @Override
   public MyObject apply(String uidKey) {
      MyObject dataObject = interner.get(uidKey);
      if (dataObject == null) {
         dataObject = getFromDataBase(uidKey);
         interner.put(uidKey, dataObject);
      }
      return dataObject;
   }
};

However, using two maps instead of one for the cache seems inefficient. Is there a more sophisticated way to approach this? In general, am I going about this the right way, or should I rethink my caching strategy?

A bit late, but you may still be interested in the memory cost of each of the different options for building a cache: http://code-o-matic.blogspot.com/2012/02/updated-memory-cost-per-javaguava.html – Dimitris Andreou, Feb 08 '12 at 19:10
@DimitrisAndreou - Very interesting, thanks for the link! I'm planning to update this post for `Cache`/`CacheBuilder` once I get to play with them more. – Paul Bellora, Feb 08 '12 at 21:09

score 8 · Accepted Answer · answered Jul 25 '11 at 06:39

Whether using two maps is efficient depends on how expensive getFromDataBase() is and how big your objects are. Doing something like this does not seem unreasonable.

As for the implementation, it looks like you can layer your maps in a slightly different way to get the behavior you want and still have good concurrency properties.

Create your first map with weak values, and put the computing function that calls getFromDataBase() on this map.
The second map is the expiring one. It is also a computing map, but its function just gets from the first map.

Do all your access through the second map.

In other words, the expiring map acts to pin a most-recently-used subset of your objects in memory, while the weak-reference map is the real cache.
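The layering described above can be sketched roughly as follows. This is a plain-JDK illustration, not MapMaker code: the class name LayeredCache, the maxPinned parameter, and the use of a size-bounded, access-ordered LinkedHashMap in place of a time-expiring map are all assumptions made to keep the example self-contained. The coarse synchronized get() is also a simplification and does not reflect the per-key concurrency a computing map would give you.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: a weak-valued "real cache" plus a small strong map
// that pins the most recently used values in memory.
final class LayeredCache<K, V> {
    // Layer 1: canonical instances, held only weakly.
    // Note: entries whose referent was collected are not purged in this sketch.
    private final Map<K, WeakReference<V>> canonical = new HashMap<K, WeakReference<V>>();
    // Layer 2: strong references to the most recently used values.
    private final Map<K, V> recent;
    private final Function<K, V> loader;

    LayeredCache(final int maxPinned, Function<K, V> loader) {
        this.loader = loader;
        // Access-ordered LinkedHashMap that drops its eldest entry once full.
        this.recent = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxPinned;
            }
        };
    }

    synchronized V get(K key) {
        V value = recent.get(key);
        if (value == null) {
            // Not pinned: reuse the canonical instance if it is still alive.
            WeakReference<V> ref = canonical.get(key);
            value = (ref == null) ? null : ref.get();
            if (value == null) {
                // Nobody holds the object any more, so loading cannot create a duplicate.
                value = loader.apply(key);
                canonical.put(key, new WeakReference<V>(value));
            }
            recent.put(key, value); // pin it again
        }
        return value;
    }
}
```

With this shape, an entry dropped from the pinned layer is reloaded only if no other code still holds the object; otherwise the same instance is handed back.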

-dg

+1 Thanks, you're right that loadFunction's added logic is doing what the "interner" map could do as a second computing map. I think this is mainly a cosmetic difference, since the "interner" map is only accessed through the expiring one (already handling concurrent requests), but more elegant. Your answer seems to support what I'm doing - I'm gonna see if there are other suggestions before accepting — Paul Bellora, Jul 25 '11 at 23:54

score 0 · Answer 2 · answered Jul 25 '11 at 19:40

I don't understand the full picture here, but two things.

Given this statement: "this implementation will evict objects even if they are strongly reachable, once their time is up. This could result in multiple objects with the same UID floating around in the environment, which I don't want." -- it sounds like you just need to use weakKeys() and NOT use either timed or size-based eviction.
Or if you do want to bring an "interner" into this, I'd use a real Interners.newWeakInterner.
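For reference, the idea behind a weak interner can be sketched with JDK classes alone. This is not Guava's Interners implementation; WeakInterner and its intern() method here are hypothetical stand-ins. The sketch assumes the interned type defines equals() and hashCode() (for MyObject, presumably in terms of its UID).

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical sketch of a weak interner: returns one canonical instance per
// equals()-class, without keeping that instance alive by itself.
final class WeakInterner<E> {
    // Keys are weakly held by WeakHashMap; the value is a weak reference to
    // the same object so the map never holds it strongly.
    private final Map<E, WeakReference<E>> pool = new WeakHashMap<E, WeakReference<E>>();

    synchronized E intern(E sample) {
        WeakReference<E> ref = pool.get(sample);
        E canonical = (ref == null) ? null : ref.get();
        if (canonical == null) {
            // First live instance seen for this value: it becomes canonical.
            pool.put(sample, new WeakReference<E>(sample));
            canonical = sample;
        }
        return canonical;
    }
}
```

Note that intern() needs an instance in hand, so this approach canonicalizes an object after it has been loaded rather than telling you whether a load is needed.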

Kevin Bourrillion

Thanks for the input - I guess I don't fully understand using weakKeys(), especially if the key is String or UUID, etc. Would I then need a wrapper class to act as key? Also, I don't grasp the strategy of using weak keys and ditching time- and size-based eviction. I definitely do want to keep a cached object strongly reachable for at least x minutes after access (or don't I?). My issue is with what happens after that, if the object still happens to be in the clutches of a long running process. It could be I'm missing something about weak keys. – Paul Bellora Jul 26 '11 at 00:03
About the "interner" - I considered using a real Interner, however it only supports the method intern(), and no way to check whether an object is actually in there or not. A weak map gives me the ability to know whether I have the object in memory, or if it needs to be loaded. Maybe I'm using the term too loosely? – Paul Bellora Jul 26 '11 at 00:12
Hey Kevin I'm accepting Darren's answer but would appreciate clarification on weakKeys() if you get the chance. thanks! – Paul Bellora Jul 27 '11 at 14:51
