4

Which data structure in Java would be best to implement an in-memory object cache, where objects have an individual expiration time?

Basically, for the cache I could use a Map (where the key could be a String), which offers the put and get methods, and use an ordered list of "timestamp"+"object" pairs to manage the expiration times. A cleanup thread could then check the first list entry and delete the object once its expiration time has passed. (Deletion of the first element should be O(1).)
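A minimal sketch of that design using only the JDK, assuming hypothetical class names: `DelayQueue` keeps entries ordered by expiration, `poll()` only returns entries whose deadline has passed, and the head can be inspected and removed cheaply (removal is O(log n) rather than the O(1) hoped for above, but the head check itself is constant time):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a map for lookups plus a queue ordered by expiration.
public class SimpleExpiringCache {
    private final ConcurrentHashMap<String, Object> map = new ConcurrentHashMap<>();
    private final DelayQueue<ExpiringKey> queue = new DelayQueue<>();

    private static final class ExpiringKey implements Delayed {
        final String key;
        final long expiresAtNanos;

        ExpiringKey(String key, long ttlMillis) {
            this.key = key;
            this.expiresAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(ttlMillis);
        }

        public long getDelay(TimeUnit unit) {
            return unit.convert(expiresAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    public void put(String key, Object value, long ttlMillis) {
        map.put(key, value);
        queue.add(new ExpiringKey(key, ttlMillis));
    }

    public Object get(String key) {
        return map.get(key);
    }

    // Non-blocking cleanup: DelayQueue.poll() returns only expired entries.
    public int drainExpired() {
        int removed = 0;
        ExpiringKey expired;
        while ((expired = queue.poll()) != null) {
            map.remove(expired.key);
            removed++;
        }
        return removed;
    }

    // Alternatively, a dedicated cleanup thread can block on take(),
    // which waits until the head entry's expiration time has passed.
    public void cleanupLoop() throws InterruptedException {
        while (true) {
            map.remove(queue.take().key);
        }
    }
}
```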

mjn

6 Answers

11

What you're describing is basically my ExpiringMap. There are other similar implementations, such as Guava's (see CacheBuilder), though I don't believe it supports per-entry expiration as ExpiringMap does.

Jonathan
  • +1 for Guava's CacheBuilder. I consider it the most appropriate suggestion as it is easy to use and lightweight. – Philipp Wendler Mar 15 '15 at 20:13
  • Upvoted because the OP is not asking for a farm of cache servers, but for an in-memory cache structure. Guava `Cache` is the best fit here. – fps Mar 16 '15 at 14:47
  • how can one set expiration per object with guava? – sodik Mar 17 '15 at 09:09
  • @sodik I don't think Guava supports per-entry expiration. [ExpiringMap](https://github.com/jhalterman/expiringmap) does though. Updated the answer to reflect this. – Jonathan Mar 18 '15 at 16:15
6

Caching frameworks are pretty mature now.

However, if you insist on reinventing the wheel, remember to account for memory utilisation. All too often I see a badly implemented cache (HashMap) effectively turn into a memory leak.

See Cowan's answer here: Java's WeakHashMap and caching: Why is it referencing the keys, not the values?
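To illustrate the point from that linked answer, here is a small JDK-only sketch: WeakHashMap references its keys weakly (not its values), so an entry can silently disappear once nothing else strongly references the key. Note that GC timing is never guaranteed, so the final size is not deterministic:

```java
import java.util.Map;
import java.util.WeakHashMap;

public class WeakKeyDemo {
    public static void main(String[] args) {
        Map<Object, String> cache = new WeakHashMap<>();
        Object key = new Object();
        cache.put(key, "payload");
        // While we hold a strong reference to the key, the entry stays put.
        System.out.println(cache.get(key));
        key = null;      // drop the only strong reference to the key
        System.gc();     // a hint only; collection timing is not guaranteed
        // After collection, the entry may have silently vanished from the map.
        System.out.println(cache.size());
    }
}
```

This is why a plain `HashMap` used as a cache leaks: it keeps strong references to both keys and values forever unless entries are explicitly evicted.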

Dan Hardiker
4

Guava's CacheBuilder:

LoadingCache<Key, Graph> graphs = CacheBuilder.newBuilder()
       .maximumSize(10000)
       .expireAfterWrite(10, TimeUnit.MINUTES)
       .removalListener(MY_LISTENER)
       .build(
           new CacheLoader<Key, Graph>() {
             public Graph load(Key key) throws AnyException {
               return createExpensiveGraph(key);
             }
           });

WeakHashMap doesn't fit caching (it holds its keys weakly, not its values), but you can always use a Map<K, WeakReference<V>>, whose values become eligible for GC once only weak references to them remain.
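A sketch of that Map<K, WeakReference<V>> approach (class name is illustrative): `get()` has to handle a cleared reference, since the GC may reclaim a value at any point once no strong references to it remain.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

public class WeakValueCache<K, V> {
    private final Map<K, WeakReference<V>> map = new HashMap<>();

    public void put(K key, V value) {
        map.put(key, new WeakReference<>(value));
    }

    // Returns null both for missing keys and for values the GC has reclaimed.
    public V get(K key) {
        WeakReference<V> ref = map.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            map.remove(key); // clean up the stale entry eagerly
        }
        return value;
    }
}
```

One caveat: the map entries themselves (key plus the cleared `WeakReference`) still accumulate until `get()` happens to touch them, so a production version would also drain a `ReferenceQueue`.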

Beyond that, EhCache, Memcached and Coherence are always available as popular choices.

Om.
4

I would consider using an existing library like ehcache.

However, if you want to write your own, I would not use a background thread unless you need it, as it adds complexity. Instead, I would have the foreground thread remove expired entries.

I would use a LinkedHashMap if you just need an LRU cache. However, if you want timed expiry, I would use a HashMap together with a PriorityQueue ordered by expiry time (so you can check whether the next-to-expire entry has expired).
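A rough sketch of that HashMap-plus-PriorityQueue idea (names are illustrative, not from the answer): every call into the cache first evicts entries whose deadline has passed, so no background thread is needed. A per-key deadline map guards against a stale queue entry evicting a value that was re-put with a later expiry:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

public class TimedExpiryCache<K, V> {
    private static final class Entry<K> {
        final K key;
        final long expiresAtMillis;

        Entry(K key, long expiresAtMillis) {
            this.key = key;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, V> map = new HashMap<>();
    private final Map<K, Long> deadlines = new HashMap<>();
    private final PriorityQueue<Entry<K>> queue =
            new PriorityQueue<>((a, b) -> Long.compare(a.expiresAtMillis, b.expiresAtMillis));

    public void put(K key, V value, long ttlMillis) {
        expireEntries();
        long deadline = System.currentTimeMillis() + ttlMillis;
        map.put(key, value);
        deadlines.put(key, deadline);
        queue.add(new Entry<>(key, deadline));
    }

    public V get(K key) {
        expireEntries();
        return map.get(key);
    }

    // Peek at the soonest-expiring queue entry and evict while it is overdue.
    private void expireEntries() {
        long now = System.currentTimeMillis();
        while (!queue.isEmpty() && queue.peek().expiresAtMillis <= now) {
            Entry<K> head = queue.poll();
            Long current = deadlines.get(head.key);
            // Evict only if this queue entry still matches the key's
            // current deadline; a re-put key may have a later one.
            if (current != null && current <= now) {
                map.remove(head.key);
                deadlines.remove(head.key);
            }
        }
    }
}
```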

Peter Lawrey
  • Indeed, `LinkedHashMap` is an excellent choice. But one thing to clarify: to achieve removal of the expired entries, you would combine `LinkedHashMap` with some thread that does this job, correct? If yes, don't you think that it will slightly decrease the performance of such a cache? And secondly: why a foreground thread instead of a background one? – G. Demecki Dec 16 '14 at 09:34
  • @GrzesiekD. You can remove expired items when you access the map instead of using a background thread. – Peter Lawrey Dec 16 '14 at 17:03
1

As mentioned in previous answers, it is better to use one of the popular in-memory caches like EhCache, Memcached, etc.

But since you wanted to implement your own cache with per-object expiry and low time complexity, I tried to implement it like this (any test reviews/suggestions are much appreciated):

import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class ObjectCache<K, V> {

    private volatile boolean shutdown;
    private final long maxObjects;
    private final long timeToLive;
    private final long removalThreadRunDelay;
    private final long objectsToRemovePerRemovalThreadRun;
    private final AtomicLong objectsCount;
    private final Map<K, CacheEntryWrapper> cachedDataStore;
    private final BlockingQueue<CacheEntryReference> queue;
    private final Object lock = new Object();
    // volatile is required for the double-checked locking in put() to be safe
    private volatile ScheduledExecutorService executorService;

    public ObjectCache(long maxObjects, long timeToLive, long removalThreadRunDelay, long objectsToRemovePerRemovalThreadRun) {
        this.maxObjects = maxObjects;
        this.timeToLive = timeToLive;
        this.removalThreadRunDelay = removalThreadRunDelay;
        this.objectsToRemovePerRemovalThreadRun = objectsToRemovePerRemovalThreadRun;
        this.objectsCount = new AtomicLong(0);
        this.cachedDataStore = new HashMap<K, CacheEntryWrapper>();
        this.queue = new LinkedBlockingQueue<CacheEntryReference>();
    }

    public void put(K key, V value) {
        if (key == null || value == null) {
            throw new IllegalArgumentException("Key and Value both should be not null");
        }
        if (objectsCount.get() + 1 > maxObjects) {
            throw new RuntimeException("Max objects limit reached. Can not store more objects in cache.");
        }
        // create a value wrapper and add it to data store map
        CacheEntryWrapper entryWrapper = new CacheEntryWrapper(key, value);
        synchronized (lock) {
            cachedDataStore.put(key, entryWrapper);
        }
        // add the cache entry reference to queue which will be used by removal thread
        queue.add(entryWrapper.getCacheEntryReference());
        objectsCount.incrementAndGet();
        // start the removal thread if not started already
        if (executorService == null) {
            synchronized (lock) {
                if (executorService == null) {
                    executorService = Executors.newSingleThreadScheduledExecutor();
                    executorService.scheduleWithFixedDelay(new CacheEntryRemover(), 0, removalThreadRunDelay, TimeUnit.MILLISECONDS);
                }
            }
        }
    }

    public V get(K key) {
        if (key == null) {
            throw new IllegalArgumentException("Key can not be null");
        }
        CacheEntryWrapper entryWrapper;
        synchronized (lock) {
            entryWrapper = cachedDataStore.get(key);
            if (entryWrapper != null) {
                // reset the last access time
                entryWrapper.resetLastAccessedTime();
                // reset the reference (so the weak reference is cleared)
                entryWrapper.resetCacheEntryReference();
                // add the new reference to queue
                queue.add(entryWrapper.getCacheEntryReference());
            }
        }
        return entryWrapper == null ? null : entryWrapper.getValue();
    }

    public void remove(K key) {
        if (key == null) {
            throw new IllegalArgumentException("Key can not be null");
        }
        CacheEntryWrapper entryWrapper;
        synchronized (lock) {
            entryWrapper = cachedDataStore.remove(key);
            if (entryWrapper != null) {
                // reset the reference (so the weak reference is cleared)
                entryWrapper.resetCacheEntryReference();
                // decrement only when an entry was actually removed
                objectsCount.decrementAndGet();
            }
        }
    }

    public void shutdown() {
        shutdown = true;
        if (executorService != null) { // may never have been started
            executorService.shutdown();
        }
        queue.clear();
        cachedDataStore.clear();
    }

    public static void main(String[] args) throws Exception {
        ObjectCache<Long, Long> cache = new ObjectCache<>(1000000, 60000, 1000, 1000);
        long i = 0;
        while (i++ < 10000) {
            cache.put(i, i);
        }
        i = 0;
        while(i++ < 100) {
            Thread.sleep(1000);
            System.out.println("Data store size: " + cache.cachedDataStore.size() + ", queue size: " + cache.queue.size());
        }
        cache.shutdown();
    }

    private class CacheEntryRemover implements Runnable {
        public void run() {
            if (!shutdown) {
                try {
                    int count = 0;
                    CacheEntryReference entryReference;
                    while ((entryReference = queue.peek()) != null && count++ < objectsToRemovePerRemovalThreadRun) {
                        long currentTime = System.currentTimeMillis();
                        CacheEntryWrapper cacheEntryWrapper = entryReference.getWeakReference().get();
                        if (cacheEntryWrapper == null || !cachedDataStore.containsKey(cacheEntryWrapper.getKey())) {
                            queue.poll(100, TimeUnit.MILLISECONDS); // remove the reference object from queue as value is removed from cache
                        } else if (currentTime - cacheEntryWrapper.getLastAccessedTime().get() > timeToLive) {
                            synchronized (lock) {
                                // get the cacheEntryWrapper again just to find if put() has overridden the same key or remove() has removed it already
                                CacheEntryWrapper newCacheEntryWrapper = cachedDataStore.get(cacheEntryWrapper.getKey());
                                // poll the queue if -
                                // case 1 - value is removed from cache
                                // case 2 - value is overridden by new value
                                // case 3 - value is still in cache but it is old now
                                if (newCacheEntryWrapper == null || newCacheEntryWrapper != cacheEntryWrapper || currentTime - cacheEntryWrapper.getLastAccessedTime().get() > timeToLive) {
                                    queue.poll(100, TimeUnit.MILLISECONDS);
                                    newCacheEntryWrapper = newCacheEntryWrapper == null ? cacheEntryWrapper : newCacheEntryWrapper;
                                    if (currentTime - newCacheEntryWrapper.getLastAccessedTime().get() > timeToLive) {
                                        remove(newCacheEntryWrapper.getKey());
                                    }
                                } else {
                                    break; // try next time
                                }
                            }
                        }
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
    }

    private class CacheEntryWrapper {
        private K key;
        private V value;
        private AtomicLong lastAccessedTime;
        private CacheEntryReference cacheEntryReference;

        public CacheEntryWrapper(K key, V value) {
            this.key = key;
            this.value = value;
            this.lastAccessedTime = new AtomicLong(System.currentTimeMillis());
            this.cacheEntryReference = new CacheEntryReference(this);
        }

        public K getKey() {
            return key;
        }

        public V getValue() {
            return value;
        }

        public AtomicLong getLastAccessedTime() {
            return lastAccessedTime;
        }

        public CacheEntryReference getCacheEntryReference() {
            return cacheEntryReference;
        }

        public void resetLastAccessedTime() {
            lastAccessedTime.set(System.currentTimeMillis());
        }

        public void resetCacheEntryReference() {
            cacheEntryReference.clear();
            cacheEntryReference = new CacheEntryReference(this);
        }
    }

    private class CacheEntryReference {
        private WeakReference<CacheEntryWrapper> weakReference;

        public CacheEntryReference(CacheEntryWrapper entryWrapper) {
            this.weakReference = new WeakReference<CacheEntryWrapper>(entryWrapper);
        }

        public WeakReference<CacheEntryWrapper> getWeakReference() {
            return weakReference;
        }

        public void clear() {
            weakReference.clear();
        }
    }
}
hemant1900
1

I think your decision is right. I would use a HashMap, to be exact.

uahakan
  • Poor choice; a `LinkedHashMap` would be much better here, as it has the ability to reduce memory consumption by deleting stale entries. – G. Demecki Dec 16 '14 at 09:37