13

I have an application that wants to keep open many files: periodically it receives a client request saying "add some data to file X", and it would be ideal to have that file already opened, and the file's header section already parsed, so that this write is fast. However, keeping open this many files is not very nice to the operating system, and could become impossible if our data-storage needs grow.

So I would like a "give me this file handle, opening if it's not cached" function, and some process for automatically closing files which have not been written to for, say, five minutes. For the specific case of caching file handles which are written to in short spurts, this is probably enough, but this seems a general enough problem that there ought to be functions like "give me the object named X, from cache if possible" and "I'm now done with object X, so make it eligible for eviction five minutes from now".

core.cache looks like it might be suitable for this purpose, but the documentation is quite lacking and the source provides no particular clues about how to use it. Its TTLCache looks promising, but besides being unclear how to use, it relies on garbage collection to evict items, so I can't cleanly close a resource when I'm ready to expire it.

I could roll my own, of course, but there are a number of tricky spots and I'm sure I would get some things subtly wrong, so I'm hoping someone out there can point me to an implementation of this functionality. My code is in Clojure, but of course using a Java library would be perfectly fine if that's where the best implementation can be found.

amalloy

2 Answers

6

Check out Guava's cache implementation.

  • You can supply a Callable (or a CacheLoader) to the get method for "if handle is cached, return it, otherwise open, cache and return it" semantics
  • You can configure timed eviction such as expireAfterAccess
  • You can register a RemovalListener to close the handles on removal

Modifying the code examples from the linked Guava page slightly, using CacheLoader:

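import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.cache.RemovalListener;
import com.google.common.cache.RemovalNotification;
import java.util.concurrent.TimeUnit;

// Declare the listener first so it is in scope when the cache is built.
RemovalListener<Key, Handle> removalListener =
  new RemovalListener<Key, Handle>() {
    public void onRemoval(RemovalNotification<Key, Handle> removal) {
      Handle h = removal.getValue();
      h.close(); // tear down properly
    }
  };

// Key, Handle, AnyException and openHandle are placeholders for your own types.
LoadingCache<Key, Handle> handles = CacheBuilder.newBuilder()
   .maximumSize(100) // sensible value for open handles?
   .expireAfterAccess(5, TimeUnit.MINUTES)
   .removalListener(removalListener)
   .build(
       new CacheLoader<Key, Handle>() {
         public Handle load(Key key) throws AnyException {
           return openHandle(key);
         }
       });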

* DISCLAIMER * I have not used the cache this way myself, so make sure you test this thoroughly.

Jens Hoffmann
  • This looks quite suitable, thanks! The only thing I'm worried about is that it looks possible to get an item from the cache and hang onto it for longer than the expiration time, actively using it all along. If the expiration code tears the item down, that becomes dangerous. Do you know of a way to "check an object out" of the cache such that it won't be torn down until it is checked back in? – amalloy Feb 12 '13 at 23:04
  • Hm, I think it depends. Could it happen that you continuously write to a file for >= 5 minutes? Is the cache accessed from one request at a time only? Make sure you check the documentation here: Eviction does not happen automatically on expiry, but during reads/writes from/to the Cache. Maybe that's good (if you have one request which could write for a long time, then it won't be torn down), maybe bad because there is more than one request at a time or you want to clean up handles earlier/in the background... – Jens Hoffmann Feb 13 '13 at 00:26
  • See here: http://code.google.com/p/guava-libraries/wiki/CachesExplained#When_Does_Cleanup_Happen? – Jens Hoffmann Feb 13 '13 at 00:30
  • @amalloy, doesn't this mean what you really want is a cache of "checked back in" objects, where the handles in this cache remain open (instead of closing), and you grab handles from this pool first before building one yourself? This makes caching independent of checking in/out. – bmillare Feb 13 '13 at 14:33
  • Maybe try this: https://stackoverflow.com/questions/17678269/third-party-lib-for-object-pool-with-expiration-time-in-java – ed22 Feb 28 '18 at 12:47
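
For the "check out / check back in" idea raised in these comments, here is a minimal sketch using only the JDK. It is not Guava's or core.cache's API: the class name LeaseCache and the methods checkOut, checkIn and evictIdle are all hypothetical. The idea is to count outstanding leases per entry and start the idle timer only when the last lease is returned, so an entry that is still in use is never closed. The clock is injected so the timing logic can be tested.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical lease-counting cache; a sketch, not a library class. */
final class LeaseCache<K, V extends Closeable> {
    private static final class Entry<V> {
        final V value;
        int leases;      // number of outstanding checkouts
        long idleSince;  // clock reading at the last check-in; used only when leases == 0
        Entry(V value) { this.value = value; }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final Function<K, V> opener;  // opens the resource on a cache miss
    private final long maxIdle;           // same unit as the clock
    private final LongSupplier clock;     // e.g. System::nanoTime

    LeaseCache(Function<K, V> opener, long maxIdle, LongSupplier clock) {
        this.opener = opener;
        this.maxIdle = maxIdle;
        this.clock = clock;
    }

    /** Returns the cached value, opening it on a miss, and pins it against eviction. */
    synchronized V checkOut(K key) {
        Entry<V> e = entries.get(key);
        if (e == null) {
            // Simplification: the opener runs while the lock is held.
            e = new Entry<>(opener.apply(key));
            entries.put(key, e);
        }
        e.leases++;
        return e.value;
    }

    /** Returns one lease; the idle timer starts when the last lease comes back. */
    synchronized void checkIn(K key) {
        Entry<V> e = entries.get(key);
        if (e == null || e.leases == 0) {
            throw new IllegalStateException("not checked out: " + key);
        }
        if (--e.leases == 0) {
            e.idleSince = clock.getAsLong();
        }
    }

    /** Closes and removes unleased entries idle for at least maxIdle; returns the count. */
    synchronized int evictIdle() {
        long now = clock.getAsLong();
        int closed = 0;
        Iterator<Entry<V>> it = entries.values().iterator();
        while (it.hasNext()) {
            Entry<V> e = it.next();
            if (e.leases == 0 && now - e.idleSince >= maxIdle) {
                it.remove();
                try {
                    e.value.close();
                } catch (IOException ex) {
                    throw new UncheckedIOException(ex);
                }
                closed++;
            }
        }
        return closed;
    }
}
```

Nothing in this sketch runs evictIdle on its own. One option would be to call it periodically from a ScheduledExecutorService, which would also give background cleanup instead of cleanup only during cache reads and writes. Callers should pair every checkOut with a checkIn in a finally block, otherwise the entry stays pinned forever.
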
1

If you don't mind some Java, see http://ehcache.org/ and http://ehcache.org/apidocs/net/sf/ehcache/event/CacheEventListener.html.

DanLebrero