
I have used cache2k with read-through in a web application to load blog posts on demand. However, I am concerned about the blocking behaviour of the read-through feature. For example, if multiple threads (requests) ask the cache for the same key, is it possible for the loader to be called multiple times to load the same key/value into the cache?

I get the impression from the documentation that the read-through feature blocks concurrent requests for the same key until the load has completed, but I may have misread the documentation. I just want to confirm that this is the behaviour.

The method which initializes the cache looks like this:

private void initializeURItoPostCache()
{
    final CacheLoader<String, PostImpl> postFileLoader = new CacheLoader<String, PostImpl>(){
        @Override public PostImpl load(String uri)
        {
            // Fetch the data and create the post object
            final PostImpl post = new PostImpl();
            //.. code omitted
            return post;
        }
    };

    // Initialize the cache with a read-through loader
    this.cacheUriToPost = new Cache2kBuilder<String, PostImpl>(){}
        .name("cacheBlogPosts")
        .eternal(true)
        .loader(postFileLoader)
        .build();
}

The following method is used to request a post from the cache:

public Post getPostByURI(final String uri)
{
    // Check with the index service to ensure the URI is known (valid to the application)
    if(this.indexService.isValidPostURI(uri))
    {
        // We have a post associated with the given URI, so
        // request it from the cache
        return this.cacheUriToPost.get(uri);
    }
    return EMPTY_POST;
}

Many thanks in advance, and a happy and prosperous New Year to all.

  • Thanks for your question! Maybe the term "thread safety" leads us a little in the wrong direction. I think the relevant documentation is at https://cache2k.org/docs/1.0/apidocs/cache2k-api/org/cache2k/integration/CacheLoader.html under the term "blocking". Anyhow, I'd like to craft a good answer and address your concerns properly. Can you give a tiny code snippet showing what your loader code looks like? Do you need, and if so why do you need, absolute guarantees that no load is done at the same time for the same key? – cruftex Dec 30 '16 at 07:02
  • Many thanks for the reply. I had not seen the _blocking_ comments in the documentation you linked, and can now see that my concerns were unfounded. –  Dec 30 '16 at 10:04
  • In answer to your question @cruftex, I was concerned that two or more requests would try to instantiate and load a Post object associated with a unique 'uri' into the cache and this would present an issue (I am not familiar with using caches). Again, it is a case of (talking to myself) RTFM! –  Dec 30 '16 at 10:12
  • Never mind. It shows me where the documentation can improve :) Since this is a very important feature, it is actually mentioned multiple times; e.g. the feature list says "blocking read-through" and the user guide says "protection against the cache stampede". But it is not obvious that these are the same thing. The real semantics are then described in the Java Doc. – cruftex Dec 30 '16 at 14:37
  • Can you do me a favor and maybe rephrase your question without the term "thread safety"? Thread safety means something else; being asked for thread safety is quite embarrassing ;) – cruftex Dec 31 '16 at 08:36

1 Answer


When multiple requests for the same key provoke a cache loader call, cache2k invokes the loader only once; the other threads wait until the load is finished. This behavior is called blocking read-through. To cite the Java Doc:

Blocking: If the loader is invoked by Cache.get(K) or other methods that allow transparent access concurrent requests on the same key will block until the loading is completed. For expired values blocking can be avoided by enabling Cache2kBuilder.refreshAhead(boolean). There is no guarantee that the loader is invoked only for one key at a time. For example, after Cache.clear() is called load operations for one key may overlap.

This behavior is very important for caches, since it protects against the cache stampede. An example: a high-traffic website receives 1000 requests per second, and one resource takes quite long to generate, about 100 milliseconds. If the cache did not block the concurrent requests on a cache miss, at least 100 requests would hit the loader for the same key (1000 requests per second × 0.1 seconds of load time). "At least" is an understatement, since the machine will probably not handle 100 simultaneous loads at the same speed as one.
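
To make the single-invocation behaviour visible, here is a minimal, self-contained sketch (not from the original post; the cache name, key, thread count and the 100 ms sleep are arbitrary choices): several threads request the same key concurrently while an AtomicInteger counts how often the loader actually runs.

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

import org.cache2k.Cache;
import org.cache2k.Cache2kBuilder;
import org.cache2k.integration.CacheLoader;

public class BlockingReadThroughDemo
{
    public static void main(String[] args) throws InterruptedException
    {
        final AtomicInteger loadCount = new AtomicInteger();
        final Cache<String, String> cache = new Cache2kBuilder<String, String>(){}
            .name("blockingDemo")
            .eternal(true)
            .loader(new CacheLoader<String, String>(){
                @Override public String load(String key) throws Exception
                {
                    loadCount.incrementAndGet();
                    Thread.sleep(100); // simulate a slow load, as in the example above
                    return "value-" + key;
                }
            })
            .build();

        final int threads = 10;
        final CountDownLatch done = new CountDownLatch(threads);
        for(int i = 0; i < threads; i++)
        {
            new Thread(() -> {
                cache.get("sameKey"); // all threads request the same key
                done.countDown();
            }).start();
        }
        done.await();
        // With blocking read-through the expected output is 1, not 10
        System.out.println("loader invocations: " + loadCount.get());
    }
}

The waiting threads all receive the value produced by that single load.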

Keep in mind that there is no hard guarantee by the cache. The loader must still perform correctly when it is called for the same key at the same time. For example, blocking read-through and Cache.clear() lead to competing requirements: Cache.clear() should be fast, which means we do not want it to wait for ongoing load operations to finish.
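
In practice that means the loader should be idempotent and side-effect free for a given key, so that an overlapping duplicate load merely wastes work. A sketch of that principle applied to the post loader from the question (PostImpl and the omitted fetch are placeholders from the question, not real API):

final CacheLoader<String, PostImpl> postFileLoader = new CacheLoader<String, PostImpl>(){
    @Override public PostImpl load(String uri)
    {
        // Only read from the backing store and build a fresh object here.
        // If two loads for the same uri ever overlap (e.g. around a
        // Cache.clear()), the duplicate work is harmless because no
        // shared mutable state is modified.
        final PostImpl post = new PostImpl();
        // .. read-only fetch of the post data omitted
        return post;
    }
};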

cruftex