
I am using Caffeine with the following configuration:

    // Static imports assumed: Executors.newWorkStealingPool,
    // Executors.newSingleThreadScheduledExecutor, Scheduler.forScheduledExecutorService
    Cache<String, String> cache = Caffeine.newBuilder()
            .executor(newWorkStealingPool(15))
            .scheduler(createScheduler())
            .expireAfterWrite(10, TimeUnit.SECONDS)
            .maximumSize(MAXIMUM_CACHE_SIZE)
            .removalListener(this::onRemoval)
            .build();

    private Scheduler createScheduler() {
        return forScheduledExecutorService(newSingleThreadScheduledExecutor());
    }

Am I correct to assume that the onRemoval method will be executed on the newWorkStealingPool(15) ForkJoinPool, and that the scheduler will be invoked only to find the expired entries that need to be evicted?

Meaning it would go something like this:

  1. The single-threaded scheduler is invoked (every ~1 second).
  2. It finds all the expired entries to be evicted.
  3. onRemoval is executed for each evicted entry on the newWorkStealingPool(15) defined in the cache builder.

I couldn't find documentation that explains this behavior, so I am asking here.

Thanks

Roie Beck

1 Answer


Your assumption is close, except that it is slightly more optimized in practice.

  1. Cache reads and writes are performed on the underlying hash table and appended to internal ring buffers.
  2. When the buffers reach their thresholds, a task is submitted to Caffeine.executor to call Cache.cleanUp.
  3. When this maintenance cycle runs (under a lock),
    • The buffers are drained and the events replayed against the eviction policies (e.g. LRU reordering)
    • Any evictable entry is discarded and a task is submitted to Caffeine.executor to call RemovalListener.onRemoval.
    • The duration until the next entry will expire is calculated and submitted to the scheduler. This is guarded by a pacer to avoid excessive scheduling by ensuring that ~1s elapses between scheduled tasks.
  4. When the scheduler runs, a task is submitted to Caffeine.executor to call Cache.cleanUp (see #3).

The scheduler does the minimal amount of work and any processing is deferred to the executor. The maintenance work is cheap because it uses O(1) algorithms, so it may occur often depending on usage activity. It is optimized for small batches of work, so the enforced ~1s delay between scheduled calls helps capture more work per invocation. If the next expiration event is in the distant future then the scheduler won't run until then, though calling threads may still trigger a maintenance cycle through their activity on the cache (see #1 and #2).
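The hand-off described above can be illustrated with a minimal sketch that uses only the JDK. It is not Caffeine's internal code: the class name, the thread names, and the `maintenance` runnable are hypothetical stand-ins (the latter for the `Cache.cleanUp` call). It only shows the shape of the interaction, where the scheduler thread fires the trigger and the actual work runs on the executor.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class HandOffSketch {

    /** Returns { thread that ran the scheduled trigger, thread that ran the maintenance }. */
    static String[] runOnce() {
        // Named threads make it visible which pool ran which step (names are illustrative).
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(
                r -> new Thread(r, "scheduler-thread"));
        ExecutorService executor = Executors.newFixedThreadPool(2,
                r -> new Thread(r, "executor-thread"));

        AtomicReference<String> triggeredOn = new AtomicReference<>();
        AtomicReference<String> maintainedOn = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);

        // Hypothetical stand-in for the maintenance cycle (Cache.cleanUp in the answer).
        Runnable maintenance = () -> {
            maintainedOn.set(Thread.currentThread().getName());
            done.countDown();
        };

        // The scheduled task does no real work: it only hands off to the executor.
        scheduler.schedule(() -> {
            triggeredOn.set(Thread.currentThread().getName());
            executor.execute(maintenance);
        }, 50, TimeUnit.MILLISECONDS);

        try {
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("maintenance did not run in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdownNow();
            executor.shutdownNow();
        }
        return new String[] {triggeredOn.get(), maintainedOn.get()};
    }

    public static void main(String[] args) {
        String[] threads = runOnce();
        System.out.println("trigger ran on:     " + threads[0]);
        System.out.println("maintenance ran on: " + threads[1]);
    }
}
```

In the question's configuration, the single-threaded scheduler would play the role of `scheduler-thread`, and the work-stealing pool would play the role of `executor-thread`, which is also where `onRemoval` is invoked.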

Ben Manes
  • Thanks @Ben Manes, Caffeine is great. One more request, if I may: would it be possible to add an addedListener that runs on the same executor service as the removal listener? Currently I have to call runAsync manually every time I want to fire an event after adding an entry to the cache; an added listener would remove this ugly workaround. Can it be done? – Roie Beck Jun 23 '21 at 18:51
  • Listeners are a slippery slope (add, update, replace, etc.), so only removal is provided, as it is often needed for cleanup. If you think of the cache as a fancy `ConcurrentHashMap` (which `cache.asMap()` is), then you wouldn't expect a listener on it natively. You might use a decorator to add it or do so in your code. We try to implement only advanced features; simple cases can be left to user code, as there is no ambiguity in their behavior. – Ben Manes Jun 23 '21 at 19:01
  • Thanks for the response @Ben Manes. Last question: in the case of Java 8 (no system scheduler), is there any added benefit to using a multi-threaded scheduler pool rather than a single-threaded scheduler? The way I see it, any pool bigger than 1, even for multiple Caffeine caches (let's say ~5), is overkill. Or am I mistaken? – Roie Beck Jun 24 '21 at 06:47
  • @RoieBeck You're right, and the recommended Java 9+ system scheduler is single-threaded. Similarly, you may not need a powerful `Executor`, as the cache's own work is very cheap, but we don't know the cost of user callbacks (like your removal listener). You can use a direct executor (`Runnable::run`) that runs on the calling thread, and most of the time that is perfectly fine. – Ben Manes Jun 24 '21 at 07:31
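To make the last comment concrete, here is a small JDK-only sketch of what a direct executor does. `Runnable::run` satisfies the `Executor` interface, so `execute` simply runs the task inline on whichever thread called it. The class and method names are hypothetical; with Caffeine, the same method reference would be passed to `Caffeine.newBuilder().executor(...)`.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicReference;

class DirectExecutorSketch {

    /** Returns true if a task given to the direct executor ran on the caller's thread. */
    static boolean runsOnCallingThread() {
        // Executor is a functional interface: execute(task) becomes task.run().
        Executor direct = Runnable::run;

        AtomicReference<Thread> ranOn = new AtomicReference<>();
        direct.execute(() -> ranOn.set(Thread.currentThread()));

        // No hand-off happened, so the task has already completed on this thread.
        return ranOn.get() == Thread.currentThread();
    }

    public static void main(String[] args) {
        System.out.println("ran on calling thread: " + runsOnCallingThread());
    }
}
```

The trade-off is the one raised in the comment: with a direct executor, the cost of any user callback (such as a removal listener) is paid by the thread that happened to trigger the maintenance, so a separate pool is mainly worthwhile when those callbacks are expensive.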