
Below is pseudocode for an LRU cache design with minimal blocking that I've been working on.

The cache has the following characteristics:

  1. It has a fixed size MAX_CACHE_SIZE
  2. It is LRU

I want the following conditions to be true:

  1. If a cache miss occurs, no thread blocks while the data is being made available (downloaded, in this case)
  2. The data is not downloaded twice when two concurrent cache misses occur for the same item

Below is the pseudocode function getData(name), which returns data from the cache given the name of the data. Does this design seem safe? If not, how can it be improved?


storage : deque<data> // container for the cache data
lookupmap : map<string, deque::iterator> // lookup map from the name of the data to its position in the storage deque

data getData(name) {
    lock()
    item_iterator : deque::iterator
    item_iterator = lookupmap[name]

    // if item name is present in the lookup map 
    if(item_iterator != null) {
       
       // unlocks and waits while the item is being downloaded. Still a cache miss,
       // but it is not this thread's responsibility to get the data, so wait for the download to finish
       cond_var.wait({return item_iterator.item.isDownloaded}) 
       
       // this removes the item from the queue and adds it to the front, essentially "bumping" it
       item : data
       item = item_iterator.item
       storage.remove(item_iterator)
       storage.push_front(item)
       lookupmap[name] = storage.begin()
       unlock()
       
       return item
    } else {
       // registers that item name but initialises it as null to indicate not downloaded yet
       lookupmap[name] = null
       unlock() // unlocks so that other threads can proceed while performing the download
       item : data
       item = downloadItem(name)
       lock() // locks again once downloaded
    
       // removes last item in cache if storage is full (LRU)
       if(storage.size() == MAX_CACHE_SIZE) {
           tempitem : data
           tempitem = storage.back()
           storage.pop_back()
           lookupmap.remove(tempitem.name)
       }

       // registers new item in cache
       storage.push_front(item)
       lookupmap[name] = storage.begin()
       
       unlock()
       cond_var.notify_all() // notifies any waiting threads that an item has now been downloaded
       return item

    }

}

Edit:

Not sure if this is a concern but just to clarify - there is one global mutex and one global condition variable for this design.

Tom
    `storage.remove(item_iterator)` followed by `storage.push_front(item_iterator.item)` seems like a bad idea. You delete the item from the container, which invalidates the iterator -- but you are still accessing data off of it. You will want to copy or move out the data first before reinserting. – Human-Compiler Aug 25 '21 at 14:41
  • @Human-Compiler good point, I have amended this now. Thanks. – Tom Aug 25 '21 at 14:43
  • One suggestion to make it more robust is to replace the `lock`/`unlock` calls (where unlock may never get called at all) by a RAII-style `scoped_lock` – luizbarcelos Aug 25 '21 at 14:51
  • @luizbarcelos Yes could work. Although I don't mention it I'm implying that the lock will be unique_lock due to the dependency on condition_variable.wait – Tom Aug 25 '21 at 14:57
  • nits: (1) amendment regarding invalidated iterator was incomplete, still referencing it in the return statement. (2) no need to (re-) lock after cv::wait returns (I think!) – WilliamClements Aug 25 '21 at 18:51
  • @WilliamClements Thanks. And you're right about number 2! – Tom Aug 25 '21 at 23:58
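The locking pattern discussed in the comments can be sketched in C++. This is a minimal sketch with hypothetical names (`wait_for_download`, `finish_download` are mine, and a single bool stands in for the per-item `isDownloaded` flag): `std::unique_lock` gives the RAII behaviour luizbarcelos suggested, and `condition_variable::wait` unlocks/relocks the mutex internally, so no manual re-lock is needed after it returns.

```cpp
#include <condition_variable>
#include <mutex>

std::mutex cache_mutex;
std::condition_variable cache_cv;
bool downloaded = false;  // stands in for item_iterator.item.isDownloaded

void wait_for_download() {
    // unique_lock locks here and unlocks automatically on every return path
    std::unique_lock<std::mutex> lk(cache_mutex);
    // wait() atomically releases the mutex while blocked and re-acquires it
    // before evaluating the predicate and before returning
    cache_cv.wait(lk, [] { return downloaded; });
    // mutex is held again here; safe to touch shared cache state
}

void finish_download() {
    {
        std::lock_guard<std::mutex> lk(cache_mutex);
        downloaded = true;  // publish the result under the lock
    }
    cache_cv.notify_all();  // wake every thread waiting for this item
}
```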

1 Answer


The pseudocode does not describe a thread-safe algorithm.

Consider that multiple threads may be waiting for a particular item to finish downloading. Each of them would be holding (in stack memory) the same item_iterator. The download finishes. When the first waiting thread wakes up, to maintain the LRU order it will invalidate the iterator still being held by the others.
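A reduced, single-threaded illustration of the hazard (a sketch; `std::deque` stands in for the pseudocode's container, and `bump` is a hypothetical helper name): `erase()` invalidates every iterator into a deque, so any other thread still holding the old iterator would be left dangling after the first waking thread performs its LRU "bump".

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Moves the element at `pos` to the front, the way the pseudocode "bumps" an
// item. The element is copied out BEFORE erase(), because erase() invalidates
// all iterators into the deque -- including copies held by other threads.
std::deque<std::string> bump(std::deque<std::string> storage, std::size_t pos) {
    auto it = storage.begin() + pos;
    std::string item = *it;    // copy the element out before mutating
    storage.erase(it);         // every iterator into storage is now invalid
    storage.push_front(item);  // the "bump"; using a stale iterator here is UB
    return storage;
}
```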

(Another problem I thought of turns out to be harmless as long as the maximum number of participating threads is less than MAX_CACHE_SIZE.)

Is it a sufficient fix to look up a fresh item_iterator after waking up? Maybe, but it is concerning that the main shared data structure, the storage itself, is frequently mutated while threads are active in various states. It is not a robust design.

I would consider keeping the storage stable and evaluating the LRU a different way.
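One hypothetical direction along those lines (the names and structure here are my own, not the author's): keep entries in a node-based container whose elements never move, and track recency with a monotonically increasing stamp instead of reordering the storage. Eviction becomes an O(n) scan for the smallest stamp, but a "bump" no longer invalidates anything other threads are holding.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

struct Entry {
    std::string data;
    std::uint64_t last_used = 0;  // recency stamp; larger means more recent
};

class StableLru {
    // node-based map: inserting/erasing other keys never moves existing values
    std::unordered_map<std::string, Entry> storage_;
    std::uint64_t clock_ = 0;
    std::size_t capacity_;

public:
    explicit StableLru(std::size_t capacity) : capacity_(capacity) {}

    // Inserts or refreshes an entry; "bumps" it without moving it in memory.
    Entry& touch(const std::string& name, const std::string& data) {
        if (storage_.size() >= capacity_ && storage_.find(name) == storage_.end())
            evictOldest();
        Entry& e = storage_[name];
        e.data = data;
        e.last_used = ++clock_;  // the bump: just a stamp update
        return e;
    }

    bool contains(const std::string& name) const {
        return storage_.count(name) != 0;
    }

private:
    // O(n) scan for the entry with the smallest stamp (least recently used)
    void evictOldest() {
        auto victim = storage_.begin();
        for (auto it = storage_.begin(); it != storage_.end(); ++it)
            if (it->second.last_used < victim->second.last_used)
                victim = it;
        if (victim != storage_.end()) storage_.erase(victim);
    }
};
```

This trades the deque's O(1) eviction for stability: references and iterators to entries survive insertions, refreshes, and evictions of other keys, which is exactly what the waiting threads in the question need.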

WilliamClements
  • Thanks for the response. You make a good point - it is possible for an iterator to be invalidated while others are waiting. Do you have any ideas on how to get around this? – Tom Aug 26 '21 at 14:28
  • In my answer, I mumbled something about freshening the item_iterator from lookupmap after cv::wait returns. Overall, I like the technique of being unlocked while downloading. It solves the biggest bottleneck and is what makes the question not a duplicate. Still, some ideas for different data structures can be found in related questions like https://stackoverflow.com/questions/2504178/lru-cache-design – WilliamClements Aug 26 '21 at 15:33