Understanding LRU Algorithm

Question

I am trying to understand LRU and its making no sense to me. If I understand it then it will be easier for me to code. Can someone walk me through the steps? Like,

If the reference string you are currently at is in the array then do you increment to the next space?
When you check to see if somethings in the buffer then do we need to scan where we are at or the whole array?

I just cant seem to follow along. How is it keeping track of least recently used? Shouldnt the least recently used be the one after the index your at?

enter image description here

What references have you used? Is there a web site or a book? — Vaughn Cato, Nov 12 '12 at 03:16

score 5 · Answer 1 · answered Nov 12 '12 at 03:20

5

An LRU is typically put in a list. When an item is accessed, remove it from the list (if it's there), and push it to the back. The back of the list is the Most Recently Used. Because you're continually moving used items to the back of the list, the least used ones end up naturally at the front of the list. When there's not enough room, you pop from the front.

answered Nov 12 '12 at 03:20

Cory Nelson

29,236
5
72
110

1

IF you want O(1) access (get) time, then you need a Map to index the Double Linked List, otherwise you need to scan the list ( O(n) ) to find the item. See the following link for a complete code example: http://www.sureinterview.com/shwqst/124003 – Tony Rad Nov 12 '12 at 03:26
2

@TonyRad: This is correct (if map means _hash_ map, the normal C++SL map has _O(log(n))_), but note that if a cache is working well, you will ask for the most recent (or 2nd, 3rd most recent) item in the usual case, which is (almost) as good as O(1). On the other hand, a map (or hash) has a non-neglegible _k_ associated with the _O_. Which means it's quite possibly slower despite lower algorithmic complexity. If the above assumptions on temporal locality doesn't hold, the list-based cache will perform badly, but _any cache_ will perform extremely badly in that case. – Damon Nov 12 '12 at 16:38
(In fact, accessing the 2nd or 3rd item in a list is not _almost_ as good as O(1), it is O(1), just with a bigger constant.) Either way, what I'm trying to say is that if implemented and used correctly, caches are normally quite small (so big-O is not all that critical, there's no real difference), and are normally accessed with high temporal coherence. If one can afford to hold most/all of the data in a cache, there's no real need for a cache or an eviction strategy. Similarly, if the usage pattern doesn't result in frequent hits, cache is the wrong tool. – Damon Nov 12 '12 at 16:46
@Demon, thx and +1 for explanations. Yes,I was thinking about hashmap implem. of Map ADT, but also BST impl. are, theoretically, better than linear scanning the list). I'm just still thinking about the following sentences: "...Which means it's quite possibly slower despite lower algorithmic complexity. If the above assumptions on temporal locality doesn't hold, the list-based cache will perform badly, but any cache will perform extremely badly in that case." which are not clear to me. In fact you are assuming that there will be many collisions or that BST impl will be not a good solution. – Tony Rad Nov 12 '12 at 17:06
I avoided mentioning a hash map to keep the explanation simple (after all he asked about the LRU algorithm, not about how to build a cache). If we're talking caches in C++, I'd use boost intrusive containers (unordered_map and list) for highest efficiency. – Cory Nelson Nov 12 '12 at 18:50

Edwin Buck · Accepted Answer · 2012-11-12T15:48:17.943

Store two items for each "item". The value (of course) and the "time" which is an ever-increasing integer.

So, using your data, assuming that 1 through 4 were accessed in order:

(1, 1), (2, 2), (3, 3), (4, 4)

access "1" (sorted by time for clarity, time = 5)

(2, 2), (3, 3), (4, 4), (1, 5)

access "2" (sorted by time for clarity, time = 6)

(3, 3), (4, 4), (1, 5), (2, 6)

access "5" which will not be found, causing a cache miss. To find the "space" to store the "5", we need to flush the Least Recently Used (LRU). This means dropping "3". Note that time = 7.

(4, 4), (1, 5), (2, 6), (5, 7)

access "1" (sorted by time for clarity, time = 8)

(4, 4), (2, 6), (5, 7), (1, 8)

access "2" (sorted by time for clarity, time = 9)

(4, 4), (5, 7), (1, 8), (2, 9)

access "3" which will not be found, causing a cache miss. To find the "space" to store the "3", we need to flush the Least Recently Used (LRU). This means dropping "4". Note that time = 10.

(5, 7), (1, 8), (2, 9), (3, 10)

access "4" which will not be found, causing a cache miss. To find the "space" to store the "4", we need to flush the Least Recently Used (LRU). This means dropping "5". Note that time = 11.

(1, 8), (2, 9), (3, 10), (4, 11)

access "5" which will not be found, causing a cache miss. To find the "space" to store the "5", we need to flush the Least Recently Used (LRU). This means dropping "1". Note that time = 12.

(2, 9), (3, 10), (4, 11), (5, 12)

Now that you know how the item to be flushed was selected, and have a working example, the flushing of the item without moving it around in the array is up to you to implement.

--- Edited With additional explanation ---

Ok, the idea of showing stuff in list form seems to have raised a few questions, so I'll show it in array form

Access 1, cache miss, empty slot available, store in first available slot

Value   Age
1       1
---     ---
---     ---
---     ---

Access 2, cache miss, empty slot available, store in first available slot

Value   Age
1       1
2       2
---     ---
---     ---

Access 3, cache miss, empty slot available, store in first available slot

Value   Age
1       1
2       2
3       3
---     ---

Access 4, cache miss, empty slot available, store in first available slot

Value   Age
1       1
2       2
3       3
4       4

Access 1, cache hit, update access time

Value   Age
1       5
2       2
3       3
4       4

Access 2, cache hit, update access time

Value   Age
1       5
2       6
3       3
4       4

Access 5, cache miss, no available empties, discard and replace "least recently used"

Value   Age
1       5
2       6
5       7
4       4

Access 1, cache hit, update access time

Value   Age
1       8
2       6
5       7
4       4

Access 2, cache hit, update access time

Value   Age
1       8
2       9
5       7
4       4

Access 3, cache miss, no available empties, discard and replace "least recently used"

Value   Age
1       8
2       9
5       7
3       10

Access 4, cache miss, no available empties, discard and replace "least recently used"

Value   Age
1       8
2       9
4       11
3       10

Access 5, cache miss, no available empties, discard and replace "least recently used"

Value   Age
5       12
2       9
4       11
3       10

Little confused because the final answer in the picture is 5,2,4,3 but you ended up with 2,3,4,5 — , Nov 12 '12 at 04:39
Before i add my first number to the array i need to check to see if its there if not shift array and add to the back? In essence then by the time i get to the 4 it will be at the back? — , Nov 12 '12 at 04:56
couldnt i just pop the top or delete the index as needed and add at the end? Using vectors or std list? — , Nov 12 '12 at 06:42
My lists were sorted by "age / counter" in order to get a better view of the actual algorithim. If you don't sort the lists, but do your replacements "in place" then you get the exact same output. Using a "std:list" won't make things easier if you don't know the algorithm. Just "doing it some other way" won't guarantee you're following a LRU algorithm. Perhaps another way will work, perhaps not. — Edwin Buck, Nov 12 '12 at 15:39
The key to this algorithim is to store an "extra" piece of information that allows you to find the least recently used item. I have used a "counter" that counts up to represent access time, but you could use timestamps, or a counter that counts down, or a linked list that gets rearranged, or anything. Understanding the algorithm has little to do with understanding the code. You can use LRU to determine who gets reserved tables in a restaurant, or any number of other items. Basically the one you've ignored the longest is the one to go. — Edwin Buck, Nov 12 '12 at 15:54
Omg this is perfect. I understand both ways. I think the second way using 2 arrays could be easy. Thank you for taking the time to help me. — , Nov 12 '12 at 18:40
I have to learn one more algorithm. The program is 3 parts. I posted it here to stack overflow can you do what you did for this post? here it is called understanding OPT algorithm. http://stackoverflow.com/questions/13350386/understanding-opt-algorithm — , Nov 12 '12 at 19:43

score 1 · Answer 3 · answered Jan 24 '13 at 14:27

This is my simple sample c++ implementation for LRU cache, with the combination of hash(unordered_map), and list. Items on list have key to access map, and items on map have iterator of list to access list. So, every method calling is complexity O(1).

#include <list>
#include <unordered_map>
#include <assert.h>

using namespace std;

template <class KEY_T, class VAL_T> class LRUCache{
private:
        list< pair<KEY_T,VAL_T> > item_list;
        unordered_map<KEY_T, decltype(item_list.begin()) > item_map;
        size_t cache_size;
private:
        void clean(void){
                while(item_map.size()>cache_size){
                        auto last_it = item_list.end(); last_it --;
                        item_map.erase(last_it->first);
                        item_list.pop_back();
                }
        };
public:
        LRUCache(int cache_size_):cache_size(cache_size_){
                ;
        };

        void put(const KEY_T &key, const VAL_T &val){
                auto it = item_map.find(key);
                if(it != item_map.end()){
                        item_list.erase(it->second);
                        item_map.erase(it);
                }
                item_list.push_front(make_pair(key,val));
                item_map.insert(make_pair(key, item_list.begin()));
                clean();
        };
        bool exist(const KEY_T &key){
                return (item_map.count(key)>0);
        };
        VAL_T get(const KEY_T &key){
                assert(exist(key));
                auto it = item_map.find(key);
                item_list.splice(item_list.begin(), item_list, it->second);
                return it->second->second;
        };

};

Understanding LRU Algorithm

3 Answers3