C++ simple cache design for function output

Question

I assume it's a quite frequent problem with well-known solutions, which I wasn't able to find. So I'm seeking advice here.

Problem Statement

Consider the following setting:

class A; // some class

const A f(const A&); // an _expensive_ function

void do_stuff()
{
    A a;

    a.modify(...);

    do_stuff1(f(a));  // compute f(a)
    do_stuff2(f(a));  // use cached value of f(a)

    a.modify(...);

    do_stuff3(f(a));  // recompute f(a)
}

I would like the return value of f(a) to be cached between the first and second calls, but to be discarded after the second call to a.modify(). EDIT: In practice, the calls to f(a) will be in different scopes.

Here are the pieces of solutions I've explored, for what it's worth.

Solution 1: Central Cache

Using time stamps

I can imagine a simple solution involving adding a time stamp to class A that function f can check and decide if it needs to update its cached result, stored somewhere in a central cache. I guess this also implies changing the signature of f to:

const A& f(const A&);

Problem 1: with a central cache, we need a mechanism to destroy the cached result of f(a) when a is destroyed.

Using hash codes

Aside from Problem 1, this seems simple enough. But it gets complicated when A stands for std::vector<...>. I guess dynamic polymorphism should be excluded here. So we forget about adding a time stamp to a subclass of std::vector<...> and all the overriding that it would imply. However, we could compute some hash code or UUID based on the contents of a --- assuming that it is much cheaper than computing f(a) --- and base the central cache on these hash codes. But we're facing Problem 1 again.

Solution 2: Coupled Objects

I still haven't found how to implement this, but the idea is to have a notify the cache for f(a) when a is written to or destroyed, but not when it is merely read from. I can't figure how to do that without dynamic polymorphism, and without slowing down single-element accesses using operator[] or iterators by sending notifications to the cache for each modified element.

Problem 2: find a mechanism of delimiting sets of changes to a to invalidate the cache only once for each set of changes.

I've thought of proxies to enable write access on a (inspired by the concept of mutex), but couldn't come up with any working code.

Any ideas?

Can't you store the `return` value and reuse it ? If you can change `f(const A&)`, then you can pass a temporary object of `A` as a 2nd parameter and use the same. — iammilind, Jul 01 '11 at 06:27
@Munger, you mentioned that `A` could be anything, even `std::vector<...>`. Is `A` anything? Is this theoretical or practical question? What is `A` exactly? — Dialecticus, Jul 01 '11 at 08:24
@Dialecticus: for the problem at hand, it's a `std::vector<...>`. But I recall facing that problem for other types of objects, so I wrote `A` to include these cases as well. Perhaps I shouldn't. — Munger, Jul 01 '11 at 12:52
Is it much cheaper to compare two vectors than to call the function once? If those two operations cost roughly the same then there's not much room for optimization. — Dialecticus, Jul 01 '11 at 13:59
@Dialectus: Yes, it is much cheaper to compare vectors than to call the function. — Munger, Jul 01 '11 at 17:43

score 4 · Accepted Answer · answered Jul 01 '11 at 06:34

4

I've done similar stuff with interfaces like this:

class F
{
public:
 virtual int f(int a)=0;
};

class Cache : public F
{
public:
   Cache(F &f) : f(f) { }
   int f(int a) { /*caching logic here, calls f.f() if not found from cache */ }
   F &f;
};

class Impl : public F
{
   int f(int a) { /* real implementation here */ }
};

Then it's just deciding where to use the caching logic:

   Impl i; 
   Cache c(i);
   c.f(10); // put to cache with key 10
   c.f(10); // found from cache
   c.f(11); // put to cache with key 11

answered Jul 01 '11 at 06:34

tp1

1,197
10
17

I like that. But I still have to solve Problem 1 above. `f` will be called on many different objects, so it is a bad idea to keep cached values of `f(old_a)` for all `old_a`'s that have been destroyed. – Munger Jul 01 '11 at 06:45
@munger destroying the cached values can happen in two different ways: either Cache object goes out of scope/gets destroyed/replaced, or you can add a member function to Cache which lets you explicitly remove some cached values or a range of values from the cache. The cache can have any member functions you like, so it's possible to do these things with this design. – tp1 Jul 01 '11 at 06:51
I've managed to do what I need with a similar approach using boost::function, and by calling the cached version where the function is likely to be called more than once, and the uncached version elsewhere. In the cache map, I replaced the vectors, which require a lot of storage, with SHA-1 hash values. – Munger Jul 05 '11 at 20:28

score 1 · Answer 2 · answered Jul 01 '11 at 06:27

1

Can't you just do this:

const A &cacheA = f(a);
do_stuff1(cacheA);  // compute f(a)
do_stuff2(cacheA);  // use cached value of f(a)

answered Jul 01 '11 at 06:27

Nicol Bolas

449,505
63
781
982

Probably not. The shown code is a conceptual example, and the two calls to `f(a)` will probably be in completely unrelated parts of the code. – Roland Illig Jul 01 '11 at 06:34
No, the code I've given above is overly simplified. The calls to f(a) actually occur in mutually exclusive scopes. – Munger Jul 01 '11 at 06:34
@Munger: Then you should edit your question and make it more clear that the calls will be in different scopes. – Nicol Bolas Jul 01 '11 at 06:41

score 1 · Answer 3 · edited May 23 '17 at 11:48

1

I'm probably missing some important detail here, but can't you just use a LRU cache for this purpose?

edited May 23 '17 at 11:48

Community

1
1

answered Jul 01 '11 at 07:15

Jeremy Friesner

70,199
15
131
234

Mainly because I already know the optimal life time of the cached objects, i.e. until destruction of the original objects. Since in the specific application I have in mind, most of them are going to have a really short life time, a LRU cache would be a waste of memory. Good point, though. – Munger Jul 01 '11 at 12:23

score 1 · Answer 4 · answered Jul 01 '11 at 07:55

1

make f a member of A. Then you can decide in the instance of A if you can reuse the cached result or not.

answered Jul 01 '11 at 07:55

Tobias Langner

10,634
6
46
76

In the specific application I have in mind, `f(a)` is just an alternative representation of `a`. So, for design reasons, what you suggest is the way I'd like to go, and my first idea. This raises two questions: (1) how to subclass std::vector that has no virtual members, and (2) when to set the *modified* flag of the object. – Munger Jul 01 '11 at 12:27