Minimizing the number of costly function calls when the parameters remain the same (python)

Question

Suppose that there is a function costly_function_a(x) such that:

it is very costly in terms of execution time;
it returns the same output whenever the same x is fed to it; and
it does not perform "additional tasks" besides returning an output.

In these conditions, instead of calling the function twice in a row with the same x, we can store the result in a temporary variable, then use that variable to do these computations.

Now suppose that there are some functions (f(x), g(x) and h(x) in the example below) that call costly_function_a(x), and that some of these functions may call each others (in the example below, g(x)and h(x) both call f(x)). In this case, using the simple method mentioned above still results in repeated calls to costly_function_a(x) with the same x (see OkayVersion below). I did found a way to minimize the number of calls, but it is "ugly" (see FastVersion below). Any ideas of a better way to do this?

#Dummy functions representing extremely slow code.
#The goal is to call these costly functions as rarely as possible.
def costly_function_a(x):
    print("costly_function_a has been called.")
    return x #Dummy operation.
def costly_function_b(x):
    print("costly_function_b has been called.")
    return 5.*x #Dummy operation.

#Simplest (but slowest) implementation.
class SlowVersion:
    def __init__(self,a,b):
        self.a = a
        self.b = b
    def f(self,x): #Dummy operation.
        return self.a(x) + 2.*self.a(x)**2
    def g(self,x): #Dummy operation.
        return self.f(x) + 0.7*self.a(x) + .1*x
    def h(self,x): #Dummy operation.
        return self.f(x) + 0.5*self.a(x) + self.b(x) + 3.*self.b(x)**2

#Equivalent to SlowVersion, but call the costly functions less often.
class OkayVersion:
    def __init__(self,a,b):
        self.a = a
        self.b = b
    def f(self,x): #Same result as SlowVersion.f(x)
        a_at_x = self.a(x)
        return a_at_x + 2.*a_at_x**2
    def g(self,x): #Same result as SlowVersion.g(x)
        return self.f(x) + 0.7*self.a(x) + .1*x
    def h(self,x): #Same result as SlowVersion.h(x)
        a_at_x = self.a(x)
        b_at_x = self.b(x)
        return self.f(x) + 0.5*a_at_x + b_at_x + 3.*b_at_x**2

#Equivalent to SlowVersion, but calls the costly functions even less often.
#Is this the simplest way to do it? I am aware that this code is highly
#redundant. One could simplify it by defining some factory functions...
class FastVersion:
    def __init__(self,a,b):
        self.a = a
        self.b = b
    def f(self, x, _at_x=None): #Same result as SlowVersion.f(x)
        if _at_x is None:
            _at_x = dict()
        if 'a' not in _at_x:
            _at_x['a'] = self.a(x)
        return _at_x['a'] + 2.*_at_x['a']**2
    def g(self, x, _at_x=None): #Same result as SlowVersion.g(x)
        if _at_x is None:
            _at_x = dict()
        if 'a' not in _at_x:
            _at_x['a'] = self.a(x)
        return self.f(x,_at_x) + 0.7*_at_x['a'] + .1*x
    def h(self,x,_at_x=None): #Same result as SlowVersion.h(x)
        if _at_x is None:
            _at_x = dict()
        if 'a' not in _at_x:
            _at_x['a'] = self.a(x)
        if 'b' not in _at_x:
            _at_x['b'] = self.b(x)
        return self.f(x,_at_x) + 0.5*_at_x['a'] + _at_x['b'] + 3.*_at_x['b']**2

if __name__ == '__main__':

    slow = SlowVersion(costly_function_a,costly_function_b)
    print("Using slow version.")
    print("f(2.) = " + str(slow.f(2.)))
    print("g(2.) = " + str(slow.g(2.)))
    print("h(2.) = " + str(slow.h(2.)) + "\n")

    okay = OkayVersion(costly_function_a,costly_function_b)
    print("Using okay version.")
    print("f(2.) = " + str(okay.f(2.)))
    print("g(2.) = " + str(okay.g(2.)))
    print("h(2.) = " + str(okay.h(2.)) + "\n")

    fast = FastVersion(costly_function_a,costly_function_b)
    print("Using fast version 'casually'.")
    print("f(2.) = " + str(fast.f(2.)))
    print("g(2.) = " + str(fast.g(2.)))
    print("h(2.) = " + str(fast.h(2.)) + "\n")

    print("Using fast version 'optimally'.")
    _at_x = dict()
    print("f(2.) = " + str(fast.f(2.,_at_x)))
    print("g(2.) = " + str(fast.g(2.,_at_x)))
    print("h(2.) = " + str(fast.h(2.,_at_x)))
    #Of course, one must "clean up" _at_x before using a different x...

The output of this code is:

Using slow version.
costly_function_a has been called.
costly_function_a has been called.
f(2.) = 10.0
costly_function_a has been called.
costly_function_a has been called.
costly_function_a has been called.
g(2.) = 11.6
costly_function_a has been called.
costly_function_a has been called.
costly_function_a has been called.
costly_function_b has been called.
costly_function_b has been called.
h(2.) = 321.0

Using okay version.
costly_function_a has been called.
f(2.) = 10.0
costly_function_a has been called.
costly_function_a has been called.
g(2.) = 11.6
costly_function_a has been called.
costly_function_b has been called.
costly_function_a has been called.
h(2.) = 321.0

Using fast version 'casually'.
costly_function_a has been called.
f(2.) = 10.0
costly_function_a has been called.
g(2.) = 11.6
costly_function_a has been called.
costly_function_b has been called.
h(2.) = 321.0

Using fast version 'optimally'.
costly_function_a has been called.
f(2.) = 10.0
g(2.) = 11.6
costly_function_b has been called.
h(2.) = 321.0

Note that I do not want to "store" the results for all values of x used in the past (because this would require too much memory). Moreover, I do not want to have a function returning a tuple of the form (f,g,h) because there are cases where I only want f (so there is no need to evaluate costly_function_b).

Given that you don't want memoization, what sort of answer or improvement do you expect? — Simeon Visser, Dec 06 '13 at 23:14
I do not want memorization of the result for all the past values of `x`. However, locally tracking for "the current `x`" is fine (as is done in this example). — user1661473, Dec 06 '13 at 23:23

Martijn Pieters · Accepted Answer · 2013-12-06T23:57:41.163

11

What you are looking for is a LRU cache; only most-recently used items are cached, limiting memory usage to balance invocation cost with memory requirements.

As your costly function is invoked with different values for x, up to a number of return values (per unique x value) is cached, with the least-recently used cache results discarded when the cache is full.

As of Python 3.2, the standard library comes with a decorator implementation: @functools.lru_cache():

from functools import lru_cache

@lru_cache(16)  # cache 16 different `x` return values
def costly_function_a(x):
    print("costly_function_a has been called.")
    return x #Dummy operation.

@lru_cache(32)  # cache 32 different `x` return values
def costly_function_b(x):
    print("costly_function_b has been called.")
    return 5.*x #Dummy operation.

A backport is available for earlier versions, or pick one of the other available libraries that can handle LRU caches avaible on PyPI.

If you only ever need to cache one most recent item, create your own decorator:

from functools import wraps

def cache_most_recent(func):
    cache = [None, None]
    @wraps(func)
    def wrapper(*args, **kw):
        if (args, kw) == cache[0]:
            return cache[1]
        cache[0] = args, kw
        cache[1] = func(*args, **kw)
        return cache[1]
    return wrapper

@cache_most_recent
def costly_function_a(x):
    print("costly_function_a has been called.")
    return x #Dummy operation.

@cache_most_recent
def costly_function_b(x):
    print("costly_function_b has been called.")
    return 5.*x #Dummy operation.

This simpler decorator has less overhead than the more featureful functools.lru_cache().

edited Dec 06 '13 at 23:57

answered Dec 06 '13 at 23:20

Martijn Pieters

1,048,767
296
4,058
3,343

Superior to mine, though I wasn't thinking of a finite cache. I suppose if it's known there will be only a few it would be optimal. +1 – Plasmarob Dec 06 '13 at 23:23
woah I never knew there was something in the standard library that did this, I always implemented it myself.. time to switch to 3.2 I guess – qwwqwwq Dec 06 '13 at 23:24
@qwwqwwq: of course, the cache has a performance penalty too; when it was applied to the `re` module [perfomance actually suffered](http://stackoverflow.com/questions/14756790/why-are-uncompiled-repeatedly-used-regexes-so-much-slower-in-python-3) as the naive pattern cache used before was much faster. That change was reverted. – Martijn Pieters Dec 06 '13 at 23:28
Very interesting. Is this still optimal for @lru_cache(1)? – user1661473 Dec 06 '13 at 23:29
1

@user1661473: provided the cost of the function itself is higher than the (slight) performance penalty cache management incurs, then I definitely would say so. Note that a cache size in powers of 2 is most optimal; use `@lru_cache(2)` at the very least. – Martijn Pieters Dec 06 '13 at 23:31
Hehe, `1` is `2**0`! I will probably accept this answer tomorrow. – user1661473 Dec 06 '13 at 23:34
@user1661473: True, but why limit yourself to just one cached item? Is the memory impact *that* great? – Martijn Pieters Dec 06 '13 at 23:34
@MartijnPieters: Well, I don't need more than 1 cached item for my particular application of interest. In short, I iterate over different values of `x`. For each one, I have to evaluate a hierarchy of functions, of which only one really matters (but that one function depends on a lot of others). Hence, I can store that small answer, then proceed with the next `x`. There is no need for the values at the previous `x`. – user1661473 Dec 06 '13 at 23:41

score 1 · Answer 2 · edited May 23 '17 at 12:03

I am accepting @MartijnPieters' solution because it is probably the right way to do it for 99% of the people that will get a problem similar to mine. However, in my very particular case, I only need a "cache of 1", so the fancy @lru_cache(1) decorator is slightly overkill. I ended up writing my own decorator (thanks to this awesome stackoverflow answer), which I provide below. Be warned that I am new to Python, so this code may not be perfect.

from functools import wraps

def last_cache(func):
    """A decorator caching the last value returned by a function.

    If the decorated function is called twice (or more) in a row with exactly
    the same parameters, then this decorator will return a cached value of the
    decorated function's last output instead of calling it again. This may
    speed up execution if the decorated function is costly to call.

    The decorated function must respect the following conditions:
    1.  Repeated calls return the same value if the same parameters are used.
    2.  The function's only "task" is to return a value.

    """
    _first_call = [True]
    _last_args = [None]
    _last_kwargs = [None]
    _last_value = [None]
    @wraps(func)
    def _last_cache_wrapper(*args, **kwargs):
        if _first_call[0] or (args!=_last_args[0]) or (kwargs!=_last_kwargs[0]):
            _first_call[0] = False
            _last_args[0] = args
            _last_kwargs[0] = kwargs
            _last_value[0] = func(*args, **kwargs)
        return _last_value[0]
    return _last_cache_wrapper

Minimizing the number of costly function calls when the parameters remain the same (python)

2 Answers2