
I have a function which performs an expensive operation and is called often; but the operation only needs to be performed once, so its result could be cached.

I tried making an infinite generator but I didn't get the results I expected:

>>> def g():
...     result = "foo"
...     while True:
...         yield result
... 
>>> g()
<generator object g at 0x1093db230>    # why didn't it give me "foo"?

Why isn't g a generator?

>>> g
<function g at 0x1093de488>

Edit: it's fine if this approach doesn't work, but I need something which performs exactly like a regular function, like so:

>>> [g() for x in range(3)]
["foo", "foo", "foo"]
2rs2ts
  • Why not just cache the result in a variable or a function attribute? Or check out one of the Python memoization recipes you can find through Google. – user2357112 Jul 31 '13 at 18:08
  • @user2357112 Noted. Both were mentioned in answers. – 2rs2ts Jul 31 '13 at 18:19
  • I can't figure out your name, is there something to it i'm not getting? Too rs to ts?, this is 2complicated4me – Stephan Jul 31 '13 at 18:24
  • 1
    @Stephan Two r's, two t's. My surname is commonly misspelled, so I find myself uttering that often. – 2rs2ts Jul 31 '13 at 18:33

5 Answers


g is a generator function. Calling it returns a generator object, which you then need to use to get your values: by looping over it, for example, or by calling next() on it:

gen = g()
value = next(gen)

Note that calling g() again creates a new generator object, which will calculate the value again when you advance it.
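
In the interactive session from the question, that looks something like:

>>> gen = g()
>>> next(gen)
'foo'
>>> next(gen)
'foo'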

You may just want to use a global to cache the value. Storing it as an attribute on the function could work:

def g():
    if not hasattr(g, '_cache'):
        g._cache = 'foo'
    return g._cache
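
With this version, repeated calls behave exactly like the regular function the question asks for:

>>> [g() for x in range(3)]
['foo', 'foo', 'foo']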
Martijn Pieters

A better way: @functools.lru_cache(maxsize=None). It's been backported to Python 2.7, or you could just write your own.
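
For example, on Python 3.2 or later (a minimal sketch; `expensive` is a hypothetical stand-in for the real function):

import functools

@functools.lru_cache(maxsize=None)
def expensive():
    # the slow work goes here; with the decorator it only runs on the first call
    return "foo"

Every call after the first returns the cached "foo" without re-running the body.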

I am occasionally guilty of doing:

def foo():
    if hasattr(foo, 'cache'):
        return foo.cache

    # do work
    foo.cache = result
    return result
Katriel
  • The `functools.lru_cache` decorator carries a rather big performance penalty. Unless you need a cache that tracks multiple values, handles cache management, and stores results per set of function parameters, I'd not bother with this. – Martijn Pieters Jul 31 '13 at 18:12
  • See [Why are uncompiled, repeatedly used regexes so much slower in Python 3?](http://stackoverflow.com/q/14756790) for an example of the LRU cache being mis-applied. The Python devs undid that change in the next point release. – Martijn Pieters Jul 31 '13 at 18:13
  • I'd use `functools.lru_cache` if I could use external packages (just for this purpose). It seems very cool (and dead simple). @MartijnPieters can you explain why? Couldn't I just use `maxsize=1`? – 2rs2ts Jul 31 '13 at 18:14
  • @2rs2ts: `functools` is not an external package; it's part of Python. Also, `maxsize=1` may save a tiny bit of memory, but it's not really going to speed anything up over the default. If anything, using an unbounded cache (`maxsize=None`) might speed things up a bit. – abarnert Jul 31 '13 at 18:23
  • @2rs2ts: See the linked question; there is overhead to using the LRU cache: a performance penalty on *every* call to your function. Use it only if you need the advanced functionality. – Martijn Pieters Jul 31 '13 at 18:24
  • @MartijnPieters Noted. – 2rs2ts Jul 31 '13 at 18:27
  • @abarnert Trying `from functools import lru_cache` gave me an ImportError and using `@functools.lru_cache()` gave me an `AttributeError: 'module' object has no attribute lru_cache` so I'm left to believe it's not in my Python 2.7.2 install. – 2rs2ts Jul 31 '13 at 18:29
  • It is not; it was added in Python 3; this answer links you to a backport you can install for 2.7. – Martijn Pieters Jul 31 '13 at 18:34
  • @MartijnPieters Precisely what I was getting at. :P – 2rs2ts Jul 31 '13 at 18:34

Here's a dead-simple caching decorator. It doesn't take any variation in parameters into account; it just returns the same result after the first call. There are fancier ones out there that cache the result for each combination of inputs ("memoization").

import functools

def callonce(func):

    result = []    # empty until the first call; a list so the closure can mutate it

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if not result:    # first call: do the expensive work and remember it
            result.append(func(*args, **kwargs))
        return result[0]  # every call returns the cached value

    return wrapper

Usage:

@callonce
def long_running_function(x, y, z):
    # do something expensive with x, y, and z, producing result
    return result
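
Called repeatedly, the wrapped function does the expensive work only once. An interactive session would look roughly like this (using a hypothetical g with a print to make the single evaluation visible):

>>> @callonce
... def g():
...     print("computing...")
...     return "foo"
... 
>>> [g() for x in range(3)]
computing...
['foo', 'foo', 'foo']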

If you would prefer to write your function as a generator for some reason (perhaps the result is slightly different on each call, but there's still a time-consuming initial setup, or else you just want C-style static variables that allow your function to remember some bit of state from one call to the next), you can use this decorator:

import functools

def gen2func(generator):

    gen = []    # empty until the first call; holds the single generator instance

    @functools.wraps(generator)
    def wrapper(*args, **kwargs):
        if not gen:    # first call: create the generator
            gen.append(generator(*args, **kwargs))
        return next(gen[0])    # every call advances the same generator

    return wrapper

Usage:

@gen2func
def long_running_function_in_generator_form(x, y, z):
    # do something expensive with x, y, and z, producing result
    while True: 
        yield result
        result += 1    # for example
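
For instance, a hypothetical counter written this way remembers its state between calls (arguments passed after the first call are ignored, because the generator is created only once):

>>> @gen2func
... def counter(start):
...     result = start          # the "expensive setup" happens only here
...     while True:
...         yield result
...         result += 1
... 
>>> counter(10), counter(10), counter(10)
(10, 11, 12)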

A Python 2.5 or later version that uses .send() to allow parameters to be passed to each iteration of the generator is as follows (note that **kwargs are not supported):

import functools

def gen2func(generator):

    gen = []

    @functools.wraps(generator)
    def wrapper(*args):
        if not gen:
            gen.append(generator(*args))
            return next(gen[0])
        return gen[0].send(args)

    return wrapper

@gen2func
def function_with_static_vars(a, b, c):
    # time-consuming initial setup goes here
    # also initialize any "static" vars here
    while True:
        # do something with a, b, c, producing result
        a, b, c = yield result    # hand back result, receive the next a, b, c
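
With this send-based version, a hypothetical running total can receive a new value on each call while keeping its state:

>>> @gen2func
... def running_total(x):
...     total = 0               # "static" state kept across calls
...     while True:
...         total += x
...         x, = yield total    # hand back the total, receive the next x
... 
>>> running_total(1), running_total(2), running_total(3)
(1, 3, 6)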
kindall
  • 1
    Is there a reason you make `result` a list? – 2rs2ts Jul 31 '13 at 18:31
  • 2
    So it can be modified by the wrapper function (also, it makes testing to see if the function has already been called simple). You could also use `nonlocal` in Python 3.x. – kindall Jul 31 '13 at 18:37
  • Why does it have to be a list in order to be modified? Why can't it be `result = None`, and later `result = func(*args, **kwargs)`? – 2rs2ts Jul 31 '13 at 18:39
  • 1
    Because in Python before 3.0, functions can't rebind the names of variables defined in outer scopes (except for globals when using the `global` keyword). It is an error. – kindall Jul 31 '13 at 18:40
  • Oh, yeah, I should have known that... Thanks! – 2rs2ts Jul 31 '13 at 18:44
  • This works like a charm, with the added benefit that I don't have to copy-paste the `hasattr` hack and make sure that the name of the function in it is the same as the name of the function using it should I desire to do this for many functions. Tested it with `datetime.datetime.utcnow()` to make sure, and it kept returning the exact same `datetime.datetime` object :) This works pretty well for me because my function doesn't take any arguments. – 2rs2ts Jul 31 '13 at 18:52
  • Thanks, I also added a more general version that lets you write your function as a generator while retaining a function-style caller interface. – kindall Jul 31 '13 at 19:04
  • I've been trying to repurpose this solution to use the `decorator` module but I'm having some trouble. Namely, I've replaced `@functools.wraps(func)` with either `@decorator` or `@decorator(func)` (wasn't really sure what to do...) and I get `IndexError: list index out of range` on line `213` of `decorator`, which is `fun = getfullargspec(callerfunc).args[0]`. If the function takes arguments, I get `TypeError: foo() takes exactly 1 argument (2 given)`. What gives? – 2rs2ts Aug 01 '13 at 21:49
  • 1
    Decorators have to be written differently with `decorator`. Basically you don't write the decorator itself (`decorator` writes that for you) and accept the function being wrapped as the first parameter of the wrapper. This may not really be suitable for use with `decorator` because the `result` variable must be declared in the decorator. – kindall Aug 01 '13 at 22:35

A better option would be to use memoization. You can create a memoize decorator and use it to wrap any function whose results you want to cache. You can find some good implementations here.
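
A minimal sketch of such a memoize decorator, assuming hashable positional arguments (the published recipes handle more cases, such as keyword arguments and unhashable values):

import functools

def memoize(func):
    cache = {}                        # maps argument tuples to computed results

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:         # compute only the first time these args are seen
            cache[args] = func(*args)
        return cache[args]

    return wrapper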

FastTurtle
  • Very awesome. Elaborate for my purposes, but if I require the expressive power I'll end up using a memoization recipe over the function attribute hack. – 2rs2ts Jul 31 '13 at 18:21
  • Re-reading this link I noticed that [this recipe](http://wiki.python.org/moin/PythonDecoratorLibrary#Alternate_memoize_as_nested_functions) is very similar to @kindall's answer. – 2rs2ts Jul 31 '13 at 21:17

You can also leverage Beaker and its cache.

It also has tons of extensions.

DevLounge