23

I have a function called runquery that makes calls to a database and then yields the rows, one by one. I wrote a memoize decorator (or more accurately, I just stole one from this stackoverflow question) but on subsequent calls it just yields an empty sequence, presumably because a generator's values can only be yielded once.

How could I modify the memoization decorator that works for Python generators? I realise I will need to store it in memory at some point but I'd like to handle this within the decorator and not modify the original function.

The current code of the memoization function is:

def memoized(f):
    # Warning: Doesn't work if f yields values
    cache={}
    def ret(*args):
        if args in cache:
            return cache[args]
        else:
            answer=f(*args)
            cache[args]=answer
            return answer
    return ret
Community
  • 1
  • 1
bryn
  • 1,246
  • 1
  • 13
  • 21

4 Answers4

21

I realise this is somewhat of an old question, but for those who want a full solution: here's one, based on jsbueno's suggestion:

from itertools import tee
from types import GeneratorType

Tee = tee([], 1)[0].__class__

def memoized(f):
    cache={}
    def ret(*args):
        if args not in cache:
            cache[args]=f(*args)
        if isinstance(cache[args], (GeneratorType, Tee)):
            # the original can't be used any more,
            # so we need to change the cache as well
            cache[args], r = tee(cache[args])
            return r
        return cache[args]
    return ret
Jasmijn
  • 9,370
  • 2
  • 29
  • 43
  • 1
    Thanks for your illustration! I took ages to understand the way to use `tee`. __But__ I think there is a problem when checking the instance: you should test against `collections.Iterable`, as testing against `types.GeneratorType` works only once: when returning the cached iterator (`iterator.tee` object) at the third call to the function, the cache will return an exhausted iterator. – Joël Jan 14 '14 at 10:34
  • 1
    You are right! However, testing against `collections.Iterable` would be wrong too, since lists and strings etc. are iterables as well. I changed the sets to `(GeneratorType, _tee)`, so it works for tee objects as well. – Jasmijn Jan 14 '14 at 15:42
  • 1
    Yes, it's definitly a bit overkill, but, under Python 2.7, there is no `_tee` object in `itertools`, and `itertools.tee` function returns `itertools.tee` objects. Obviously, this is not their class, so testing against `itertools.tee` makes no sense (and also does not work), hence my proposal with `collections.Iterable`. Under which python version have you tested your code? – Joël Jan 15 '14 at 08:54
  • 1
    Huh, okay. I tested against Python 3.3. I found a solution that works both for 3.3 and 2.7. I'll update my answer right away. – Jasmijn Jan 15 '14 at 20:06
  • Wow, nice :) Thank you for your effort! – Joël Jan 16 '14 at 08:45
10
from itertools import tee

sequence, memoized_sequence = tee (sequence, 2)

Done.

It is easier for generators because the standard lib has this "tee" method!

jsbueno
  • 99,910
  • 10
  • 151
  • 209
  • 4
    Could you please edit your answer to show how to integrate this with the memoize function above? I'd like to type something like @memoize_generator above a function that yielded a sequence. – bryn Jan 03 '11 at 00:20
4

Yes. There's a decorator posted here. Take note that as the poster says, you lose some of the benefit of lazy evaluation.

def memoize(func):
    def inner(arg):
        if isinstance(arg, list):
            # Make arg immutable
            arg = tuple(arg)
        if arg in inner.cache:
            print "Using cache for %s" % repr(arg)
            for i in inner.cache[arg]:
                yield i
        else:
            print "Building new for %s" % repr(arg)
            temp = []
            for i in func(arg):
                temp.append(i)
                yield i
            inner.cache[arg] = temp
    inner.cache = {}
    return inner


@memoize
def gen(x):
    if not x:
        yield 0
        return

    for i in xrange(len(x)):
        for a in gen(x[i + 1:]):
            yield a + x[0]


print "Round 1"
for a in gen([2, 3, 4, 5]):
    print a

print
print "Round 2"
for a in gen([2, 3, 4, 5]):
    print a
moinudin
  • 134,091
  • 45
  • 190
  • 216
  • 1
    +1. It doesn't support interleaving access to the memoized result, though. – Martin v. Löwis Dec 30 '10 at 22:58
  • This is fine, but has room for improvement. Make *args the decorator parameters: much more usefull than one mandatory argument. And add the list to the cache _before_ the "for" - this will make room for using the iterator before the first run is complete. – jsbueno Dec 30 '10 at 23:31
0

Similar to the other answers, but simpler if you know that f is a generator:

def memoized_generator(f):
    cache = {}
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        k = args, frozenset(kwargs.items())
        it = cache[k] if k in cache else f(*args, **kwargs)
        cache[k], result = itertools.tee(it)
        return result
    return wrapper
Mateen Ulhaq
  • 24,552
  • 19
  • 101
  • 135