
A Python function object has an attribute dictionary, func_dict, which is visible from outside the function, is mutable, and is not modified merely by calling the function. (I learned this from answers to a question I asked yesterday (#1753232): thanks!) I was reading code (at http://pythonprogramming.jottit.com/functional_programming) that memoized the computation of Fibonacci numbers, and thought, "Why not use the func_dict attribute for memoizing?" It worked (see below; the output is at the end of the code). It's a little like having a class attribute available, but with the initialization code outside the object (in this case a function, not a class).

I wonder what similar (or dissimilar) tricks can be done using this attribute?

def fib(n):
    if n in fib.cache:
        print "found fib.cache[%d] = %d: " %(n, fib.cache[n])
        return fib.cache[n]
    else:
        print "fib.cache[%d] = fib(%d) + fib(%d)" % (n, n-1, n-2)
        fib.cache[n] = fib(n-1) + fib(n-2)
        print "modified fib.cache: ", fib.cache
        return fib.cache[n]

fib.cache = {0:0, 1:1}

for x in range(7):
    print "==================>", x
    print fib(x)

"""
==================> 0
found fib.cache[0] = 0: 
0
==================> 1
found fib.cache[1] = 1: 
1
==================> 2
fib.cache[2] = fib(1) + fib(0)
found fib.cache[1] = 1: 
found fib.cache[0] = 0: 
modified fib.cache:  {0: 0, 1: 1, 2: 1}
1
==================> 3
fib.cache[3] = fib(2) + fib(1)
found fib.cache[2] = 1: 
found fib.cache[1] = 1: 
modified fib.cache:  {0: 0, 1: 1, 2: 1, 3: 2}
2
==================> 4
fib.cache[4] = fib(3) + fib(2)
found fib.cache[3] = 2: 
found fib.cache[2] = 1: 
modified fib.cache:  {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}
3
==================> 5
fib.cache[5] = fib(4) + fib(3)
found fib.cache[4] = 3: 
found fib.cache[3] = 2: 
modified fib.cache:  {0: 0, 1: 1, 2: 1, 3: 2, 4: 3, 5: 5}
5
==================> 6
fib.cache[6] = fib(5) + fib(4)
found fib.cache[5] = 5: 
found fib.cache[4] = 3: 
modified fib.cache:  {0: 0, 1: 1, 2: 1, 3: 2, 4: 3, 5: 5, 6: 8}
8
"""
behindthefall
  • @behindthefall: Since there is no actual problem, the fib cache output is -- still -- just as worthless as the execution trace. What is the point? What do you want to know? Please focus very, very finely on some specific thing. This is not a blog. What is your question? – S.Lott Nov 19 '09 at 03:27
  • @S.Lott - Sometimes a student doesn't know the question s/he's asking, so it is hard to put a very, very fine point on it. Alex Martelli, in the comments, sees the questions that I did not know enough to ask, and his response has been enlightening and has raised more questions in this student's mind, although, again, they do not have very, very fine points as yet. My original question, clearly stated, was, "Does anybody find the function directory useful (and here's an example, memoizing)?" And the output shows that it worked. – behindthefall Nov 19 '09 at 03:52

2 Answers


Just be careful: the fib.cache trick only works if fib is indeed the name of the relevant function object in the scope that's active while it's executing (for example, if you decorate it, you must assign the starting value for the cache to the decorator wrapper, not to the decorated function -- and if it gets further decorated after that, things break).

This is a bit fragile compared to the standard memoization idiom:

def fib(n, _memo={0:1, 1:1}):
    if n in _memo:
        return _memo[n]
    else:
        _memo[n] = fib(n-1) + fib(n-2)
        return _memo[n]

or the decorator equivalent. The standard idiom's also faster (though not by much) -- putting them both in mem.py under names fib1 (the .cache-trick one, without prints and undecorated) and fib2 (my version), we see...:

$ python -mtimeit -s'import mem' 'mem.fib1(20)'
1000000 loops, best of 3: 0.754 usec per loop
$ python -mtimeit -s'import mem' 'mem.fib2(20)'
1000000 loops, best of 3: 0.507 usec per loop

so the standard-idiom version saves about 33% of the time -- but that's when almost all calls actually hit the memoization cache (which is populated after the first of those million loops). fib2's speed advantage is smaller on cache misses, since it comes from the faster access to _memo (a local variable) versus fib.cache (a lookup of the global name fib, then of its attribute cache). Cache accesses dominate on cache hits (there's nothing else;-), while on cache misses there's a little extra work, equal for both functions.
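For reference, the "decorator equivalent" mentioned above might look like this minimal sketch (the `memoize` name and closure-based cache are illustrative, not the exact recipe from any particular source):

```python
import functools

def memoize(func):
    # The cache lives in the closure, so it does not depend on the
    # function's global name the way fib.cache does.
    cache = {}
    @functools.wraps(func)
    def wrapper(n):
        if n not in cache:
            cache[n] = func(n)
        return cache[n]
    return wrapper

@memoize
def fib(n):
    # The recursive calls go through the wrapper, because the global
    # name fib now refers to it -- so subresults get cached too.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
```

Note that this sketch uses the fib(0)=0, fib(1)=1 base case from the question, not the {0:1, 1:1} seeding above.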

Anyway, don't mean to rain on your parade, but when you find some new cool idea be sure to measure it against the "good old way" of doing things, both in terms of "robustness" and performance (if you're caching, presumably you care about performance;-).

Alex Martelli
  • Hi, Alex. No rain at all. First, I was happy to see I could get it to work at all. Second, I was wondering how to time it, and what to compare it against, not knowing the standard form. So, as usual, you've added lots to my understanding. There's something I don't understand, I see. I thought that the parameters were local variables that had no 'memory' from previous calls, yet here you've got _memo being modified in one call and then being used as modified in subsequent calls. How does that work? – behindthefall Nov 19 '09 at 03:24
  • `_memo` isn't a local variable - it's an optional parameter. If you observe the function `def t(n, o=[]): o.append(n); return o` you'll see that `t(1), t(2), t(3)` returns `[1], [1, 2], [1, 2, 3]`. This is because the assignment of optional parameters is done once, at the function's definition, and because `[]` is a list (and is mutable). If you don't want this to happen, you'd have to do `def t(n, o=None): if o == None: o = []; o.append(n); return o` which would produce the expected `[1], [2], [3]` output in the above test. However, this unintuitive behavior can be utilized for memoization. – Chris Lutz Nov 19 '09 at 03:55
  • @Chris, arguments **are** local variables -- they live in the local variable namespace, both logically and physically. Default values evaluation at `def`-time (which is unintuitive only to raw Python beginners;-) is useful for many purposes, of which memoization is just one -- forcing early (def-time) binding (while free-variables "of course" are late bound) being the most frequently used one. – Alex Martelli Nov 19 '09 at 04:04
  • @Chris Lutz - Holy Cow! Unintuitive behavior of optional parameters: I didn't see that one coming! Thank you VERY much for your explanation. I like it even more than the function dictionary. (What else does Python have hidden under the hood?) – behindthefall Nov 19 '09 at 04:09
  • Errr, I didn't know I was a 'raw' Python beginner ... rare tending to medium, maybe? No, seriously, this is stuff that hasn't jumped out at me from the books. – behindthefall Nov 19 '09 at 04:10
  • behindthefall, you may enjoy browsing this thread: http://stackoverflow.com/questions/101268/hidden-features-of-python – unutbu Nov 19 '09 at 04:21
  • @~unutbu - That's practically a book! Does it have a TOC? ;-) – behindthefall Nov 19 '09 at 04:37
  • Memoization is the kindest way to learn about how optional arguments are initialized :) – wisty Nov 19 '09 at 06:01
  • @behindthefall: the thing about when default function values are evaluated is in the Python tutorial: http://docs.python.org/tutorial/controlflow.html#defining-functions . Perhaps the books don't cover it because it's in the tutorial. – Kylotan Nov 19 '09 at 10:27
  • @Kylotan - Thanks for that tip. I have to admit that I haven't gone through the tutorial conscientiously, because I had all these Python books. (But it's probably in there, too, and I just didn't have a mental slot to fit it into at the time ...) Live and learn ... – behindthefall Nov 19 '09 at 15:27
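The def-time binding of default values that the comments discuss is also the standard trick for capturing loop variables in closures; a minimal sketch (variable names are illustrative):

```python
# Free variables are late bound: every lambda sees the final value of i.
late = [lambda: i for i in range(3)]

# A default value is evaluated once, at def-time: i=i captures each value.
early = [lambda i=i: i for i in range(3)]

late_results = [f() for f in late]    # every element is 2
early_results = [f() for f in early]  # 0, 1, 2
```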

I've always used this technique for memoizing. It relies on the fact that you can call an object that isn't a function, as long as the object defines a `__call__` method. Neither behindthefall's method nor Alex Martelli's had even occurreded to me, for whatever reason. I would guess the performance is roughly the same as behindthefall's way, since it works in roughly the same manner; maybe a little slower. But as the linked page points out, "the definition for fib() is now the "obvious" one, without the cacheing code obscuring the algorithm", which is kind of nice.
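A minimal sketch of such a callable memoizer (the class name is illustrative; the linked page's exact code may differ):

```python
class Memoize(object):
    """Wraps a function in a callable object that caches its results."""
    def __init__(self, func):
        self.func = func
        self.cache = {}

    def __call__(self, n):
        if n not in self.cache:
            self.cache[n] = self.func(n)
        return self.cache[n]

@Memoize
def fib(n):
    # The "obvious" definition -- the caching lives in the class,
    # not in the algorithm.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
```

After `@Memoize`, the name fib is bound to a Memoize instance, so the recursive calls hit the cache just as in the function-attribute version.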

Tyler