5

I have written a little class to persistently memoize some expensive functions that do various statistical analyses of random networks.

These are all pure functions; all the data is immutable. However, some of the functions take functions as arguments.

Making keys based on these arguments is a small problem, since in Python function object equality is equivalent to function object identity, which does not persist between sessions, even if the function implementation does not change.

I am hacking around this for the time being by using the function name as a string, but this raises its own swarm of issues when one starts thinking about changing the implementation of the function or anonymous functions and so on. But I am probably not the first to worry about such things.

Does anybody have any strategies for persistently memoizing functions with function arguments in Python?

mvanveen
Gabriel Mitchell
  • possible duplicate of [Persistent memoization in Python](http://stackoverflow.com/questions/9320463/persistent-memoization-in-python) – Dana the Sane Feb 19 '12 at 22:59
  • @DanatheSane that is not a duplicate --- it does not discuss persistent memoization between invocations where functions are arguments to the memoized function. – tobyodavies Feb 19 '12 at 23:15

2 Answers

3

One option would be to use `marshal.dumps(function.func_code)` (in Python 3 the attribute is spelled `function.__code__`).

It'll produce a byte-string representation of the function's compiled code. Because the key is derived from the code rather than the name, that should handle changing implementations and anonymous functions.
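A minimal sketch of that idea, assuming Python 3 (so `__code__` rather than `func_code`). The helper name `code_key` and the use of SHA-256 to shorten the marshalled bytes are assumptions for illustration, not part of the answer. Note that the marshal format is specific to the Python version, so keys built this way are only expected to be comparable between sessions running the same interpreter version.

```python
import hashlib
import marshal


def code_key(func):
    """Return a hex digest derived from the function's compiled code object."""
    # marshal can serialize code objects; the result is a bytes value.
    payload = marshal.dumps(func.__code__)
    # Hashing is optional; it just keeps the key short and fixed-length.
    return hashlib.sha256(payload).hexdigest()


def double(x):
    return 2 * x


def triple(x):
    return 3 * x


# Different implementations produce different keys, including for lambdas.
key_double = code_key(double)
key_triple = code_key(triple)
key_lambda = code_key(lambda x: x + 1)
```

The resulting string can be combined with the keys of the other (immutable) arguments to form the cache key for the memoized call.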

D R
Winston Ewert
2

Have a look at using this as the identity of the function:

[getattr(func.__code__,s) 
 for s in ['co_argcount', 'co_cellvars', 'co_code', 'co_consts', 
           'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars',
           'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize',
           'co_varnames']
]

That should correctly handle changing the implementation in any way.
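As a rough sketch of how the attribute list above could be turned into a persistent key: the helper names `code_identity` and `function_key` are hypothetical, and two details are assumptions added here. First, `co_lnotab` is left out because newer Python versions deprecate it. Second, nested code objects found in `co_consts` are expanded recursively, since their `repr` includes a memory address that would differ between sessions. The sketch also assumes the remaining constants have a stable `repr`.

```python
import hashlib
import types

# Attributes taken from the list above, minus co_lnotab (deprecated in
# newer Python versions).
FIELDS = ('co_argcount', 'co_cellvars', 'co_code', 'co_consts',
          'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars',
          'co_name', 'co_names', 'co_nlocals', 'co_stacksize',
          'co_varnames')


def code_identity(code):
    """Build a nested tuple of plain values describing a code object."""
    parts = []
    for name in FIELDS:
        value = getattr(code, name)
        if name == 'co_consts':
            # Nested functions and lambdas show up here as code objects;
            # expand them so no memory address leaks into the key.
            value = tuple(
                code_identity(c) if isinstance(c, types.CodeType) else c
                for c in value
            )
        parts.append((name, value))
    return tuple(parts)


def function_key(func):
    """Hash the identity tuple into a fixed-length string cache key."""
    raw = repr(code_identity(func.__code__)).encode('utf-8')
    return hashlib.sha256(raw).hexdigest()


def mean(xs):
    return sum(xs) / len(xs)


def total(xs):
    return sum(xs)
```

Because `co_filename` and `co_firstlineno` are part of the key, moving a function within its file or to another file also invalidates cached results; dropping those two fields from `FIELDS` would be a reasonable variation if that is too strict.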

tobyodavies
  • Thanks. I think something like this recursing over memoized functions in 'co_names' should do it. I was happily surprised that the 'co_code' was the same across a number of different platforms and Python versions in my initial testing. I guess I need to read up on the Python virtual machine. – Gabriel Mitchell Feb 20 '12 at 01:26
  • Would something similar work for general callable objects? While avoiding impurities such as global names is fine, prohibiting the use of any callable other than a function would be undesirable if it can be helped. – max Sep 17 '12 at 22:50
  • @max no, it won't. It would be impossible to do this generally as what constitutes a change in the implementation of a general callable is defined by the class of that callable. The closest you could hope is that the class is picklable or marshalable such that the representations are not identical between implementation changes. Alternatively you could have your own protocol to determine the version of a callable's implementation. – tobyodavies Sep 17 '12 at 23:04
  • Well, we cannot redefine the "useless" `__hash__` and `__eq__` methods of the built-in `function` objects, so the memoizer has to recursively look at them to figure out if they changed. But for callable classes, we have more flexibility. So I was thinking: what if we demand that they all define "meaningful" `__hash__` / `__eq__` methods, so that the memoizer can depend on them to validate the cache? Would this approach work? Of course, it puts all the burden on callable class implementers, but perhaps in some cases implementers might find it reasonably easy? – max Sep 18 '12 at 09:11
  • @max it would not necessarily place any burden on function implementations --- https://gist.github.com/3754066 – tobyodavies Sep 20 '12 at 05:05
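The "own protocol" idea from the comments above could look roughly like the following. This is only an illustration and is not taken from the linked gist: the method name `memo_key`, the helper `callable_key`, and the example class `Scale` are all hypothetical. Callables that implement the method report their own version string, plain functions fall back to a simplified code-based key, and anything else is rejected.

```python
import hashlib
import types


def callable_key(obj):
    """Return a persistent key for a function or a cooperating callable."""
    if hasattr(obj, 'memo_key'):
        # Hypothetical protocol: the class describes its own version.
        return 'custom:' + str(obj.memo_key())
    if isinstance(obj, types.FunctionType):
        code = obj.__code__
        # Simplified: does not expand nested code objects in co_consts.
        raw = repr((code.co_code, code.co_consts, code.co_names))
        return 'func:' + hashlib.sha256(raw.encode('utf-8')).hexdigest()
    raise TypeError('no persistent key for %r' % (obj,))


class Scale:
    """Example callable whose key covers its parameter and a version tag."""
    VERSION = 1  # bump by hand when __call__ changes behaviour

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        return self.factor * x

    def memo_key(self):
        return 'Scale-v%d-%r' % (self.VERSION, self.factor)
```

With this arrangement the memoizer never relies on `__hash__` or `__eq__` of the callable; the burden on class authors is limited to one method and a version number they bump by hand.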