Identifying equivalent varargs function calls for memoization

Question

I'm using a variant of the this decorator for memoization:

# note that this decorator ignores **kwargs
def memoize(obj):
    cache = obj.cache = {}

    @functools.wraps(obj)
    def memoizer(*args, **kwargs):
        if args not in cache:
            cache[args] = obj(*args, **kwargs)
        return cache[args]
    return memoizer

I'm wondering, is there a reasonable way to memoize based on both args and kwargs, particularly in cases where two function calls specified with arguments assigned differently positionally and through keyword, but have the exact same arguments?

Bakuriu · Accepted Answer · 2013-01-31T11:10:45.820

If you are using parameters either always as positionals or always as keywords, Thorsten solution works fine. But, if you want to consider equal calls that give to the parameters the same values, indipendently of how the parameters are passed, then you have to do something more complex:

import inspect


def make_key_maker(func):
    args_spec = inspect.getargspec(func)

    def key_maker(*args, **kwargs):
        left_args = args_spec.args[len(args):]
        num_defaults = len(args_spec.defaults or ())
        defaults_names = args_spec.args[-num_defaults:]

        if not set(left_args).symmetric_difference(kwargs).issubset(defaults_names):
            # We got an error in the function call. Let's simply trigger it
            func(*args, **kwargs)

        start = 0
        key = []
        for arg, arg_name in zip(args, args_spec.args):
            key.append(arg)
            if arg_name in defaults_names:
                start += 1

        for left_arg in left_args:
            try:
                key.append(kwargs[left_arg])
            except KeyError:
                key.append(args_spec.defaults[start])

            # Increase index if we used a default, or if the argument was provided
            if left_arg in defaults_names:
                start += 1
        return tuple(key)

    return key_maker

The above functions tries to map keyword arguments(and defaults) to positional and uses the resultant tuple as key. I tested it a bit and it seems to work properly in most cases. It fails when the target function also uses a **kwargs argument.

>>> def my_function(a,b,c,d,e=True,f="something"): pass
... 
>>> key_maker = make_key_maker(my_function)
>>> 
>>> key_maker(1,2,3,4)
(1, 2, 3, 4, True, 'something')
>>> key_maker(1,2,3,4, e=True)               # same as before
(1, 2, 3, 4, True, 'something')
>>> key_maker(1,2,3,4, True)                 # same as before
(1, 2, 3, 4, True, 'something')
>>> key_maker(1,2,3,4, True, f="something")  # same as before
(1, 2, 3, 4, True, 'something')
>>> key_maker(1,2,3,4, True, "something")    # same as before
(1, 2, 3, 4, True, 'something')
>>> key_maker(1,2,3,d=4)                     # same as before
(1, 2, 3, 4, True, 'something')
>>> key_maker(1,2,3,d=4, f="something")      # same as before
(1, 2, 3, 4, True, 'something')

So I suppose the answer is that there's no simple way. Correct me if I'm wrong, but it seems like you've done all the dirty work, and at this point, to handle kwargs, it would work to pop any used kwargs, and then make a final entry in the generated key a tuple of remaining key-value pairs, sorted by argument name? — acjay, Jan 31 '13 at 16:48
@acjohnson55 Uhm it might be easier, but I don't know whether it will always work. Keep in mind you always have to deal with defaults, so you must always do some trick to find out which defaults where used. I don't know whether the resultant code will be much smaller/simpler. — Bakuriu, Jan 31 '13 at 17:18
Alright, well, it seems like for now I'm better off just keeping funsction I want to memoize simple, but I think your approach is the right basis for something that could work in the general case. — acjay, Jan 31 '13 at 19:05

ndpu · Answer 2 · 2013-02-01T08:13:01.717

1

import inspect
def memoize(obj):
    cache = obj.cache = {}
    @functools.wraps(obj)
    def memoizer(*args, **kwargs):
        kwargs.update(dict(zip(inspect.getargspec(obj).args, args)))
        key = tuple(kwargs.get(k, None) for k in inspect.getargspec(obj).args)
        if key not in cache:
            cache[key] = obj(**kwargs)
        return cache[key]
    return memoizer

edited Feb 01 '13 at 08:13

answered Jan 31 '13 at 10:54

ndpu

22,225
6
54
69

1

This does not work. `co_varnames` is the list of *all* variables defined in the function. If you want the names of the parameters you must use `inspect.getargspec` – Bakuriu Jan 31 '13 at 10:57
`>>> def my_func(a,b,c,d): e=5 ... >>> print my_func.__code__.co_varnames ('a', 'b', 'c', 'd', 'e')` Note the `e` at the end which is not a parameter at all! – Bakuriu Jan 31 '13 at 11:11
Yes. `co_varnames` lists the names of all local variables of the function. This includes the parameters but only for extremely simple function it will contain only the parameters. – Bakuriu Jan 31 '13 at 11:58

score 1 · Answer 3 · answered Jan 31 '13 at 11:07

1

In general, it's not possible to infer that two calls have the same parameter meaning. Consider the calls

func(foo=1)
func(1)
func(bar=1)

Which of these (if any) are equivalent depends on whether the positional argument is called foo or bar: if the argument is called foo, then the first call matches the second, etc. However, the positional parameter may also have an entirely different name.

IOW, you need to reflect on the function to be called, which, in turn, may not be possible (e.g. if it is implemented in C, or is itself a wrapper that only processes *args, **kwargs).

If you want to go the reflection route, something like ndpu's response is a good start.

answered Jan 31 '13 at 11:07

Martin v. Löwis

124,830
17
198
235

If the function is a python function it *is* possible to determine how the parameters are parsed. You simply have to use `inspect.getargspec` to obtain all the required informatoin(parameter names, order, defaults values, starred arguments etc.) – Bakuriu Jan 31 '13 at 11:53
Try inspect.getargspec("".split) – Martin v. Löwis Jan 31 '13 at 13:12
Ahem. Please read again the first 7 words of my previous comment, then try `inspect.getargspec("".split)` on a python command line and read the exception message: `TypeError: is not a` **`Python function`** – Bakuriu Jan 31 '13 at 17:14
You are right. I misread your comment as stating that it *is* possible to determine how parameters are parsed for a function. – Martin v. Löwis Jan 31 '13 at 17:26

score 0 · Answer 4 · answered Jan 31 '13 at 09:44

You just have to find a good way to build a key from both args and kwargs. Maybe try this:

import functools
from collections import OrderedDict

# note that this decorator ignores **kwargs
def memoize(obj):
    def make_key(args, kwargs):
        ordered_kwargs = OrderedDict(kwargs)
        parameters = tuple([args, 
                            tuple(ordered_kwargs.keys()), 
                            tuple(ordered_kwargs.values())])
        return parameters
    cache = obj.cache = {}

    @functools.wraps(obj)
    def memoizer(*args, **kwargs):
        key = make_key(args, kwargs)
        if key not in cache:
            cache[key] = obj(*args, **kwargs)
            print "Not using cached result for key %s" % str(key)
        else:
            print "Using cached result for key %s" % str(key)
        return cache[key]
    return memoizer

@memoize
def calculate_sum(*args, **kwargs):
    return sum(args)

calculate_sum(4,7,9,2,flag=0)
calculate_sum(4,7,9,3)
calculate_sum(4,7,9,2,flag=1)
calculate_sum(4,7,9,2,flag=0)

I put some print-statements into memoizer, just to demonstrate that it works. Output is:

Not using cached result for key ((4, 7, 9, 2), ('flag',), (0,))
Not using cached result for key ((4, 7, 9, 3), (), ())
Not using cached result for key ((4, 7, 9, 2), ('flag',), (1,))
Using cached result for key ((4, 7, 9, 2), ('flag',), (0,))

I'm sure I didn't tackle all corner-cases, especially if the values passed-in as kwargs (or even args) are not hashable. But maybe it can serve as a good starting-point.

This may work when you have some parameters accessed *only* as positional arguments, and other parameters accessed *only* as keywords. If you want to consider equals the calls `f(a,b,third_param=c)` and `f(a,b,c)` then you must do something more complex to map the keywords to the correct positional arguments. It's probably required to use `inspect.getargspec` — Bakuriu, Jan 31 '13 at 10:26
By the way why are you using `OrderedDict`? According to python's [documentation](http://docs.python.org/2/library/stdtypes.html#dict.items) calls to `items`/`keys`/`values` etc *without* modifying the `dict` will all have the same order, so doing `zip(kwargs.keys(), kwargs.values())` will always match the keys to the correct values. — Bakuriu, Jan 31 '13 at 11:56
(1) Yes, you'Re right, I didn't tackle this requirement, will think about it for a while (2) What about calling `calculate_sum(4,6,flag=0,x=7)` vs. `calculate_sum(4,6,x=7,flag=0)`? they are equivalent, and should be treated so. — Thorsten Kranz, Jan 31 '13 at 12:06

MRule · Answer 5 · 2016-03-04T20:05:24.247

This solution uses the inspect module to extract parameter names for both positional and keyword arguments. The memoization lookup is then performed on the ordered tuple of name:value pairs. It can tolerate parameters being passed as both positional and keyword arguments. If there are excess positional arguments, these are stored in the order they appear in a separate tuple.

This uses Michele Simionato's decorator package to ensure that function signatures are preserved. Because it inspects the argspec of the function being memoized, it will fail if composed with decorator implementations that do not preserve the argspec.

from decorator import decorator as robust_decorator

def argument_signature(function,*args,**kwargs):
    '''
    Convert the function arguments and values to a unique set. 
    Throws ValueError if the provided arguments cannot match argspec.
    '''
    named_store = {} # map from parameter names to values 
    named,vargname,kwargname,defaults = inspect.getargspec(function)
    available = zip(named,args)
    nargs     = len(available)
    ndefault  = len(defaults) if not defaults is None else 0
    nnamed    = len(named)
    # All positional arguments must be filled
    nmandatory = nnamed - ndefault
    if nargs<nmandatory: raise ValueError('Not enough positional arguments')
    # Assign available positional arguments to names    
    for k,v in available:
        if k in named_store: raise ValueError('Duplicate argument',k)
        named_store[k] = v
    # If not all arguments are provided, check **kwargs and defaults
    ndefaulted   = max(0,nnamed - nargs)
    default_map = dict(zip(named[-ndefault:],defaults)) if ndefault>0 else {}
    if ndefaulted>0:
        for k in named[-ndefaulted:]:
            if k in named_store: raise ValueError('Duplicate argument',k)
            named_store[k] = kwargs[k] if k in kwargs else default_map[k]
            if k in kwargs: del kwargs[k]
    # Store excess positional arguments in *vargs if possible
    vargs = None
    if len(args)>nnamed:
        if vargname is None:
            raise ValueError('Excess positional arguments, but function does not accept *vargs!')
        vargs = args[nnamed:]
    # Store excess keyword arguments if the function accepts **kwargs
    if len(kwargs):
        if kwargname is None:
            raise ValueError("Excess keyword arguments, but function does not accept **kwargs!")
        for k in kwargs:
            if k in named_store: raise ValueError('Duplicate argument',k)
            named_store[k] = kwargs[k]
    # Construct a tuple reflecting argument signature
    keys  = sorted(named_store.keys())
    vals  = tuple(named_store[k] for k in keys)
    named = tuple(zip(keys,vals))
    argument_signature = (named,vargs)
    return argument_signature

def print_signature(sig):
    '''Formats the argument signature for printing.'''
    named, vargs = sig
    result = ','.join(['%s=%s'%(k,v) for (k,v) in named])
    if not vargs is None: result += '; ' + ','.join(map(str,vargs))
    return result

def vararg_memoize(f):
    '''Memoization decorator'''
    cache = {}
    @robust_decorator
    def decorated(f,*args,**kwargs):
        sig = argument_signature(f,*args,**kwargs)
        if not sig in cache:  cache[sig] = f(*args,**kwargs)
        else: print('found cached',f.func_name,print_signature(sig))
        return cache[sig]
    return decorated(f)

if __name__=="__main__":
    print("Running example and testing code")

    def example_function(a,b,c=1,d=('ok',),*vargs,**kw):
        ''' This docstring should be preserved by the decorator '''
        e,f = vargs if (len(vargs)==2) else (None,None)
        g = kw['k'] if 'k' in kw else None
        print(a,b,c,d,e,f,g)

    f = example_function
    g = vararg_memoize(example_function)

    for fn in [f,g]:
        print('Testing',fn.__name__)
        fn('a','b','c','d')
        fn('a','b','c','d','e','f')
        fn('a','b',c='c',d='d')
        fn('a','b',**{'c':'c','d':'d'})
        fn('a','b',*['c','d'])
        fn('a','b',d='d',*['c'])
        fn('a','b',*['c'],**{'d':'d'})
        fn('a','b','c','d','e','f')

Identifying equivalent varargs function calls for memoization

5 Answers5

Linked