1

Consider a function f(*x) that takes many arguments *x. Based on these arguments (objects), f composes a rather complex object o and returns it. o implements __call__, so o itself serves as a function. Since composing o is quite time-consuming, and in my scenario there is no point in having multiple instances of o built from the same arguments *x, the results should be cached.

The question is now: how do I efficiently compute a hash from multiple arguments *x? Currently I use a Python dictionary, and I concatenate the str() representations of each x to build each key. It works in my scenario, but it feels rather awkward. I need to call the resulting objects o at a very high frequency, so I suspect the repeated str() calls and string concatenations waste a lot of computation time.

schreon
  • 1,097
  • 11
  • 25
  • What are the types of the arguments in `*x`? – Lukas Graf Jun 09 '14 at 11:03
  • In my case, the arguments have a variety of types. I usually have tuples of integers, some floats and at least one str() representation of a context object coming from another library. – schreon Jun 09 '14 at 11:06
  • Then I would recommend simply calling [`hash()`](https://docs.python.org/2/library/functions.html#hash) on the individual arguments in order to use the same hash Python uses internally to compare dictionary keys for example. – Lukas Graf Jun 09 '14 at 11:10

2 Answers

2

You can use the hash built-in function, combining the hashes of the items in x. The typical way to do this (see e.g. the documentation) is to XOR the hashes of the individual objects:

it is advised to somehow mix together (e.g. using exclusive or) the hash values for the components of the object that also play a part in comparison of objects

To implement that in a functional style, using operator.xor and reduce:

from functools import reduce # only required in Python 3.x
from operator import xor

def hashed(*x):
    return reduce(xor, map(hash, x))

See also this question.
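For example, with mixed argument types like those in the question (values are illustrative), equal argument tuples hash equally; note, though, that XOR is commutative, so permuted arguments collide, which may or may not matter for your use case:

```python
from functools import reduce
from operator import xor

def hashed(*x):
    return reduce(xor, map(hash, x))

# Equal argument tuples produce equal hashes:
assert hashed((1, 2), 3.5, "ctx") == hashed((1, 2), 3.5, "ctx")

# But XOR ignores argument order, so these collide:
assert hashed("a", "b") == hashed("b", "a")
```

If order must matter, hashing the whole argument tuple at once with hash(x) is order-sensitive.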

jonrsharpe
  • 115,751
  • 26
  • 228
  • 437
2

Since version 3.2, Python has included an LRU cache implementation that you can use to cache your function's results based on its arguments: functools.lru_cache

Example:

from functools import lru_cache

@lru_cache(maxsize=32)
def f(*args):
    """Expensive function
    """
    print("f(%s) has been called." % (args, ))
    return sum(args)


print(f(1, 2, 3))
print(f(1, 2, 3, 4))
print(f(1, 2, 3))
print(f.cache_info())

Output:

f((1, 2, 3)) has been called.
6
f((1, 2, 3, 4)) has been called.
10
6
CacheInfo(hits=1, misses=2, maxsize=32, currsize=2)

(Notice how f(1, 2, 3) was only actually called once.)

As suggested in the comments, it's probably best to simply use the hash()es of your arguments to build the cache key - and that's exactly what lru_cache already does for you.
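One caveat worth knowing: since lru_cache builds its key from the arguments, every argument must be hashable; passing an unhashable argument such as a list raises a TypeError (the function names below are hypothetical):

```python
from functools import lru_cache

@lru_cache(maxsize=32)
def g(*args):
    return sum(args)

g(1, 2, 3)        # fine: ints are hashable
try:
    g([1, 2], 3)  # lists are unhashable, so no cache key can be built
except TypeError as exc:
    print("TypeError:", exc)
```

Your tuples, floats, and str() context representation are all hashable, so this should not be a problem in your scenario.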

If you're still on Python 2.7, Raymond Hettinger has published some LRU cache recipes that you could use in your own code.

Lukas Graf
  • 30,317
  • 8
  • 77
  • 92