
I want to cache a function that takes a list as a parameter, but when I try to do so with the functools.lru_cache decorator, it fails with TypeError: unhashable type: 'list'.


import functools

@functools.lru_cache()
def example_func(lst):
    return sum(lst) + max(lst) + min(lst)


print(example_func([1, 2]))
redfast00
    Possible duplicate of [Hashing arrays in Python](https://stackoverflow.com/questions/7027199/hashing-arrays-in-python) – Alex Mar 10 '18 at 15:57
    @Alex just putting this here because googling this ("lrucache python list") didn't find a lot. I then made a custom class with a custom hash function. I later asked this to a professional Python dev, and he suggested using a tuple. I do think these two questions are related, but not duplicates. – redfast00 Mar 10 '18 at 20:12

5 Answers


This fails because a list is unhashable. This would make it hard for Python to know what values are cached. A way to fix this is by converting lists to tuples before passing them to a cached function: since tuples are immutable and hashable, they can be cached.

TL;DR

Use a tuple instead of a list:

>>> @lru_cache(maxsize=2)
... def my_function(args):
...     pass
...
>>> my_function([1,2,3])
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    my_function([1,2,3])
TypeError: unhashable type: 'list'

>>> # TO FIX: use a tuple 

>>> my_function(tuple([1,2,3]))
>>>
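Applied to the example_func from the question, a minimal sketch: convert the list to a tuple at the call site and the cache works as expected.

```python
import functools

@functools.lru_cache(maxsize=None)
def example_func(numbers):
    # numbers must be hashable (e.g. a tuple) so lru_cache can key on it
    return sum(numbers) + max(numbers) + min(numbers)

print(example_func(tuple([1, 2])))  # 6
```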
Aziz Alto
    If using mypy and the list could be any length, you'll have to annotate the type as a homogeneous variable-length tuple: `Tuple[int, ...]`. See https://stackoverflow.com/a/54747327/733092. – Noumenon Jul 01 '21 at 17:13

Rather than throwing an error, the decorator should convert the arguments into a hashable form transparently, without the user even knowing it. You can fix this problem by decorating your functions like this:

# Custom decorator: convert list arguments to tuples so they can be cached
def listToTuple(function):
    def wrapper(*args):
        args = [tuple(x) if isinstance(x, list) else x for x in args]
        result = function(*args)
        result = tuple(result) if isinstance(result, list) else result
        return result
    return wrapper

#your cached function
@listToTuple
@lru_cache(maxsize=cacheMaxSize)
def checkIfAdminAcquired(self, adminId) -> list:
    query = "SELECT id FROM public.admins WHERE id IN ({}) and 
    confirmed_at IS NOT NULL"
    response = self.handleQuery(query, "int", adminId)
    return response

You might want to add yet another decorator after lru_cache to convert the output back from a tuple to a list, since right now the function returns a tuple.
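As a self-contained sketch of the same decorator stack (using a plain function in place of the database method above, so the effect is easy to see):

```python
from functools import lru_cache

def listToTuple(function):
    def wrapper(*args):
        # convert list arguments to tuples before they reach lru_cache
        args = [tuple(x) if isinstance(x, list) else x for x in args]
        result = function(*args)
        # a list result is also converted, so it stays hashable
        return tuple(result) if isinstance(result, list) else result
    return wrapper

@listToTuple
@lru_cache(maxsize=None)
def doubled(values):
    # receives a tuple even when the caller passes a list
    return [v * 2 for v in values]

print(doubled([1, 2, 3]))  # (2, 4, 6) -- note the result comes back as a tuple
```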

Donatas Svilpa
    Isn't it sufficient to convert `args` to a tuple? What is the purpose of also converting `result`? – dwitvliet Nov 26 '20 at 22:29
  • I guess because the cache stores the result as well, so if your result is also not hashable, better make it hashable too... – ahmed_khan_89 Apr 15 '22 at 15:56
  • To extend this for named arguments, add this `kwargs = {k: tuple(x) if type(x) == list else x for k, x in kwargs.items()}` to the function and modify the result assignment to `result = function(*args, **kwargs)` – Chris Mungall Jun 03 '23 at 22:05

Sometimes a parameter can take either a simple hashable type, or a complicated unhashable type without a straightforward hashable conversion like the ones the current answers propose. In this situation it may still be desirable to use the cache for the (possibly more common) hashable case, while in the unhashable case simply calling the underlying function rather than erroring out.

This ignores the error and works generally for any hashable type:

import functools

def ignore_unhashable(func): 
    uncached = func.__wrapped__
    attributes = functools.WRAPPER_ASSIGNMENTS + ('cache_info', 'cache_clear')
    @functools.wraps(func, assigned=attributes) 
    def wrapper(*args, **kwargs): 
        try: 
            return func(*args, **kwargs) 
        except TypeError as error: 
            if 'unhashable type' in str(error): 
                return uncached(*args, **kwargs) 
            raise 
    wrapper.__uncached__ = uncached
    return wrapper

Usage and testing:

@ignore_unhashable
@functools.lru_cache()
def example_func(lst):
    return sum(lst) + max(lst) + min(lst)

example_func([1, 2]) # 6
example_func.cache_info()
# CacheInfo(hits=0, misses=0, maxsize=128, currsize=0)
example_func((1, 2)) # 6
example_func.cache_info()
# CacheInfo(hits=0, misses=1, maxsize=128, currsize=1)
example_func((1, 2)) # 6
example_func.cache_info()
# CacheInfo(hits=1, misses=1, maxsize=128, currsize=1)

Took me a moment to wrap my head around it, but example_func.__wrapped__ is the lru_cache-wrapped version and example_func.__uncached__ is the original function.

user2561747

You can extend functools.lru_cache to digest lists, dicts, and more. The key idea is passing a hashed value of the arguments to lru_cache, not the raw arguments. Below is an example implementation that hashes lists and dicts in arguments.

from functools import lru_cache

def hash_list(l: list) -> int:
    __hash = 0
    for i, e in enumerate(l):
        __hash = hash((__hash, i, hash_item(e)))
    return __hash

def hash_dict(d: dict) -> int:
    __hash = 0
    for k, v in d.items():
        __hash = hash((__hash, k, hash_item(v)))
    return __hash

def hash_item(e) -> int:
    if hasattr(e, '__hash__') and callable(e.__hash__):
        try:
            return hash(e)
        except TypeError:
            pass
    if isinstance(e, (list, set, tuple)):
        return hash_list(list(e))
    elif isinstance(e, (dict)):
        return hash_dict(e)
    else:
        raise TypeError(f'unhashable type: {e.__class__}')

def my_lru_cache(*opts, **kwopts):
    def decorator(func):
        # The cached function must be created once, outside wrapper;
        # otherwise a fresh (empty) cache would be built on every call
        # and nothing would ever be cached.
        pending = {}

        @lru_cache(*opts, **kwopts)
        def cached_func(args_hash):
            args, kwargs = pending[args_hash]
            return func(*args, **kwargs)

        def wrapper(*args, **kwargs):
            args_hash = hash_item([id(func)] + list(args) + list(kwargs.items()))
            pending[args_hash] = (args, kwargs)
            try:
                return cached_func(args_hash)
            finally:
                del pending[args_hash]
        return wrapper
    return decorator

Using my_lru_cache is exactly the same as using the original lru_cache.

@my_lru_cache(maxsize=None)
def example_func(lst):
    return sum(lst) + max(lst) + min(lst)


print(example_func([1, 2]))
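The hashing idea can be seen in isolation: the helper folds each element's index and hash into a running value, so equal lists map to the same cache key (a simplified sketch of hash_list, without the hash_item dispatch):

```python
def hash_list(l):
    # fold (running hash, position, element hash) into one value
    h = 0
    for i, e in enumerate(l):
        h = hash((h, i, hash(e)))
    return h

print(hash_list([1, 2, 3]) == hash_list([1, 2, 3]))  # True: equal lists, same key
print(hash_list([1, 2, 3]) == hash_list([3, 2, 1]))  # order matters
```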
D Kim

If you do not need LRU eviction and all arguments are passed by reference (the cache is keyed on object identity, not value), you may use this simple implementation.

import time


def cacheRef(f):
    cache = {}

    def g(*args):
        # use `id` to get memory address for function argument.
        cache_key = '-'.join(list(map(lambda e: str(id(e)), args)))
        if cache_key in cache:
            return cache[cache_key]
        v = f(*args)
        cache[cache_key] = v
        return v

    return g


@cacheRef
def someHeavyWork(p1):
    time.sleep(3)
    return ''.join(p1)


l1 = ['a', 'b', 'c']
l2 = ['d', 'e', 'f']

t0 = time.time()
print(int(time.time() - t0), someHeavyWork(l1))
print(int(time.time() - t0), someHeavyWork(l1))
print(int(time.time() - t0), someHeavyWork(l1))
print(int(time.time() - t0), someHeavyWork(l2))
print(int(time.time() - t0), someHeavyWork(l2))
print(int(time.time() - t0), someHeavyWork(l2))

'''
output:
0 abc
3 abc
3 abc
3 def
6 def
6 def
'''


Yin
  • This does not use functools.lru_cache, so for example setting a maxsize does not work. Also, `seq` does not seem to be a built-in in Python 3, where do you get that function from? – redfast00 Nov 13 '21 at 19:55
  • @redfast00 thanks for advice. my code only works for some situations, I added the situations. – Yin Nov 16 '21 at 08:22