
I was going through the implementation details of Python's LRU cache decorator and noticed some behavior that I found a bit surprising. When a function is decorated with staticmethod or classmethod on top of lru_cache, the cache appears to disregard the maxsize limit. Consider this example:

# src.py

import time
from functools import lru_cache


class Foo:
    @staticmethod
    @lru_cache(3)
    def bar(x):
        time.sleep(3)
        return x + 5


def main():
    foo = Foo()
    print(foo.bar(10))
    print(foo.bar(10))
    print(foo.bar(10))

    foo1 = Foo()
    print(foo1.bar(10))
    print(foo1.bar(10))
    print(foo1.bar(10))

if __name__ == "__main__":
    main()

From the implementation, it's clear to me that using the LRU cache decorator in this way will create a shared cache for all the instances of class Foo. However, when I run the code, it waits for 3 seconds in the beginning and then prints out 15 six times without pausing in the middle.

$ python src.py 
# Waits for three seconds and then prints out 15 six times
15
15
15
15
15
15

I was expecting it to:

  • Wait for 3 seconds.
  • Then print 15 three times.
  • Then wait again for 3 seconds.
  • And finally, print 15 three times.

Running the above code with an instance method instead behaves exactly as described in the bullet points.
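For comparison, here is a minimal sketch of the instance-method case (the class name Baz and the cache stats in the comments are mine):

```python
import time
from functools import lru_cache


class Baz:
    @lru_cache(3)
    def bar(self, x):
        time.sleep(3)
        return x + 5


def main():
    baz = Baz()
    print(baz.bar(10))   # waits ~3 s: cache miss
    print(baz.bar(10))   # instant: cache hit

    baz1 = Baz()
    print(baz1.bar(10))  # waits ~3 s again: baz1 is part of a new cache key

    print(Baz.bar.cache_info())  # CacheInfo(hits=1, misses=2, maxsize=3, currsize=2)


if __name__ == "__main__":
    main()
```

Note that the cache itself is still shared (it lives on the function object), but each instance contributes a distinct key component.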

Inspecting the foo.bar method with cache_info gives me the following result:

print(f"{foo.bar.cache_info()=}")
print(f"{foo1.bar.cache_info()=}")

# Output:
foo.bar.cache_info()=CacheInfo(hits=5, misses=1, maxsize=3, currsize=1)
foo1.bar.cache_info()=CacheInfo(hits=5, misses=1, maxsize=3, currsize=1)

How is this possible? The cache info named tuple is the same for both the foo and foo1 instances (which is expected), but how come the LRU cache behaves as if it were applied as lru_cache(None)(func)? Is this because of descriptor intervention or something else? Why is it disregarding the cache limit? And why does running the code with an instance method work as described above?

Edit: As Klaus mentioned in the comments, this is caching 3 keys, not 3 accesses. For a key to be evicted, the method needs to be called at least 4 times with different arguments. That explains why it quickly prints 15 six times without pausing: every repeated call with the same argument is a cache hit. So it's not actually disregarding the maxsize limit.
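The eviction behavior can be sketched as follows (the function and the specific call sequence are my own):

```python
from functools import lru_cache


@lru_cache(3)
def add_five(x):
    return x + 5


# Three distinct keys fill the cache; the repeated calls are all hits.
for x in (1, 2, 3, 1, 2, 3):
    add_five(x)
print(add_five.cache_info())  # CacheInfo(hits=3, misses=3, maxsize=3, currsize=3)

# A fourth distinct key evicts the least recently used entry (key 1).
add_five(4)
add_five(1)  # miss again: key 1 was evicted
print(add_five.cache_info())  # CacheInfo(hits=3, misses=5, maxsize=3, currsize=3)
```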

Also, in the case of an instance method, lru_cache includes the self argument when hashing and building the key for each call in the cache dictionary. So each instance produces a different cache key for the same argument, due to the inclusion of self in the hash calculation. For static methods there is no self argument, and for class methods cls refers to the same class object across all instances. This explains the differences in their behavior.
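A small sketch showing that @classmethod behaves like @staticmethod here, because cls is the same object for every instance (class name Qux is mine):

```python
import time
from functools import lru_cache


class Qux:
    @classmethod
    @lru_cache(3)
    def bar(cls, x):
        time.sleep(3)
        return x + 5


q0, q1 = Qux(), Qux()
q0.bar(10)  # waits ~3 s: miss
q1.bar(10)  # instant: cls is Qux for both instances, so the key is identical
print(Qux.bar.cache_info())  # CacheInfo(hits=1, misses=1, maxsize=3, currsize=1)
```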

Joe
Redowan Delowar
  • Like this you are caching 3 keys, not 3 accesses. For a key to be dropped you need at least 4 *different* calls. – Klaus D. Dec 19 '21 at 07:46
  • Hmm..in that case, why does it act differently on an instance method? In the case of an instance method, running the code above waits for 3 seconds, then prints 15 thrice, then waits again for 3 seconds, and finally prints 15 thrice. I still find this a tad bit weird. – Redowan Delowar Dec 19 '21 at 07:52
  • 1
    The instance is part of the call. New instance, new cache key. – Klaus D. Dec 19 '21 at 08:03
  • I see. So basically it uses the `self` argument in the cache's hashed key, right? Do you mind adding that as an answer that I can accept? Thanks a bunch! – Redowan Delowar Dec 19 '21 at 08:05

1 Answer


As you said in your edit, applying @staticmethod or @classmethod makes the cache shared among all instances.

import time
from functools import lru_cache


class Foo:
    @staticmethod
    @lru_cache(3)
    def foo(x):
        time.sleep(1)
        return x + 5


class Bar:
    @lru_cache(3)
    def bar(self, x):
        time.sleep(1)
        return x + 5

def main():
    # filling the shared cache takes around 3 seconds
    foo0 = Foo()
    print(foo0.foo(10), foo0.foo(11), foo0.foo(12))

    # reading from the shared cache takes very little time
    foo1 = Foo()
    print(foo1.foo(10), foo1.foo(11), foo1.foo(12))

    # filling the cache for instance 0 takes around 3 seconds
    bar0 = Bar()
    print(bar0.bar(10), bar0.bar(11), bar0.bar(12))

    # filling the cache for instance 1 takes around 3 seconds again
    bar1 = Bar()
    print(bar1.bar(10), bar1.bar(11), bar1.bar(12))

if __name__ == "__main__":
    main()
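You can confirm the difference by inspecting the cache stats after the same call pattern; this sketch repeats the two classes above (with a shorter sleep) and checks cache_info:

```python
import time
from functools import lru_cache


class Foo:
    @staticmethod
    @lru_cache(3)
    def foo(x):
        time.sleep(1)
        return x + 5


class Bar:
    @lru_cache(3)
    def bar(self, x):
        time.sleep(1)
        return x + 5


for obj in (Foo(), Foo()):
    obj.foo(10), obj.foo(11), obj.foo(12)
for obj in (Bar(), Bar()):
    obj.bar(10), obj.bar(11), obj.bar(12)

# Shared cache: the second Foo instance reuses all three cached keys.
print(Foo.foo.cache_info())  # CacheInfo(hits=3, misses=3, maxsize=3, currsize=3)
# Per-instance keys: every (instance, arg) pair is distinct, so all 6 calls miss.
print(Bar.bar.cache_info())  # CacheInfo(hits=0, misses=6, maxsize=3, currsize=3)
```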
ljmc