10

I have a function that returns a list, say list_x.

def result(val):
    ...
    return list_x

I am calling result() every minute and storing the list.

def other_func():
    # called every minute
    new_list = result(val)

I would like to store the value of new_list for an hour (in some sort of in-memory cache, maybe?) and then update it again, basically calling result() after an hour and not every minute. I read about functools.lru_cache, but I don't think that will help here. Any ideas?

Asclepius
user2715898
  • If you're on a Unix-based system, I would recommend using `cron` to handle this. – user3483203 Jun 14 '18 at 22:45
  • What's the reason for not liking `lru_cache`? It's oriented around the number of calls rather than time duration, but that's not a huge difference. If nothing else, you could look at the implementation of `lru_cache` for ideas on how to write your own. – BowlingHawk95 Jun 14 '18 at 22:45
  • not on unix sadly @user – user2715898 Jun 14 '18 at 22:47
  • @BowlingHawk95 looking for ideas if there is already something implemented along these lines rather than writing my own stuff – user2715898 Jun 14 '18 at 22:48
  • What have you tried? Can you cache it forever but don't know how to add a time limit? Start by caching it forever. – Joran Beasley Jun 14 '18 at 22:48
  • @user2715898 Google "python memoization decorator"; hopefully you can take one of the existing solutions, understand it... and figure out how you might add a time limit. – Joran Beasley Jun 14 '18 at 22:49
  • Possible duplicate of [Python in-memory cache with time to live](https://stackoverflow.com/questions/31771286/python-in-memory-cache-with-time-to-live) – Louis Aug 30 '19 at 15:43

6 Answers

25

The `ttl_cache` decorator in `cachetools==3.1.0` works a lot like `functools.lru_cache`, but with a time to live.

import cachetools.func

@cachetools.func.ttl_cache(maxsize=128, ttl=10 * 60)  # entries expire ten minutes after insertion
def example_function(key):
    return get_expensively_computed_value(key)


class ExampleClass:
    EXP = 2

    @classmethod
    @cachetools.func.ttl_cache()
    def example_classmethod(cls, i):
        return i * cls.EXP

    @staticmethod
    @cachetools.func.ttl_cache()
    def example_staticmethod(i):
        return i * 3
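
Applied to the question's setup, a minimal sketch; expensive_computation is a hypothetical stand-in for whatever result() actually does, and the hour-long TTL matches what was asked:

import cachetools.func

@cachetools.func.ttl_cache(ttl=60 * 60)  # recompute at most once per hour per argument
def result(val):
    list_x = expensive_computation(val)  # stand-in for the real work
    return list_x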
Asclepius
11

Building a single-element cache with a time-to-live is pretty trivial:

import datetime

_last_result_time = None
_last_result_value = None

def result(val):
    global _last_result_time
    global _last_result_value
    now = datetime.datetime.now()
    if not _last_result_time or now - _last_result_time > datetime.timedelta(hours=1):
        _last_result_value = <expensive computation here>
        _last_result_time = now
    return _last_result_value

If you want to generalize this as a decorator, it's not much harder:

import datetime
import functools

def cache(ttl=datetime.timedelta(hours=1)):
    def wrap(func):
        time, value = None, None
        @functools.wraps(func)
        def wrapped(*args, **kw):
            nonlocal time
            nonlocal value
            now = datetime.datetime.now()
            if not time or now - time > ttl:
                value = func(*args, **kw)
                time = now
            return value
        return wrapped
    return wrap

If you want it to handle different arguments, storing a timestamp for each one:

import datetime
import functools

def cache(ttl=datetime.timedelta(hours=1)):
    def wrap(func):
        cache = {}
        @functools.wraps(func)
        def wrapped(*args, **kw):
            now = datetime.datetime.now()
            # see lru_cache for fancier alternatives
            key = tuple(args), frozenset(kw.items()) 
            if key not in cache or now - cache[key][0] > ttl:
                value = func(*args, **kw)
                cache[key] = (now, value)
            return cache[key][1]
        return wrapped
    return wrap

You can of course keep adding features to it: give it a max size and evict by time of storage or by LRU or whatever else you want, expose cache stats as attributes on the decorated function, and so on. The implementation of lru_cache in the stdlib should help show you how to do most of the trickier things (since it does almost all of them).
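
For instance, a sketch of the cache-stats idea, bolting hit/miss counters and a manual reset onto the args-aware version above; the cache_info/cache_clear names just imitate lru_cache's attributes, not its implementation:

import datetime
import functools

def cache(ttl=datetime.timedelta(hours=1)):
    def wrap(func):
        cached = {}
        hits = misses = 0

        @functools.wraps(func)
        def wrapped(*args, **kw):
            nonlocal hits, misses
            now = datetime.datetime.now()
            key = tuple(args), frozenset(kw.items())
            if key not in cached or now - cached[key][0] > ttl:
                misses += 1
                cached[key] = (now, func(*args, **kw))
            else:
                hits += 1
            return cached[key][1]

        # expose simple stats and a manual reset on the decorated function
        wrapped.cache_info = lambda: {'hits': hits, 'misses': misses, 'entries': len(cached)}
        wrapped.cache_clear = cached.clear
        return wrapped
    return wrap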

abarnert
  • This works fine assuming you don't want to cache multiple results for different arguments to the function. Would highly recommend 1. turning this into a decorator to wrap around a function doing the computation and 2. turning that decorator into an instance of a class with a `__call__` implementation, so that the statefulness of the cache is hidden inside an object's member variables rather than globals. – BowlingHawk95 Jun 14 '18 at 22:53
  • @BowlingHawk95 Sure, I wrote the simplest version first because that's what the OP explicitly asked for. – abarnert Jun 14 '18 at 22:54
  • right, hence why I didn't downvote; just proposing improvements :) see Joran's answer above – BowlingHawk95 Jun 14 '18 at 22:56
  • Oooo fancy... `nonlocal` (first time I've actually seen that... that's cool). Also this doesn't really account for function args/kwargs... but I'm sure you know that and can easily extend it (i.e. different args = different cache value). – Joran Beasley Jun 14 '18 at 22:57
  • @BowlingHawk95 OK, I added first a decorator, then a version that handles args and kwargs if they're all hashable, then some comments on what other features you might want and how to find most of them in the `lru_cache` source, because I'm not going to keep going until I include every feature anyone might want. – abarnert Jun 14 '18 at 23:01
6

A decorator usually solves this nicely:

import functools
import time

def cache(fn=None, time_to_live=3600 * 24):  # one DAY default (or whatever)
    if not fn:
        return functools.partial(cache, time_to_live=time_to_live)
    my_cache = {}
    @functools.wraps(fn)
    def _inner_fn(*args, **kwargs):
        kws = tuple(sorted(kwargs.items()))  # sort so kwarg order doesn't change the key
        key = tuple(args) + kws
        if key not in my_cache or time.time() > my_cache[key]['expires']:
            my_cache[key] = {"value": fn(*args, **kwargs), "expires": time.time() + time_to_live}
        return my_cache[key]['value']
    return _inner_fn

@cache(time_to_live=3600) # an hour
def my_sqrt(x):
    return x**0.5

@cache(time_to_live=60*30) # 30 mins
def get_new_emails():
    return my_stmp.get_email_count()

As an aside, this sort of expiring cache is built into memcached, and that may be a better solution (I'm not sure what problem domain you are working in).
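
For reference, a sketch of the memcached route using the pymemcache client; it assumes a memcached server on localhost:11211, and compute_email_count is a made-up stand-in:

from pymemcache.client.base import Client

client = Client(('localhost', 11211))

def get_new_emails():
    cached = client.get('email_count')  # None on a miss or after the TTL lapses
    if cached is None:
        count = compute_email_count()  # hypothetical expensive call
        client.set('email_count', str(count), expire=60 * 30)  # memcached drops it after 30 min
        return count
    return int(cached)  # pymemcache returns bytes, so convert back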

You can also use nested functions:

import time

def cache(time_to_live=3600 * 24):  # one DAY default (or whatever)
    def _wrap(fn):
        my_cache = {}
        def _inner_fn(*args, **kwargs):
            kws = tuple(sorted(kwargs.items()))  # sort so kwarg order doesn't change the key
            key = tuple(args) + kws
            if key not in my_cache or time.time() > my_cache[key]['expires']:
                my_cache[key] = {"value": fn(*args, **kwargs), "expires": time.time() + time_to_live}
            return my_cache[key]['value']
        return _inner_fn
    return _wrap
Joran Beasley
  • The recursive setup with `partial` is clever, but IMO uglier than just having another layer of closure. You're now allowing `cache` to be called in two different ways: `cache(time_to_live=___)(function)` or `cache(function, time_to_live=___)`. Just a matter of taste, though, and I like the solution. – BowlingHawk95 Jun 14 '18 at 22:59
  • You should only really call it as `@cache` or `@cache(time_to_live=1000)`... but yeah, you have a good point... I dislike nesting functions... and in general I like this as a solution, so I always use it when accepting kwargs for a decorator (I **DISLIKE** `@cache(1000)` and want to not allow it). – Joran Beasley Jun 14 '18 at 23:02
  • If you want to ban `@cache(1000)`, this doesn't do so nearly as clearly as making `time_to_live` a keyword-only parameter. – abarnert Jun 14 '18 at 23:05
  • But then you need a Python that supports that :P (which TBH they should have... but there's no guarantee or indication from the problem statement). – Joran Beasley Jun 14 '18 at 23:05
  • I'm also not sure I agree `def cache(*ignore, time_to_live=XXX)` is actually any clearer. – Joran Beasley Jun 14 '18 at 23:08
  • @JoranBeasley I assumed you were writing for Python 3, because before 3.6+, `kwargs` is explicitly in arbitrary order, so you either want `tuple(sorted(kwargs.items()))` or `frozenset(kwargs.items())`. And also because it's not 2008 anymore. – abarnert Jun 14 '18 at 23:08
  • Not `*ignore`, just `*`; there's no reason to allow but ignore extra positional arguments just because you want keyword-only arguments. – abarnert Jun 14 '18 at 23:10
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/173182/discussion-between-joran-beasley-and-abarnert). – Joran Beasley Jun 14 '18 at 23:10
1

Create a function that acts as a cache; we'll call it result_cacher.

import time

lastResultCache = 0
resultCache = None

def result_cacher():
    global lastResultCache, resultCache  # the function rebinds both module-level names
    if time.time() - lastResultCache >= 3600:  # checks if 3600 sec (1 hour) has passed since the last cache
        lastResultCache = time.time()
        resultCache = result()
    return resultCache

This function checks if an hour has passed, updates the cache if it has, and then returns the cache.

If you want to apply the caching for each individual input instead of for whenever the function is called, use dictionaries for lastResultCache and resultCache.

import time

lastResultCache = {}
resultCache = {}

def result_cacher(val):
    # .get() returns the stored value for the key, or 0 if the key is not in the dict
    if time.time() - lastResultCache.get(val, 0) >= 3600:  # has an hour passed since the last cache?
        lastResultCache[val] = time.time()
        resultCache[val] = result(val)
    return resultCache.get(val)
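
Wired into the question's polling function, a sketch (other_func and val come from the question, which likewise leaves val's origin unspecified):

def other_func():
    # still called every minute, but result(val) only actually runs once an hour
    new_list = result_cacher(val)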
1

A solution using the ring library:

import ring

@ring.lru(expire=60 * 60)  # seconds
def cached_function(keys):
    return ...

If you don't need the LRU policy:

@ring.dict(expire=60 * 60)  # seconds
def cached_function(keys):
    return ...
youknowone
-1

You can create a mechanism like this:

from asyncio import get_running_loop, get_event_loop
from functools import partial


class Cache(dict):
    def __init__(self, default_time, loop=None, *args, **kwargs):
        if not loop:
            try:
                loop = get_running_loop()
            except RuntimeError:
                loop = get_event_loop()
        self.loop = loop
        self.default_time = default_time
        super().__init__(*args, **kwargs)

    def insert(self, key, value, ttl: int = -1):
        if ttl == -1:
            ttl = self.default_time

        self[key] = value
        # schedule deletion once the TTL lapses; this only fires if the loop is running
        self.loop.call_later(ttl, partial(self.delete, key))

    def delete(self, key):
        del self[key]

This lets you create a cache alongside your async application. All you need to do is:

TTL = 60 * 60 # one hour
cache = Cache(TTL)
cache.insert(key, value)

# check again after an hour

if key in cache:
    print(cache[key])

The key will disappear automatically from the cache (dictionary) after the TTL.

It doesn't have to be used with an async application; only the TTL mechanism is async, so an event loop has to be running for keys to actually expire.
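
For illustration, a sketch that runs the loop so the expiry actually fires (see the comment below); the two-second TTL is just for the demo:

import asyncio

async def main():
    cache = Cache(default_time=60 * 60)  # one-hour default TTL
    cache.insert("key", "value", ttl=2)  # short TTL for the demo
    print("key" in cache)   # True
    await asyncio.sleep(3)  # let call_later fire while the loop runs
    print("key" in cache)   # False

asyncio.run(main())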

Jonathan1609
  • This answer is interesting but it doesn't actually delete the cached item unless someone runs the loop. In other words if it falls into the get_event_loop block, unless you start the loop, the cache won't expire. – csm10495 Mar 06 '22 at 06:47