
Common Lisp has defvar, which creates a global variable but only sets it if it is new: if the variable already exists, it is not reset. This is useful when reloading a file from a long-running interactive process, because the data is kept.

I want the same in Python. I have file foo.py which contains something like this:

cache = {}
def expensive(x):
    try:
        return cache[x]
    except KeyError:
        # do a lot of work
        cache[x] = res
        return res

When I do imp.reload(foo), the value of cache is lost, which I want to avoid.

How do I keep cache across reload?

PS. I guess I can follow "How do I check if a variable exists?":

if 'cache' not in globals():
    cache = {}

but it does not look "Pythonic" for some reason... If it is The Right Thing (TRT), please tell me so!

Answering comments:

  • I am not interested in cross-invocation persistence; I am already handling that.
  • I am painfully aware that reloading changes class meta-objects and I am already handling that.
  • The values in cache are huge, I cannot go to disk every time I need them.
martineau
sds
  • Why not use a different scope? Create a `get_cache` function in `foo.py` and store the cache in the same file you call `imp.reload()` from – rafaelc Feb 15 '19 at 14:25
  • I suppose the easiest way would be to dump a JSON file – Nathan Feb 15 '19 at 14:25
  • This is the key: `If a module instantiates instances of a class, reloading the module that defines the class does not affect the method definitions of the instances — they continue to use the old class definition. The same is true for derived classes.` – Adelin Feb 15 '19 at 14:25
  • The code you've provided in your PS looks like it absolutely should work, have you tried it? "When a module is reloaded, its dictionary (containing the module’s global variables) is retained. Redefinitions of names will override the old definitions, so this is generally not a problem. If the new version of a module does not define a name that was defined by the old version, the old definition remains. This feature can be used to the module’s advantage if it maintains a global table or cache of objects". But reloading is probably a bad idea anyway, avoid it if you can. – Alex Hall Feb 15 '19 at 14:28
  • You can use _persistent storage_, a.k.a. the file system, and just `pickle` the contents of the variable. – ForceBru Feb 15 '19 at 14:34
  • I'm confused. You say you already handle cross-invocation persistence, so surely that should also take care of module reloads? A module reload *is* essentially an invocation of your module, after all. – Aran-Fey Feb 15 '19 at 14:35

2 Answers


Here are a couple of options. One is to use a temporary file as persistent storage for your cache, and try to load it every time the module is loaded:

# foo.py
import tempfile
import pathlib
import pickle

_CACHE_TEMP_FILE_NAME = '__foo_cache__.pickle'
_CACHE = {}

def expensive(x):
    try:
        return _CACHE[x]
    except KeyError:
        # do a lot of work
        _CACHE[x] = res
        _save_cache()
        return res

def _save_cache():
    tmp = pathlib.Path(tempfile.gettempdir(), _CACHE_TEMP_FILE_NAME)
    with tmp.open('wb') as f:
        pickle.dump(_CACHE, f)

def _load_cache():
    global _CACHE
    tmp = pathlib.Path(tempfile.gettempdir(), _CACHE_TEMP_FILE_NAME)
    if not tmp.is_file():
        return
    try:
        with tmp.open('rb') as f:
            _CACHE = pickle.load(f)
    except (pickle.UnpicklingError, EOFError):
        # Corrupt or truncated cache file: start with an empty cache.
        pass

_load_cache()

The only issue with this is that you need to trust the environment not to write anything malicious in place of the temporary file (the pickle module is not secure against erroneous or maliciously constructed data).

Another option is to use another module for the cache, one that does not get reloaded:

# foo_cache.py
Cache = {}

And then:

# foo.py
import foo_cache

def expensive(x):
    try:
        return foo_cache.Cache[x]
    except KeyError:
        # do a lot of work
        foo_cache.Cache[x] = res
        return res
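
To check that the separate-module approach behaves as described, here is a minimal sketch. It writes two throwaway modules to a temporary directory, so the names `demo_foo` and `demo_foo_cache` and the `x + 1` computation are hypothetical stand-ins, not part of the original code. It uses `importlib.reload` in place of `imp.reload`, since the `imp` module was removed in Python 3.12.

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical stand-ins for foo_cache.py and foo.py.
CACHE_SOURCE = "Cache = {}\n"
FOO_SOURCE = '''
import demo_foo_cache

def expensive(x):
    try:
        return demo_foo_cache.Cache[x]
    except KeyError:
        res = x + 1  # stand-in for the real work
        demo_foo_cache.Cache[x] = res
        return res
'''


def reload_keeps_cache():
    """Reload only demo_foo and return what is left in the cache module."""
    with tempfile.TemporaryDirectory() as tmp:
        pathlib.Path(tmp, "demo_foo_cache.py").write_text(CACHE_SOURCE)
        pathlib.Path(tmp, "demo_foo.py").write_text(FOO_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            foo = importlib.import_module("demo_foo")
            foo.expensive(41)
            # Re-executes demo_foo only; its "import demo_foo_cache" line
            # finds the already loaded module in sys.modules.
            importlib.reload(foo)
            return dict(sys.modules["demo_foo_cache"].Cache)
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("demo_foo", None)
            sys.modules.pop("demo_foo_cache", None)


if __name__ == "__main__":
    print(reload_keeps_cache())
```

The cache survives because reloading `demo_foo` does not reload the modules it imports. If the cache module itself is ever reloaded, its `Cache = {}` line runs again and the data is lost.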
jdehesa
  • I think the `foo_cache` solution is perfect. I hate it that one has to create a separate file for that, but, I guess, I will have to live with it. – sds Feb 15 '19 at 14:46

Since the whole point of a reload is to ensure that the executed module's code is run a second time, there is essentially no way to avoid some kind of "reload detection."

The code you use appears to be the best answer from those given in the question you reference.
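
For illustration, here is a minimal sketch of that `globals()` guard surviving a reload. The module name `defvar_demo` and the `x * x` computation are hypothetical stand-ins, and the module is written to a temporary directory only so the example is self-contained. `importlib.reload` is used in place of `imp.reload`, since the `imp` module was removed in Python 3.12.

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical module source, written to a temporary directory for the demo.
MODULE_SOURCE = '''
if "cache" not in globals():  # true only on the first import, like defvar
    cache = {}

def expensive(x):
    try:
        return cache[x]
    except KeyError:
        res = x * x  # stand-in for the real work
        cache[x] = res
        return res
'''


def reload_keeps_global():
    """Return (same dict object after reload?, contents of the cache)."""
    with tempfile.TemporaryDirectory() as tmp:
        pathlib.Path(tmp, "defvar_demo.py").write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            mod = importlib.import_module("defvar_demo")
            mod.expensive(3)
            cache_before = mod.cache
            # reload() re-executes the source in the existing module
            # dictionary, so "cache" is already in globals() this time.
            mod = importlib.reload(mod)
            return mod.cache is cache_before, dict(mod.cache)
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("defvar_demo", None)


if __name__ == "__main__":
    print(reload_keeps_global())
```

The guard works because a reload reuses the module's existing dictionary rather than creating a fresh one, so the check sees the name left over from the first import.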

holdenweb