I want to cache results of some functions/methods, with these specifications:
- Persistence between runs: The cache should remain intact after the interpreter exits, meaning the data needs to be saved to disk.
- Expiration based on function version: Data in the cache should remain valid as long as the function hasn't changed. If the function changes, its cached data should be invalidated.
- Everything runs single-threaded on the same machine, for now. Support for concurrency on the same machine would be a bonus.
I know there are cache decorators for disk-based cache, but their expiration is usually based on time, which is irrelevant to my needs.
I thought about using the Git commit SHA for detecting function/class version, but the problem is that there are multiple functions/classes in the same file. I need a way to check whether the specific function/class segment of the file was changed or not.
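To make the "specific segment" idea concrete, here is a minimal sketch of a per-function fingerprint. It assumes that hashing the function's own source text is an acceptable definition of "version"; the name function_fingerprint is made up for illustration.

```python
import hashlib
import inspect


def function_fingerprint(func):
    """Return a hex digest that changes when the source of `func` changes."""
    try:
        # Source text of this one function only, not of the whole file.
        payload = inspect.getsource(func).encode("utf-8")
    except OSError:
        # Source not available (e.g. code typed into a REPL):
        # rough fallback based on bytecode and simple constants.
        code = func.__code__
        consts = [c for c in code.co_consts if not hasattr(c, "co_code")]
        payload = code.co_code + repr(consts).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

With this approach, editing g leaves the fingerprint of f untouched even though both live in the same file. Note that it only looks at the function itself: a change in a helper that f calls would not be detected, and even a comment or whitespace edit inside f would count as a change.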
I assume the solution will combine version management and caching, but I'm not familiar enough with the options to solve this elegantly.
Example:
    # file a.py
    @cache_by_version
    def f(a, b):
        # ...

    @cache_by_version
    def g(a, b):
        # ...

    # file b.py
    from a import *

    def main():
        f(1, 2)
Running main in file b.py once should result in caching the result of f with arguments 1 and 2 to disk. Running main again should bring the result from the cache without evaluating f(1,2) again. However, if f changed, then the cache should be invalid. On the other hand, if g changed, it should not affect the caching of f.
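For reference, the behaviour described above could be sketched roughly as follows. This is only an illustration under several assumptions: the function's source text stands in for its "version", arguments have stable reprs, results are picklable, and CACHE_DIR is a hypothetical location.

```python
import functools
import hashlib
import inspect
import os
import pickle
import tempfile

# Hypothetical cache location; any writable directory would do.
CACHE_DIR = os.path.join(tempfile.gettempdir(), "cache_by_version_demo")


def cache_by_version(func):
    """Cache results on disk, keyed by the function's source and arguments."""
    try:
        version = inspect.getsource(func)
    except OSError:
        # No source file available; fall back to the raw bytecode.
        version = func.__code__.co_code.hex()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Assumption: repr() of the arguments is stable and identifies them.
        raw_key = repr((func.__module__, func.__qualname__, version,
                        args, sorted(kwargs.items())))
        key = hashlib.sha256(raw_key.encode("utf-8")).hexdigest()
        path = os.path.join(CACHE_DIR, key + ".pkl")
        if os.path.exists(path):
            with open(path, "rb") as fh:
                return pickle.load(fh)
        result = func(*args, **kwargs)
        os.makedirs(CACHE_DIR, exist_ok=True)
        # Write to a temp file and rename it, so a reader never sees
        # a partially written cache entry.
        fd, tmp_path = tempfile.mkstemp(dir=CACHE_DIR)
        with os.fdopen(fd, "wb") as fh:
            pickle.dump(result, fh)
        os.replace(tmp_path, path)
        return result

    return wrapper
```

Because the source of f is part of the key, editing f produces new keys and the old entries are simply never hit again, while editing g has no effect on the entries of f. Stale files are not cleaned up here, and the rename trick only avoids torn reads; it does not stop two processes from computing the same value at the same time.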