I have a pipeline that contains several variations of the following data class, which I've simplified as much as possible for this example:
from pathlib import Path

class Data(object):
    def __init__(self):
        self.filepath = Path("filepath")

    def download(self, use_cache=True):
        if use_cache and self.filepath.exists():
            return self.filepath
        # Code to download
        print(f"downloading to {self.filepath}")
I'd like to perform the download only if (a) the file doesn't exist, or (b) the user passes `use_cache=False`. As mentioned, I have several classes with similar `download` methods that vary only in the `# Code to download` portion. I'd like to find a way to make the caching logic generic for all of these.
I was thinking about using a decorator:
from functools import wraps

def cache(filepath, use_cache):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Skip the wrapped call when a cached file is available
            if filepath.exists() and use_cache:
                return filepath
            return func(*args, **kwargs)
        return wrapper
    return decorator
However, I'm not sure how to pass the `filepath` and `use_cache` arguments to the `cache` decorator. Am I going about this completely wrong? How else could I solve this problem?
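One direction that might work, as a rough sketch rather than a verified solution: since `filepath` lives on the instance and `use_cache` is a call-time argument, the decorator could take no arguments at all and read both values inside the wrapper. The name `cached_download`, the `filepath` constructor parameter, and the `touch()` placeholder are hypothetical and only there to make the example runnable; it also assumes every class exposes a `filepath` attribute.

```python
from functools import wraps
from pathlib import Path


def cached_download(func):
    """Skip the wrapped download when the target file already exists.

    Assumes the instance has a `filepath` attribute (a pathlib.Path).
    """
    @wraps(func)
    def wrapper(self, *args, use_cache=True, **kwargs):
        # Both values are only known at call time, so they are read here
        # instead of being passed to the decorator.
        if use_cache and self.filepath.exists():
            return self.filepath
        func(self, *args, **kwargs)
        return self.filepath
    return wrapper


class Data:
    def __init__(self, filepath="filepath"):  # parameter added for illustration
        self.filepath = Path(filepath)

    @cached_download
    def download(self):
        # Code to download (placeholder: just creates an empty file)
        print(f"downloading to {self.filepath}")
        self.filepath.touch()
```

With this, `download()` would return the existing path without re-running the download code, and `download(use_cache=False)` would always run it. Note that in this sketch `use_cache` has to be passed by keyword, because the wrapper consumes it before calling the undecorated method.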