Let's say I have a referentially transparent function. It is very easy to memoize it; for example:
import functools

kwd_mark = object()  # sentinel separating positional args from sorted kwargs

def memoize(obj):
    memo = {}

    @functools.wraps(obj)
    def memoizer(*args, **kwargs):
        combined_args = args + (kwd_mark,) + tuple(sorted(kwargs.items()))
        if combined_args not in memo:
            memo[combined_args] = obj(*args, **kwargs)
        return memo[combined_args]

    return memoizer

@memoize
def my_function(data, alpha, beta):
    # ...
Now suppose that the data argument to my_function is huge; say, it's a frozenset with millions of elements. In this case, the cost of memoization is prohibitive: on every call, we'd have to calculate hash(data) as part of the dictionary lookup.
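To make the cost concrete, here's a rough usage sketch (the sizes are made up): a structurally equal but distinct frozenset pays the full O(n) hash again before the cache can even report a hit.

data = frozenset(range(5_000_000))
my_function(data, alpha=0.5, beta=2.0)       # miss: O(n) hash(data), then the real computation

same_data = frozenset(range(5_000_000))      # equal contents, distinct object
my_function(same_data, alpha=0.5, beta=2.0)  # hit, but only after another O(n) hash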
I can make the memo dictionary an attribute of data instead of an object inside the memoize decorator. This way I can skip the data argument entirely when doing the cache lookup, since the chance that another huge frozenset will be equal to it is negligible. However, this approach ends up polluting an argument passed to my_function. Worse, if I have two or more large arguments, it doesn't help at all (I can only attach memo to one of them).
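For reference, here is roughly what I mean, as a sketch only: AttrFrozenset, memoize_on_data, and _memo are placeholder names, and the subclass exists because plain frozenset instances can't carry attributes.

import functools

kwd_mark = object()  # same sentinel idea as above

class AttrFrozenset(frozenset):
    # A frozenset subclass that can carry attributes; plain frozensets cannot.
    pass

def memoize_on_data(obj):
    @functools.wraps(obj)
    def memoizer(data, *args, **kwargs):
        # The cache lives on the data object itself, so data stays out
        # of the key entirely -- no hash(data) during lookups.
        memo = getattr(data, '_memo', None)
        if memo is None:
            memo = data._memo = {}
        key = args + (kwd_mark,) + tuple(sorted(kwargs.items()))
        if key not in memo:
            memo[key] = obj(data, *args, **kwargs)
        return memo[key]
    return memoizer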
Is there anything else that can be done?