26

I'm working with xlwt which has a 4k limit on how many styles can be defined in an excel doc.

Normally, one creates styles like so:

style = xlwt.easyxf("font: bold 1")

Which I simply replaced with

def cached_easyxf(self, format):
    return self._cache.setdefault(format, xlwt.easyxf(format))

Which works perfectly. Now, I've found out that I need to pass in keyword arguments sometimes which got me thinking: how should I hash the args/kwargs signature?

Should I create a cache key based on str(value)? Pickle? What's most robust?

For my situation it looks like I can just convert key/values to strings and add it to my keys... but I'm now curious about a generic way to handle this with say unhashable types like arg=[1, 2, 3]

def cached_call(*args, **kwargs):
    return cache.get(what_here)
cached_call('hello')
cached_call([1, 2, 3], {'1': True})
Yuji 'Tomita' Tomita
  • 115,817
  • 29
  • 282
  • 245

1 Answers1

30

Here is the technique used in functools.lru_cache():

kwd_mark = object()     # sentinel for separating args from kwargs

def cached_call(*args, **kwargs):
    key = args + (kwd_mark,) + tuple(sorted(kwargs.items()))
    return cache.get(key)

Note, the above code handles keyword arguments but makes no attempt to handle non-hashable values like lists. Your idea for using the str or a list is a reasonable start. For set objects, you would need to sort the entries first, str(sorted(someset)). Other objects may not have a useful __repr__ or __str__ (i.e. they may show only the object type and location in memory). In summary, handling arbitrary unhashable arguments requires careful consideration of each object type.

eigenein
  • 2,083
  • 3
  • 25
  • 43
Raymond Hettinger
  • 216,523
  • 63
  • 388
  • 485
  • Would the following work just as well: `key = (args, tuple(sorted(kwds.items())))`? Or is it slower because of inner tuples? – max Sep 20 '12 at 11:41
  • @max Yes, that would work. And yes, the extra pointers would slow it down a bit as well as consuming a bit more memory. – Raymond Hettinger Sep 20 '12 at 22:10
  • 3
    Can also do `(args, frozenset(kwds.items()))` – augurar May 06 '15 at 22:48
  • @Raymond is kwds.items() suppose to be kwargs.items() ? I don't see kwds defined anywhere – derchambers Aug 06 '16 at 21:51
  • With nested dictionaries, consider using `json.dumps(d, sort_keys=True)` to serialize dicts (whether in args or kwargs) - [source](https://stackoverflow.com/a/22003440/484127) – tutuDajuju Dec 27 '17 at 19:49
  • @RaymondHettinger what is the best way to make custom lru_cache. If I will use for key representation from your example and result = f(*args, **kwargs). Then what to use, OrderedDict or Deque? – Evgeny Oct 25 '19 at 16:45
  • 1
    You need to map *args, **kwargs to some arbitrary function signature that you scrape via inspect.getfullargspec ... this is the generic pattern to cache a wrapped function. I think there are some corner cases missed by the above. Consider default arguments etc. – mathtick Mar 05 '20 at 15:59