41

One minor annoyance with dict.setdefault is that it always evaluates its second argument (when given, of course), even when the first the first argument is already a key in the dictionary.

For example:

import random
def noisy_default():
    ret = random.randint(0, 10000000)
    print 'noisy_default: returning %d' % ret
    return ret

d = dict()
print d.setdefault(1, noisy_default())
print d.setdefault(1, noisy_default())

This produces ouptut like the following:

noisy_default: returning 4063267
4063267
noisy_default: returning 628989
4063267

As the last line confirms, the second execution of noisy_default is unnecessary, since by this point the key 1 is already present in d (with value 4063267).

Is it possible to implement a subclass of dict whose setdefault method evaluates its second argument lazily?


EDIT:

Below is an implementation inspired by BrenBarn's comment and Pavel Anossov's answer. While at it, I went ahead and implemented a lazy version of get as well, since the underlying idea is essentially the same.

class LazyDict(dict):
    def get(self, key, thunk=None):
        return (self[key] if key in self else
                thunk() if callable(thunk) else
                thunk)


    def setdefault(self, key, thunk=None):
        return (self[key] if key in self else
                dict.setdefault(self, key,
                                thunk() if callable(thunk) else
                                thunk))

Now, the snippet

d = LazyDict()
print d.setdefault(1, noisy_default)
print d.setdefault(1, noisy_default)

produces output like this:

noisy_default: returning 5025427
5025427
5025427

Notice that the second argument to d.setdefault above is now a callable, not a function call.

When the second argument to LazyDict.get or LazyDict.setdefault is not a callable, they behave the same way as the corresponding dict methods.

If one wants to pass a callable as the default value itself (i.e., not meant to be called), or if the callable to be called requires arguments, prepend lambda: to the appropriate argument. E.g.:

d1.setdefault('div', lambda: div_callback)

d2.setdefault('foo', lambda: bar('frobozz'))

Those who don't like the idea of overriding get and setdefault, and/or the resulting need to test for callability, etc., can use this version instead:

class LazyButHonestDict(dict):
    def lazyget(self, key, thunk=lambda: None):
        return self[key] if key in self else thunk()


    def lazysetdefault(self, key, thunk=lambda: None):
        return (self[key] if key in self else
                self.setdefault(key, thunk()))
kjo
  • 33,683
  • 52
  • 148
  • 265
  • You can't make it not evaluate the second argument. What you'd have to do is wrap that argument in a function (e.g., with `lambda`) and then have `setdefault` call the function only if needed. – BrenBarn Jul 08 '13 at 17:53
  • 1
    Can I suggest you add `*args, **kwargs` to the signatures of `lazyget`, `lazysetdefault`, and the call to `thunk()`? This would allow your lazy stuff to take parameters. e.g. `lbd.lazysetdefault('total', sum, [1, 2, 3, 4], start=2)` – Hounshell Dec 29 '16 at 19:34

4 Answers4

25

This can be accomplished with defaultdict, too. It is instantiated with a callable which is then called when a nonexisting element is accessed.

from collections import defaultdict

d = defaultdict(noisy_default)
d[1] # noise
d[1] # no noise

The caveat with defaultdict is that the callable gets no arguments, so you can not derive the default value from the key as you could with dict.setdefault. This can be mitigated by overriding __missing__ in a subclass:

from collections import defaultdict

class defaultdict2(defaultdict):
    def __missing__(self, key):
        value = self.default_factory(key)
        self[key] = value
        return value

def noisy_default_with_key(key):
    print key
    return key + 1

d = defaultdict2(noisy_default_with_key)
d[1] # prints 1, sets 2, returns 2
d[1] # does not print anything, does not set anything, returns 2

For more information, see the collections module.

Santtu Pajukanta
  • 1,371
  • 1
  • 12
  • 15
14

You can do that in a one-liner using a ternary operator:

value = cache[key] if key in cache else cache.setdefault(key, func(key))

If you are sure that the cache will never store falsy values, you can simplify it a little bit:

value = cache.get(key) or cache.setdefault(key, func(key))
Cesar Canassa
  • 18,659
  • 11
  • 66
  • 69
  • 8
    If you're checking `key in dict` there's no point in using `setdeault` – user1685095 Aug 11 '17 at 23:25
  • 3
    This will require to search `key` in `cache` twice. Which is not a big deal for dict based on Hash-Map, but still doesn't make so mach sense. – Roee Gavirel Oct 11 '17 at 07:05
  • 1
    @user1685095 If you don't call setdefault the cache won't be updated. setdefault is both setting the empty cache and returning its value at the same time – Cesar Canassa Jan 31 '18 at 16:37
  • Not looking up the key twice would be a micro-optimisation from O( "2" n) to O(n). Python's probably the wrong choice if this is an issue. https://stackoverflow.com/questions/25777714/which-algorithm-is-faster-on-or-o2n – c z Mar 11 '20 at 10:28
  • @cz "Not looking up the key twice would be a micro-optimisation from O( "2" n) to O(n)", dicts are supposed to be `O(1)` for most purposes (assuming no collisions). – ustulation Feb 08 '23 at 15:35
11

No, evaluation of arguments happens before the call. You can implement a setdefault-like function that takes a callable as its second argument and calls it only if it is needed.

Pavel Anossov
  • 60,842
  • 14
  • 151
  • 124
0

There seems to be no one-liner that doesn't require an extra class or extra lookups. For the record, here is a easy (even not concise) way of achieving that without either of them.

try:
    value = dct[key]
except KeyError:
    value = noisy_default()
    dct[key] = value
return value
Dennis Golomazov
  • 16,269
  • 5
  • 73
  • 81