Python defaultdict and lambda

Question

In someone else's code I read the following two lines:

x = defaultdict(lambda: 0)
y = defaultdict(lambda: defaultdict(lambda: 0))

As the argument of defaultdict is a default factory, I think the first line means that when I call x[k] for a nonexistent key k (such as a statement like v=x[k]), the key-value pair (k,0) will be automatically added to the dictionary, as if the statement x[k]=0 is first executed. Am I correct?

And what about y? It seems that the default factory will create a defaultdict with default 0. But what does that mean concretely? I tried to play around with it in Python shell, but couldn't figure out what it is exactly.

Fred Foo · Accepted Answer · 2011-12-07T17:14:55.890

84

I think the first line means that when I call x[k] for a nonexistent key k (such as a statement like v=x[k]), the key-value pair (k,0) will be automatically added to the dictionary, as if the statement x[k]=0 is first executed.

That's right. This is more idiomatically written as

x = defaultdict(int)

In the case of y, when you do y["ham"]["spam"], the key "ham" is inserted in y if it does not exist. The value associated with it becomes a defaultdict in which "spam" is automatically inserted with a value of 0.

I.e., y is a kind of "two-tiered" defaultdict. If "ham" not in y, then evaluating y["ham"]["spam"] is like doing

y["ham"] = {}
y["ham"]["spam"] = 0

in terms of ordinary dict.
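A minimal sketch of that behaviour, reusing the key names from above. Note that the inner value is itself a defaultdict rather than a plain dict, so the ordinary-dict version is only an approximation:

```python
from collections import defaultdict

y = defaultdict(lambda: defaultdict(lambda: 0))

# Merely reading a missing key triggers both default factories.
value = y["ham"]["spam"]

print(value)                     # 0
print("ham" in y)                # True: the outer key was inserted
print("spam" in y["ham"])        # True: the inner key was inserted
print(type(y["ham"]).__name__)   # defaultdict
```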

edited Dec 07 '11 at 17:14

answered Dec 07 '11 at 17:08

Fred Foo


6

Another way of creating a defaultdict like `y` without using lambda is to use [`partial`](http://docs.python.org/library/functools.html#functools.partial) from `functools`, like this: `y = defaultdict(partial(defaultdict, int))` – Lauritz V. Thaulow Dec 07 '11 at 18:32
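A small sketch of the `partial` variant from this comment. The pickling step is an addition for illustration, not something the comment mentions: a lambda-based factory generally cannot be pickled, whereas this one can.

```python
import pickle
from collections import defaultdict
from functools import partial

# partial(defaultdict, int) is a zero-argument callable that builds defaultdict(int).
y = defaultdict(partial(defaultdict, int))
y["ham"]["spam"] += 1

print(y["ham"]["spam"])   # 1
print(y["ham"]["eggs"])   # 0

# Round-trip through pickle; the lambda-based version would fail here.
restored = pickle.loads(pickle.dumps(y))
print(restored["ham"]["spam"])   # 1
```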
1

Quick follow-up: why does `defaultdict(int)` work the same way that `lambda: 0` does? Or, in other words, how come `defaultdict(int)` always returns 0 for the value? – briandk Oct 10 '13 at 23:01
6

@briandk: because `int()` returns zero. – Fred Foo Oct 11 '13 at 08:14
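To illustrate: calling a built-in type with no arguments returns its "empty" value, which is why `int`, `list`, and similar types work as default factories. The word list below is made up for the example.

```python
from collections import defaultdict

print(int())    # 0
print(list())   # []
print(str() == "")   # True

# defaultdict(int) therefore starts every missing key at 0.
counts = defaultdict(int)
for word in ["spam", "ham", "spam"]:
    counts[word] += 1

print(dict(counts))   # {'spam': 2, 'ham': 1}
```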

Andrew Clark · Answer 2 · 2011-12-07T17:27:31.857

You are correct for what the first one does. As for y, it will create a defaultdict with default 0 when a key doesn't exist in y, so you can think of this as a nested dictionary. Consider the following example:

from collections import defaultdict

y = defaultdict(lambda: defaultdict(lambda: 0))
print(y['k1']['k2'])   # 0
print(dict(y['k1']))   # {'k2': 0}

To create an equivalent nested dictionary structure without defaultdict you would need to create an inner dict for y['k1'] and then set y['k1']['k2'] to 0, but defaultdict does all of this behind the scenes when it encounters keys it hasn't seen:

y = {}
y['k1'] = {}
y['k1']['k2'] = 0

The following function may help for playing around with this on an interpreter to better your understanding:

def to_dict(d):
    if isinstance(d, defaultdict):
        return dict((k, to_dict(v)) for k, v in d.items())
    return d

This will return the dict equivalent of a nested defaultdict, which is a lot easier to read, for example:

>>> y = defaultdict(lambda: defaultdict(lambda: 0))
>>> y['a']['b'] = 5
>>> y
defaultdict(<function <lambda> at 0xb7ea93e4>, {'a': defaultdict(<function <lambda> at 0xb7ea9374>, {'b': 5})})
>>> to_dict(y)
{'a': {'b': 5}}

score 10 · Answer 3 · answered Dec 07 '11 at 17:10

10

defaultdict takes a zero-argument callable to its constructor, which is called when the key is not found, as you correctly explained.

lambda: 0 will of course always return zero, but the preferred method to do that is defaultdict(int), which will do the same thing.

As for the second part, the author would like to create a new defaultdict(int), or a nested dictionary, whenever a key is not found in the top-level dictionary.

answered Dec 07 '11 at 17:10

Kenan Banks


4

@mjb - int is preferred in this case because it's far more readable. using int is probably also a little faster, but again the primary reason is that it's much clearer code. – Kenan Banks Nov 26 '12 at 23:31
3

Via docs.python.org: "The function int() which always returns zero is just a special case of constant functions. A faster and more flexible way to create constant functions is to use itertools.repeat() which can supply any constant value (not just zero)". An itertools.repeat() example is then shown, which is pretty nice. I recommend reading: http://docs.python.org/2/library/collections.html#defaultdict-objects – Dmitry Minkovsky Feb 20 '13 at 02:26
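A sketch of the `itertools.repeat()` approach that comment refers to, adapted to Python 3 (where the bound method is `__next__` rather than Python 2's `next`). The helper name `constant_factory` and the sample data follow the pattern in the documentation, but treat the details here as illustrative:

```python
from collections import defaultdict
from itertools import repeat

def constant_factory(value):
    # repeat(value) is an infinite iterator; its __next__ method is a
    # zero-argument callable that returns value every time.
    return repeat(value).__next__

d = defaultdict(constant_factory("<missing>"))
d.update(name="John", action="ran")

print("%(name)s %(action)s to %(object)s" % d)   # John ran to <missing>
```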

score 5 · Answer 4 · answered Mar 08 '18 at 23:57

The existing answers are good, but I am adding this one to give a bit more information:

"defaultdict requires an argument that is callable. The return value of that callable object is the default value that the dictionary returns when you try to access the dictionary with a key that does not exist."

Here's an example

SAMPLE = {'Age': 28, 'Salary': 2000}
SAMPLE = defaultdict(lambda: 0, SAMPLE)

>>> SAMPLE
defaultdict(<function <lambda> at 0x0000000002BF7C88>, {'Salary': 2000, 'Age': 28})

>>> SAMPLE['Age']
28
>>> SAMPLE['Phone']   # non-existent key, so the default factory supplies 0
0
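One detail worth checking for yourself: indexing a missing key stores the default in the dictionary, whereas `get()` and `in` do not call the factory at all. A minimal sketch, using a lowercase `sample` built from the same data:

```python
from collections import defaultdict

sample = defaultdict(lambda: 0, {"Age": 28, "Salary": 2000})

print(sample.get("Phone"))   # None: get() does not call the default factory
print("Phone" in sample)     # False: nothing was inserted

print(sample["Phone"])       # 0: indexing calls the factory and stores the result
print("Phone" in sample)     # True
print(len(sample))           # 3
```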

score 3 · Answer 5 · answered Jan 13 '15 at 08:43

3

y = defaultdict(lambda:defaultdict(lambda:0))

is helpful when you want to write something like y['a']['b'] += 1 without first checking whether either key exists.
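For instance, a nested counter can be built in one loop. This is only a sketch; the `pairs` data is invented for the example:

```python
from collections import defaultdict

y = defaultdict(lambda: defaultdict(lambda: 0))

# Hypothetical (outer, inner) key pairs to count.
pairs = [("a", "b"), ("a", "b"), ("a", "c"), ("x", "y")]
for outer, inner in pairs:
    y[outer][inner] += 1   # no KeyError, no setdefault() calls needed

print({k: dict(v) for k, v in y.items()})
# {'a': {'b': 2, 'c': 1}, 'x': {'y': 1}}
```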

answered Jan 13 '15 at 08:43

yongsun

