
I'm following the book 'Data Science from Scratch' and this is a piece of code in it:

from collections import defaultdict

dd_pair = defaultdict(lambda: [0, 0])
dd_pair[2][1] = 1                       # now dd_pair contains {2: [0, 1]}

Can someone please help me understand why and how it works?

khelwood
Tom

1 Answer


defaultdict takes a callable (a "default factory") as its first argument; a type such as list works because calling list() returns a new empty list. Suppose we have a dictionary called "users" with an "ID" as the key and a list as the value. Before appending to the list, we have to check whether the "ID" exists in the dictionary: if it does, we append; if not, we first put an empty list in that place.

So with a regular dictionary, we do something like:

users = {}

if "id1" not in users:
    users["id1"] = []
users["id1"].append("log")

Now with defaultdict, all we have to do is to set an initialiser as:

from collections import defaultdict
users = defaultdict(list)  # Any key not existing in the dictionary will get assigned a `list()` object, which is an empty list
users["id1"].append("log")

So coming to your code,

dd_pair = defaultdict(lambda: [0, 0])

This says that any key which doesn't exist in dd_pair gets a new two-element list, [0, 0], as its initial value. So if you just do print(dd_pair["somerandomkey"]), it prints [0, 0].
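A quick way to check this, as a minimal sketch (the key names "a" and "b" are arbitrary): every missing key triggers a fresh call to the lambda, so each key gets its own list, and merely reading a missing key is enough to insert it.

```python
from collections import defaultdict

dd_pair = defaultdict(lambda: [0, 0])

# Reading a missing key calls the lambda and stores the result under that key.
first = dd_pair["a"]
second = dd_pair["b"]

# Each key got its own list, so changing one does not affect the other.
first[0] = 5

print(dict(dd_pair))    # {'a': [5, 0], 'b': [0, 0]}
print(first is second)  # False
```

This is also why a factory is used rather than a single shared default value: a shared list would be modified for every key at once.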

Therefore, dd_pair[2][1] translates roughly to look like this:

dd_pair[2] = [0,0] # dd_pair looks like: {2:[0,0]}
dd_pair[2][1] = 1  # dd_pair looks like: {2:[0,1]}
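Under the hood, the lookup dd_pair[2] does not find the key, so the dictionary calls its default factory, stores the result under that key, and returns it; only then is index 1 of that list set. A rough sketch of the mechanism using dict's `__missing__` hook (the class name `SketchDefaultDict` is made up for illustration; the real defaultdict is part of the standard library and does more than this):

```python
class SketchDefaultDict(dict):
    """Toy imitation of collections.defaultdict, for illustration only."""

    def __init__(self, default_factory):
        super().__init__()
        self.default_factory = default_factory

    def __missing__(self, key):
        # dict.__getitem__ calls this hook when the key is not found.
        value = self.default_factory()
        self[key] = value
        return value


dd_pair = SketchDefaultDict(lambda: [0, 0])
dd_pair[2][1] = 1   # the lookup creates [0, 0], then index 1 is set to 1
print(dd_pair)      # {2: [0, 1]}
```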

Why is the lambda needed? Why not just pass [0,0]?

The defaultdict constructor expects a callable as its first argument (the Python docs call it default_factory; it may also be None). In simple terms, if we do defaultdict(somevar), then somevar() must be a valid call.

So if you pass [0,0] directly to defaultdict, it fails with a TypeError, because a list is not callable: [0,0]() is not valid. What you need is a function that returns [0,0], which can be written simply as lambda: [0,0]. (To verify, run (lambda: [0,0])(); it returns [0, 0].)
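To see the difference in practice, here is a small sketch. The helper name `make_pair` is hypothetical and only shows that a plain named function works just as well as the lambda.

```python
from collections import defaultdict

# Passing the list itself fails, because a list cannot be called.
try:
    defaultdict([0, 0])
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)


def make_pair():
    # Equivalent to lambda: [0, 0]; returns a new list on every call.
    return [0, 0]


from_lambda = defaultdict(lambda: [0, 0])
from_function = defaultdict(make_pair)

print(from_lambda["x"], from_function["x"])  # [0, 0] [0, 0]
```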

One more way is to create a class for your specific type, which is better explained in this answer: https://stackoverflow.com/a/36320098/

Rahul Bharadwaj
  • Thanks for your answer, but I already understand this part. What I don't understand is the lambda's purpose; Why can't I just write [0,0] without the 'lambda'? What does it do exactly? – Tom Apr 17 '20 at 18:54
  • Edited the answer to include that explanation. – Rahul Bharadwaj Apr 18 '20 at 05:22