Updating Variables Pointing to the Same Dictionary in Python

Question

My question emerged when looking here to find how to create a trie in python. The following code was given in the top-voted answer:

>>> _end = '_end_'
>>> 
>>> def make_trie(*words):
...     root = dict()
...     for word in words:
...         current_dict = root
...         for letter in word:
...             current_dict = current_dict.setdefault(letter, {})
...         current_dict[_end] = _end
...     return root
... 
>>> make_trie('foo', 'bar', 'baz', 'barz')
{'b': {'a': {'r': {'_end_': '_end_', 'z': {'_end_': '_end_'}}, 
'z': {'_end_': '_end_'}}}, 'f': {'o': {'o': {'_end_': '_end_'}}}}

I don't understand what purpose the line "current_dict = root" serves; seems like deleting that line and substituting all current_dict with root would do the same thing. (This same thought is expressed in this reply but with no answer.) I know this actually doesn't work as I tried it and an empty dictionary was returned.

I also tried putting print statements in the second for loop to see how current_dict and root were updated. I thought that since they were set to be equal, they referred to the same dictionary and would be updated simultaneously, but that wasn't the case.

Clearly, I have a fundamental misunderstanding of this interaction. Help?

score 0 · Accepted Answer · answered May 10 '17 at 12:04

0

You have to reset current_dict = root for each single word, because current_dict = current_dict.setdefault(letter, {}) sets current_dict to a new empty dictionary or existing sub-dictionary of root if the key is already in the dict.

dict.setdefault(k, d) is a bit tricky because it does two things at the same time. It works just like dict.get and returns the value of the key k if it exists, otherwise the default value d. And if the key doesn't exist it also inserts it with d as the value.

So, as you see, current_dict is not always the root dict, but also refers to the sub-dicts when you're iterating over the letters in the word, and you have to reset it to root to start again at the top level.

answered May 10 '17 at 12:04

skrx

19,980
5
34
48

Thanks for the answer! So before the first iteration of the second for loop, 'current_dict = root = {}'. So far so good. After the first iteration, 'current_dict = {}' and 'root = {'f': {}}'. This part still confuses me. My understanding is that 'setdefault()' first changes the value of what 'current_dict' is pointing to (the same thing that 'root' is pointing to, which is why 'root' is updated) and then returns a value, which 'current_dict' is now set to point to, in this case the '{}' in 'root' with key 'f'. – tobakudan May 10 '17 at 13:44
But why doesn't 'current_dict' merely become an empty dictionary, not the specific one in 'root' with key 'f'? Because 'setdefault()' just returns '{}', not the particular '{}' in 'root' with key 'f', right? – tobakudan May 10 '17 at 13:44
`current_dict` is just a name that is attached to an object. At the start of the function both names `root` and `current_dict` are attached to the same "root" dict. Then setdefault inserts a new key and empty dict and also returns this empty dict and assigns it to the name `current_dict`. Now the two names `root` and `current_dict` refer to different objects: `root` still refers to the root dict and `current_dict` refers to the new empty sub-dict that has just been inserted. – skrx May 10 '17 at 15:28
Ned Batchelder gave a nice talk about this topic (if the names/values are the reason for the confusion): [Facts and Myths about Python names and values](https://www.youtube.com/watch?v=_AEJHKGk9ns) – skrx May 10 '17 at 16:26

Updating Variables Pointing to the Same Dictionary in Python

1 Answers1