6

I have a list of words and I would like to store them in a nested dictionary.

Here is the sample list:-

words = ['Apple','Ape','Bark','Barn']

The dictionary that I want to create is as follows:-

{'A':{'P':{'E':{},
           'P':{'L':{'E':{}}}}},
 'B':{'A':{'R':{'K':{},'N':{}}}}}

The words are case insensitive.

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
Strommer
  • 313
  • 1
  • 2
  • 10

1 Answers1

7

Use a collections.defaultdict() object instead:

from collections import defaultdict

def tree():
    return defaultdict(tree)

nested = defaultdict(tree)

for word in words:
    node = nested
    for char in word:
        node = node[char.upper()]

Whenever you try to access a key in a defaultdict that does not yet exist, the default factory is called to produce a value for that key, transparently. In the above code the default factory is tree(), which produces another defaultdict() with the same factory, letting you build up a nested set of dictionaries merely by accessing the keys.

Demo:

>>> from collections import defaultdict
>>> def tree():
...     return defaultdict(tree)
... 
>>> nested = defaultdict(tree)
>>> words = ['Apple','Ape','Bark','Barn']
>>> for word in words:
...     node = nested
...     for char in word:
...         node = node[char.upper()]
... 
>>> nested
defaultdict(<function tree at 0x114e62320>, {'A': defaultdict(<function tree at 0x114e62320>, {'P': defaultdict(<function tree at 0x114e62320>, {'P': defaultdict(<function tree at 0x114e62320>, {'L': defaultdict(<function tree at 0x114e62320>, {'E': defaultdict(<function tree at 0x114e62320>, {})})}), 'E': defaultdict(<function tree at 0x114e62320>, {})})}), 'B': defaultdict(<function tree at 0x114e62320>, {'A': defaultdict(<function tree at 0x114e62320>, {'R': defaultdict(<function tree at 0x114e62320>, {'K': defaultdict(<function tree at 0x114e62320>, {}), 'N': defaultdict(<function tree at 0x114e62320>, {})})})})})
>>> def print_nested(d, indent=0):
...     for k, v in d.iteritems():
...         print '{}{!r}:'.format(indent * '  ', k)
...         print_nested(v, indent + 1)
... 
>>> print_nested(nested)
'A':
  'P':
    'P':
      'L':
        'E':
    'E':
'B':
  'A':
    'R':
      'K':
      'N':

A defaultdict is a subclass of the standard Python dictionary and apart from materializing values for keys automatically, behaves exactly like a regular dictionary.

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
  • Thanks for helping out in this. I would also appreciate if you could tell me an efficient a way to traverse this, so that I can check something like words starting with, say, 'bar'. – Strommer Oct 20 '13 at 20:40
  • I wrote about traversing nested ducts before; see http://stackoverflow.com/a/14692747 and http://stackoverflow.com/a/17797566 for examples. – Martijn Pieters Oct 20 '13 at 21:39