88

I have a large list like:

[A][B1][C1]=1
[A][B1][C2]=2
[A][B2]=3
[D][E][F][G]=4

I want to build a multi-level dict like:

A
--B1
-----C1=1
-----C2=1
--B2=3
D
--E
----F
------G=4

I know that if I use recursive defaultdict I can write table[A][B1][C1]=1, table[A][B2]=2, but this works only if I hardcode those insert statement.

While parsing the list, I don't how many []'s I need beforehand to call table[key1][key2][...].

Russia Must Remove Putin
  • 374,368
  • 89
  • 403
  • 331
Wei Shi
  • 4,945
  • 8
  • 49
  • 73
  • Strongly related: https://stackoverflow.com/questions/16547643/convert-a-list-of-delimited-strings-to-a-tree-nested-dict-using-python – cs95 Nov 25 '17 at 09:52

10 Answers10

214

You can do it without even defining a class:

from collections import defaultdict

nested_dict = lambda: defaultdict(nested_dict)
nest = nested_dict()

nest[0][1][2][3][4][5] = 6
Asclepius
  • 57,944
  • 17
  • 167
  • 143
Hugo Walter
  • 3,048
  • 1
  • 16
  • 10
  • 15
    this is sweet! but how about if i want leaves to initialize via a standard (int, list, etc) factory? eg, i want to be able to say: `table[0][1][2][3][4][5] += 1` – rikb Aug 16 '14 at 23:23
  • 1
    is there a way to do the same with a built-in dict and .get() ? – Aleksandr Levchuk Sep 20 '14 at 08:22
  • @rikb: i don't see how without setting a fixed depth (to differentiate the leave-nodes) – Hugo Walter Sep 22 '14 at 07:32
  • 1
    class l(dict): __missing__=lambda a,b:a.setdefault(b,l()) and then continue from table=l() – Hugo Walter Jun 20 '16 at 10:37
  • 1
    PyCharm says it violates PEP 8: "do not assign a lambda expression use a def". Any ways to do this with a function to get rid of the warning? – NaturalBornCamper Feb 09 '19 at 08:48
  • 4
    def nested_dict(): return defaultdict(nested_dict) but i like the lambda version better. it looks a bit more cryptic ;-) – Hugo Walter Mar 12 '19 at 08:04
  • @MonsieurBeilto this gives an intro: https://howchoo.com/g/yjjknjdinmq/nested-defaultdict-python Summary: defaultdict takes an argument: a function that is called if a key doesn't exist when the dict is indexed. The last examples in the link are of providing a function which itself returns a defaultdict which as its argument takes the function dict, returning a normal dict. Repeating this gets some fixed level of nesting. OP's answer is just making this process recursive, so the provided function to the inner default dict is itself our original function, giving arbitrary level of nesting. – Umer Amjad Apr 09 '20 at 22:35
  • What would be "the stopping condition" in this case for the nested_dict? What goes on behind the curtains when working with a lambda? Also, is there an equivalent of this without using lambdas? – M.Ionut Nov 12 '21 at 13:19
20

Your example says that at any level there can be a value, and also a dictionary of sub-elements. That is called a tree, and there are many implementations available for them. This is one:

from collections import defaultdict
class Tree(defaultdict):
    def __init__(self, value=None):
        super(Tree, self).__init__(Tree)
        self.value = value

root = Tree()
root.value = 1
root['a']['b'].value = 3
print root.value
print root['a']['b'].value
print root['c']['d']['f'].value

Outputs:

1
3
None

You could do something similar by writing the input in JSON and using json.load to read it as a structure of nested dictionaries.

Apalala
  • 9,017
  • 3
  • 30
  • 48
  • I think the `value` construct is unnecessary, at least with respect to the proposed problem. Just remove references to `value` and assign values directly to dictionary keys. – Jason R. Coombs Mar 20 '11 at 17:19
  • +1: Although the `value` arg/attribute isn't really necessary. – martineau Mar 20 '11 at 17:31
  • 4
    @Martineau @Jason. The `value` instance variable is necessary because otherwise you'd loose the children when you assign directly to a node (see my comment to Jason's elegant solution). Intervening `__setitem__` would provide for a much more robust solution, but it would be a too complicated solution to the simple requirements. – Apalala Mar 20 '11 at 19:19
  • I was unclear how to modify the other answers to that the collection property was a `list` instead of a `int/float`. This answer makes it clear, where `self.value = []` was exactly what I was looking for! – benjaminmgross Feb 27 '18 at 02:58
14

I think the simplest implementation of a recursive dictionary is this. Only leaf nodes can contain values.

# Define recursive dictionary
from collections import defaultdict
tree = lambda: defaultdict(tree)

Usage:

# Create instance
mydict = tree()

mydict['a'] = 1
mydict['b']['a'] = 2
mydict['c']
mydict['d']['a']['b'] = 0

# Print
import prettyprint
prettyprint.pp(mydict)

Output:

{
  "a": 1, 
  "b": {
    "a": 1
  }, 
  "c": {},
  "d": {
    "a": {
      "b": 0
    }
  }
}
Bouke Versteegh
  • 4,097
  • 1
  • 39
  • 35
11

I'd do it with a subclass of dict that defines __missing__:

>>> class NestedDict(dict):
...     def __missing__(self, key):
...             self[key] = NestedDict()
...             return self[key]
...
>>> table = NestedDict()
>>> table['A']['B1']['C1'] = 1
>>> table
{'A': {'B1': {'C1': 1}}}

You can't do it directly with defaultdict because defaultdict expects the factory function at initialization time, but at initialization time, there's no way to describe the same defaultdict. The above construct does the same thing that default dict does, but since it's a named class (NestedDict), it can reference itself as missing keys are encountered. It is also possible to subclass defaultdict and override __init__.

Jason R. Coombs
  • 41,115
  • 10
  • 83
  • 93
  • That's not enough. You'll get an error if you try `table['A']['B1']['C1']['D2'] = 2`. The nodes must be able to hold a value **and** the children. – Apalala Mar 20 '11 at 17:22
  • 3
    @Apalala: Actually, from the OP's example input, it appears that nodes only need to be able to hold a value **or** children, never both -- which is why @Jason and I claimed your answer's `value` attribute was unnecessary. – martineau Mar 20 '11 at 19:52
  • @martinau MHO is that it all becomes unstable (bug-prone) unless it is solved as a tree. Syntax and implementation are irrelevant. Is it, or is it not a problem that requires a tree structure? My point is that one should not force a design towards a _pretty_ syntax unless there are compelling reasons to do it. KISS. – Apalala Mar 21 '11 at 00:32
  • @Apalala I know this is old. but how do we implement a `defaultdict` that holds both values and children? – Halcyon Abraham Ramirez Dec 28 '16 at 13:45
  • @HalcyonAbrahamRamirez Look at Apalala's answer in this same question. – Jason R. Coombs Dec 29 '16 at 15:02
6

This is equivalent to the above, but avoiding lambda notation. Perhaps easier to read ?

def dict_factory():
   return defaultdict(dict_factory)

your_dict = dict_factory()

Also -- from the comments -- if you'd like to update from an existing dict, you can simply call

your_dict[0][1][2].update({"some_key":"some_value"})

In order to add values to the dict.

gabe
  • 2,521
  • 2
  • 25
  • 37
  • This solution does not offer the ability to pass an initial value. I think Dan O'Huiginn's solution (via Dvd Avins post) is slightly better for this reason. – Scott P. Sep 25 '20 at 20:38
4

Dan O'Huiginn posted a very nice solution on his journal in 2010:

http://ohuiginn.net/mt/2010/07/nested_dictionaries_in_python.html

>>> class NestedDict(dict):
...     def __getitem__(self, key):
...         if key in self: return self.get(key)
...         return self.setdefault(key, NestedDict())


>>> eggs = NestedDict()
>>> eggs[1][2][3][4][5]
{}
>>> eggs
{1: {2: {3: {4: {5: {}}}}}}
Dvd Avins
  • 438
  • 5
  • 10
  • 1
    I find this approach nice when I want to create a nested dictionary quickly. If I want to "re-enable" `KeyError`, it's easy to convert back to a standard dictionary using `dict()`. – JS. Nov 13 '17 at 23:18
  • `return self.setdefault(key, NestedDict())` is sufficient. No need for the if. – Scott P. Sep 25 '20 at 20:26
3

You may achieve this with a recursive defaultdict.

from collections import defaultdict

def tree():
    def the_tree():
        return defaultdict(the_tree)
    return the_tree()

It is important to protect the default factory name, the_tree here, in a closure ("private" local function scope). Avoid using a one-liner lambda version, which is bugged due to Python's late binding closures, and implement this with a def instead.

The accepted answer, using a lambda, has a flaw where instances must rely on the nested_dict name existing in an outer scope. If for whatever reason the factory name can not be resolved (e.g. it was rebound or deleted) then pre-existing instances will also become subtly broken:

>>> nested_dict = lambda: defaultdict(nested_dict)
>>> nest = nested_dict()
>>> nest[0][1][2][3][4][6] = 7
>>> del nested_dict
>>> nest[8][9] = 10
# NameError: name 'nested_dict' is not defined
wim
  • 338,267
  • 99
  • 616
  • 750
2

A slightly different possibility that allows regular dictionary initialization:

from collections import defaultdict

def superdict(arg=()):
    update = lambda obj, arg: obj.update(arg) or obj
    return update(defaultdict(superdict), arg)

Example:

>>> d = {"a":1}
>>> sd = superdict(d)
>>> sd["b"]["c"] = 2
Vincent
  • 12,919
  • 1
  • 42
  • 64
2

To add to @Hugo
To have a max depth:

l=lambda x:defaultdict(lambda:l(x-1)) if x>0 else defaultdict(dict)
arr = l(2)
1

You could use a NestedDict.

from ndicts.ndicts import NestedDict

nd = NestedDict()
nd[0, 1, 2, 3, 4, 5] = 6

The result as a dictionary:

>>> nd.to_dict()
{0: {1: {2: {3: {4: {5: 6}}}}}}

To install ndicts

pip install ndicts
edd313
  • 1,109
  • 7
  • 20