
I have the following code:

from collections import defaultdict
nested_dict = lambda: defaultdict(nested_dict)
data = nested_dict()

which enables me to write any new "path" in the dict as a one-liner:

data['A']['B']['C']=3

That is what I want. But I also want to get an exception when reading any non-existent path, for example:

var = data['A']['XXX']['C']

I feel I need defaultdict when writing, plain dict when reading...

Or, is there a simple nice way to check if a 'path' exists in a defaultdict without modifying its contents...

I tried converting the defaultdict back to a dict before the lookup, hoping that:

dict(data)['A']['XXX']['C']

would raise an exception... but it kept creating missing keys...

user1159290
  • `dict(data)` is a *shallow* copy. – Martijn Pieters Jan 08 '19 at 17:57
  • None of the objects involved have any way to distinguish between `var = data['A']['XXX']['C']`, which you want an exception for, and the first steps of `data['A']['XXX']['C']['D'] = 4`, which shouldn't throw. – user2357112 Jan 08 '19 at 18:04

2 Answers


An obvious solution is to just use plain dicts with a function that can "materialize" the intermediate keys:

def write_path(d, path, value):
    for key in path[:-1]:
        d = d.setdefault(key, {})
    d[path[-1]] = value

d = {}

write_path(d, ['a', 'b', 'c'], 3)
print(d)
print(d['a']['b']['c'])
print(d['a']['b']['d'])

outputs

{'a': {'b': {'c': 3}}}
3
Traceback (most recent call last):
  File "writedefaultdict.py", line 11, in <module>
    print(d['a']['b']['d'])
KeyError: 'd'
AKX

You can't distinguish between lookups and writes here, because it is the lookups that create your intermediary structure in the data['A']['B']['C'] = 3 assignment. Python executes the indexing operations data['A'] and then ['B'] first, before assigning to the 'C' key. The __getitem__, __setitem__ and __missing__ hooks involved are not given enough context to tell an access that leads to the 'C' assignment apart from one that only reads 'XXX', as in your second example.
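
To make this visible, here is a minimal sketch that logs which hooks fire. The TracingDict class, the calls list and the nested() helper are hypothetical names introduced only for this illustration; they are not part of the question's code.

```python
from collections import defaultdict

calls = []  # hypothetical log of hook invocations, for illustration only


class TracingDict(defaultdict):
    """defaultdict subclass that records __getitem__ and __missing__ calls."""

    def __getitem__(self, key):
        calls.append(("getitem", key))
        return super().__getitem__(key)

    def __missing__(self, key):
        calls.append(("missing", key))
        return super().__missing__(key)


def nested():
    # Same recursive idea as nested_dict in the question
    return TracingDict(nested)


data = nested()

# The "write": two lookups happen before 'C' is assigned
data['A']['B']['C'] = 3
write_calls = list(calls)

# The "read" of a non-existing path
calls.clear()
var = data['A']['XXX']['C']
read_calls = list(calls)

print(write_calls)
# [('getitem', 'A'), ('missing', 'A'), ('getitem', 'B'), ('missing', 'B')]
print(read_calls)
# [('getitem', 'A'), ('getitem', 'XXX'), ('missing', 'XXX'), ('getitem', 'C'), ('missing', 'C')]
```

In both cases each hook only ever sees a single key being looked up, which is why it cannot know whether an assignment follows.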

You really only have 3 options here:

  • Don't use defaultdict. When writing, explicitly create new nested dictionaries with dict.setdefault() instead; you can chain these calls as needed:

    var = {}
    var.setdefault('A', {}).setdefault('B', {})['C'] = 3
    

    or you can wrap recursive behaviour in a few functions.

  • Create a recursive copy of your defaultdict structure to replace it with a dict structure once you are done writing:

    def dd_to_d(dd):
        r = {}
        stack = [(r, dd)]
        while stack:
            target, dd = stack.pop()
            for key, value in dd.items():
                if isinstance(value, defaultdict):
                    sub = {}
                    stack.append((sub, value))
                    value = sub
                target[key] = value
        return r
    
    var = dd_to_d(var)
    
  • Set all the default_factory attributes to None to disable creating new values for missing keys:

    def disable_dd(dd):
        stack = [dd]
        while stack:
            dd = stack.pop()
            dd.default_factory = None
            for key, value in dd.items():
                if isinstance(value, defaultdict):
                    stack.append(value)
    
    disable_dd(var)
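
If writes and lookups are interleaved, another possibility along the lines of "wrap the behaviour in a few functions" is to keep the defaultdict for writing and do the reads through a helper that tests membership before indexing. This is only a sketch under that assumption; read_path and path_exists are hypothetical helper names, not standard library functions.

```python
from collections import defaultdict


def read_path(d, path):
    """Return the value at path; raise KeyError without creating any keys."""
    for key in path:
        # 'in' uses __contains__, which never triggers __missing__
        if not isinstance(d, dict) or key not in d:
            raise KeyError(key)
        d = d[key]
    return d


def path_exists(d, path):
    """Return True if every key along path is present."""
    try:
        read_path(d, path)
    except KeyError:
        return False
    return True


nested_dict = lambda: defaultdict(nested_dict)
data = nested_dict()
data['A']['B']['C'] = 3          # writing still works as a one-liner

print(read_path(data, ['A', 'B', 'C']))      # 3
print(path_exists(data, ['A', 'XXX', 'C']))  # False
print('XXX' in data['A'])                    # False, nothing was created
```

The trade-off is that reads no longer use plain subscript syntax, but the structure does not have to be copied or frozen between writes.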
    
Martijn Pieters
  • Thx. These are just workarounds... But maybe the problem has no better solution. I'll mark the answer as useful (to give credit for your effort), but leave room for a final solution if someone ever comes up with a nicer proposal. – user1159290 Jan 09 '19 at 07:15
  • @user1159290: I'd not call this a work-around. If you are using defaultdict, then your only options are to copy the structure over to dictionaries or disable the factory. The alternative option is to not use defaultdict, with `dict` being the obvious choice in that case. – Martijn Pieters Jan 09 '19 at 12:59
  • OK. No one having any better solution to suggest, I have accepted your 'solution'. But really it is not as good as I had hoped for (though maybe the best possible). In my case multiple assignments and lookups are interleaved, which makes none of the proposed 'solutions' look nice. – user1159290 Jan 10 '19 at 12:52
  • @user1159290: I realised I didn't quite explain why these were your options. I've added that to the answer now. – Martijn Pieters Jan 10 '19 at 12:59
  • @matijnp : thx. I realized what happened (lookups being done during the assignment) when it happened, of course, but still hoped something I did not know of could be done towards a better solution. Thx anyway – user1159290 Jan 10 '19 at 14:36