Python store dictionary path and read back in

Question

I'm looping over a heavily nested dictionary of lists (system information) and storing the complete path to keys in this format:

.children[0].children[9].children[0].children[0].handle = PCI:0000:01:00.0
.children[0].children[9].children[0].children[0].description = Non-Volatile memory controller
.children[0].children[9].children[0].children[0].product = Samsung Electronics Co Ltd
.children[0].children[9].product = Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D DMI2
.children[2].product = PWS-406P-1R

Next, the complete paths are read in and will be compared to the system information (Data). How can I convert the complete path to this format?

Data['children'][0]['children'][9]['children'][0]['children'][0]['handle']
Data['children'][0]['children'][9]['product']
Data['children'][2]['product']

I can do something like:

data = re.findall(r"\.([a-z]+)\[(\d+)\]", key, re.IGNORECASE)

[('children', '0'), ('children', '9'), ('children', '0'), ('children', '0')]
[('children', '0'), ('children', '9'), ('children', '0'), ('children', '0')]
[('children', '0'), ('children', '9'), ('children', '0'), ('children', '0')]
[('children', '0'), ('children', '9')]
[('children', '2')]

How can I convert one of these lists of tuples to be able to do:

if Data['children'][2]['product'] == expected:
    print('pass')

Just to confirm - Your first sample is a *text file* that represents nested data? — Tomalak, Apr 11 '18 at 19:25
Why don't you use JSON? Don't roll your own file formats unless you have a really good reason. JSON is fine for what you are storing there, and the tools to read and write JSON are built into Python. — Tomalak, Apr 11 '18 at 19:36
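
As a rough sketch of the JSON suggestion in the comment above, each path could be stored as a list of keys and indexes next to its expected value, then read back with the standard json module. The names `records` and `walk` and the sample `Data` are hypothetical and only illustrate the idea.

```python
import json

# Hypothetical sample standing in for the real system information.
Data = {'children': [{}, {}, {'product': 'PWS-406P-1R'}]}

# Each record keeps the path as a list of keys/indexes plus the expected value.
records = [{'path': ['children', 2, 'product'], 'expected': 'PWS-406P-1R'}]

text = json.dumps(records)   # this string could be written to a file
loaded = json.loads(text)    # ...and read back in later

def walk(data, path):
    """Follow each key or index in path, one level at a time."""
    for step in path:
        data = data[step]
    return data

results = [walk(Data, r['path']) == r['expected'] for r in loaded]
print(results)
```

Because JSON keeps strings and integers distinct, the list indexes come back as integers and no regex parsing is needed.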

Brendan Abel · Accepted Answer · 2018-04-11T20:15:19.760

You could use the itertools, functools, and operator modules to chain the indexes together and look them up one after another to get the end value.

First, I think you should change the regex so it also picks up the last getter (i.e. handle, description, product), which has no index after it.

re.findall(r"\.([a-z]+)(?:\[(\d+)\])?", key, re.IGNORECASE)

That should give you this

[('children', '0'), ('children', '9'), ('product', '')]

Then you can do something like this to chain the lookups

import operator
import functools
import itertools

indexes = [('children', '0'), ('children', '9'), ('product', '')]

# This turns the list above into a flat list ['children', 0, 'children', ...]
# It also converts number strings to integers and excludes empty strings.
keys = (int(k) if k.isdigit() else k for k in itertools.chain(*indexes) if k)

# functools.reduce applies the lookups one after another, starting from Data
# operator.getitem(Data, key) is the function form of Data[key]
value = functools.reduce(operator.getitem, keys, Data)
if value == expected:
    print('pass')
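
Putting the two steps together, a minimal sketch might look like this. The helper name `lookup` and the sample `Data` are made up for illustration; the regex and the reduce call are the ones from the answer above. It assumes every line has the form `path = expected` and that key names contain only letters.

```python
import functools
import itertools
import operator
import re

# Hypothetical sample data; the real structure comes from the system information.
Data = {'children': [{'children': [{}] * 9 + [{'product': 'Xeon'}]},
                     {},
                     {'product': 'PWS-406P-1R'}]}

def lookup(data, line):
    """Split 'path = expected' and return (value found at path, expected)."""
    path, expected = line.split(' = ', 1)
    pairs = re.findall(r"\.([a-z]+)(?:\[(\d+)\])?", path, re.IGNORECASE)
    # Flatten the pairs, drop empty strings, and turn digit strings into ints.
    keys = (int(k) if k.isdigit() else k for k in itertools.chain(*pairs) if k)
    return functools.reduce(operator.getitem, keys, data), expected

value, expected = lookup(Data, '.children[2].product = PWS-406P-1R')
print('pass' if value == expected else 'fail')
```

A missing key or index raises KeyError or IndexError here, so a caller would probably want to catch those and report the line as a failure.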

score 0 · Answer 2 · answered Apr 11 '18 at 19:41

The simplest thing I can think of right now is:

Code:

import re

s = '.children[2].product = PWS-406P-1R'
# Split first so the substitution cannot touch dots in the expected value
path, expected = s.split(' = ', 1)
path = re.sub(r'\.(\w+)', r"['\1']", path)
Data = {'children': ['', '', {'product': 'PWS-406P-1R'}]}

if eval(f'Data{path}') == expected:
    print('pass')

Output:

pass

Note that f-strings require Python 3.6+. You can switch to .format() if you need to support older versions.
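
For reference, the .format() variant could look roughly like the sketch below. It reuses the sample line and `Data` from this answer; the `matches` name is only there for illustration. Keep in mind that eval executes whatever the path string contains, so this is only reasonable when the path file is trusted.

```python
import re

s = '.children[2].product = PWS-406P-1R'
# Split first so the substitution only touches the path part.
path, expected = s.split(' = ', 1)
path = re.sub(r'\.(\w+)', r"['\1']", path)   # "['children'][2]['product']"
Data = {'children': ['', '', {'product': 'PWS-406P-1R'}]}

# eval() runs arbitrary code, so only use it on trusted input.
matches = eval('Data{}'.format(path)) == expected
print('pass' if matches else 'fail')
```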

score 0 · Answer 3 · answered Apr 11 '18 at 19:43

It could work with a recursive search in the Data structure:

s = """.children[0].children[9].children[0].children[0].handle = PCI:0000:01:00.0
.children[0].children[9].children[0].children[0].description = Non-Volatile memory controller
.children[0].children[9].children[0].children[0].product = Samsung Electronics Co Ltd
.children[0].children[9].product = Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D DMI2
.children[2].product = PWS-406P-1R
"""

import re

Data = {'children': {'0': {'children': {'9': {'children': {'0': {'children': {'0': {'handle': 'value'}}}}}}}}}

for line in s.splitlines():
    l, value = re.split(r'\s*=\s*', line)
    l = l[1:] # Remove first '.'
    keys = re.split(r'[\[\].]+', l)
    print(keys)

    lookup = Data
    for key in keys:
        if key in lookup:
            lookup = lookup[key]
        else:
            print("Key {} not found".format(key))
            raise Exception("Value not found for {}".format(".".join(keys)))

    print("Value found: " + value)

The first split separates the keys from the data (looking for =).

l = l[1:] removes the first '.'.

The second split separates all the fields into a list of keys used to access the data.

Then a loop looks up each key in turn in the Data structure.

`Data['children']`, `Data['children'][n]['children']`, etc. are lists I assume :P but it wasn't defined in the question ^^ — radzak, Apr 11 '18 at 19:47
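
If the `children` values are lists, as the comment above assumes, the same loop could convert a key to an integer whenever the current level is a list. This is a minimal sketch with a made-up `Data` sample and a hypothetical helper named `find`, not a drop-in replacement for the code above.

```python
import re

# Hypothetical sample where 'children' holds lists rather than dicts.
Data = {'children': [{}, {}, {'product': 'PWS-406P-1R'}]}

def find(data, path):
    """Walk a path like '.children[2].product' through dicts and lists."""
    keys = [k for k in re.split(r'[\[\].]+', path) if k]  # drop empty pieces
    lookup = data
    for key in keys:
        try:
            if isinstance(lookup, list):
                key = int(key)        # list positions are integers
            lookup = lookup[key]
        except (KeyError, IndexError, TypeError, ValueError):
            raise KeyError("Value not found for {}".format(path))
    return lookup

line = '.children[2].product = PWS-406P-1R'
path, value = re.split(r'\s*=\s*', line, maxsplit=1)
print(find(Data, path) == value)
```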

score 0 · Answer 4 · answered Apr 11 '18 at 20:39

Here is another regex to attempt doing this:

import re

pattern = re.compile(r'(?:\.(\w+)(?:\[(\d+)\]))|(?:\.(\w+))|(?:\s*=\s*(.+)$)')

path = '.children[0].children[9].children[0].children[0].handle = PCI:0000:01:00.0'

And here is the magic:

>>> [tuple(filter(None, t)) for t in pattern.findall(path)]
[('children', '0'), ('children', '9'), ('children', '0'), ('children', '0'), ('handle',), ('PCI:0000:01:00.0',)]

Take it a step further and flatten the resulting list. (On Python 3, map and filter return iterators, so the result is built as a list of tuples to allow slicing.)

>>> import itertools
>>> keys = [tuple(filter(None, t)) for t in pattern.findall(path)]
>>> flatkeys = list(itertools.chain.from_iterable(map(lambda key: (key[0], int(key[1])) if len(key) > 1 else (key[0],), keys[:-1])))
>>> flatkeys
['children', 0, 'children', 9, 'children', 0, 'children', 0, 'handle']
>>> result = keys[-1][0]
>>> result
'PCI:0000:01:00.0'

Now, borrowing from this answer:

>>> d = dict(children=[dict(children=([{} for _ in range(9)]) + [dict(children=[dict(children=[dict(handle='PCI:0000:01:00.0')])])])])
>>> d
{'children': [{'children': [{}, {}, {}, {}, {}, {}, {}, {}, {}, {'children': [{'children': [{'handle': 'PCI:0000:01:00.0'}]}]}]}]}

>>> from functools import reduce
>>> import operator
>>> assert reduce(operator.getitem, flatkeys, d) == result
