
I would like to get the paths of all the keys of a nested dict as a list. For example, if my dict looks like this:

{
"persons": [{
    "id": "f4d322fa8f552",
    "address": {
        "building": "710",
        "coord": "[123, 465]",
        "street": "Avenue Road",
        "zipcode": "12345"
    },
    "cuisine": "Chinese",
    "grades": [{
        "date": "2013-03-03T00:00:00.000Z",
        "grade": "B",
        "score": {
          "x": 3,
          "y": 2
        }
    }, {
        "date": "2012-11-23T00:00:00.000Z",
        "grade": "C",
        "score": {
          "x": 1,
          "y": 22
        }
    }],
    "name": "Shash"
}]
}

I would like to get a result like path = [['persons'], ['persons','id'],['persons','address'],['persons','address','building']...], continuing down to the last key.

I tried traversing the entire dict and appending to the path variable, taking inspiration from "Print complete key path for all the values of a python nested dictionary", but I am unable to get the paths of keys that sit inside a list.

Is there another way to do this?

Shash
  • What do you mean get the `path`, do you mean recursively going through the data structure? Can you show what you have tried? – AChampion Jul 17 '18 at 04:21
  • But `id` is actually within a list of `persons`, and not a key of `persons` as a dict. Do you not want the fact that `id` is in a list to be accounted for? Otherwise the path in your expected output would be rather useless. – blhsing Jul 17 '18 at 04:22
  • `d['persons']['id']` would yield an error, so I'm not sure it is a path? – rafaelc Jul 17 '18 at 04:25
  • That's a duplicate: https://stackoverflow.com/questions/34836777/print-complete-key-path-for-all-the-values-of-a-python-nested-dictionary – rafaelc Jul 17 '18 at 04:31

2 Answers


You can describe the data structure recursively, but here is one approach that walks it iteratively with a queue q instead of recursion. It is hard to tell whether this is what you are looking for, because the output includes list indexes, but those can be excluded easily enough:

from collections import deque

def get_paths(d):
    q = deque([(d, [])])   # queue of (node, path to that node) pairs
    while q:
        n, p = q.popleft()   # breadth-first: take the oldest entry
        yield p
        if isinstance(n, dict):
            for k, v in n.items():
                q.append((v, p+[k]))
        elif isinstance(n, list):
            for i, v in enumerate(n):
                q.append((v, p+[i]))   # Change to q.append((v, p)) to remove index

In []:
list(get_paths(d))

Out[]:
[[],
 ['persons'],
 ['persons', 0],
 ['persons', 0, 'id'],
 ['persons', 0, 'address'],
 ['persons', 0, 'cuisine'],
 ['persons', 0, 'grades'],
 ['persons', 0, 'name'],
 ['persons', 0, 'address', 'building'],
 ['persons', 0, 'address', 'coord'],
 ['persons', 0, 'address', 'street'],
 ['persons', 0, 'address', 'zipcode'],
 ['persons', 0, 'grades', 0],
 ['persons', 0, 'grades', 1],
 ['persons', 0, 'grades', 0, 'date'],
 ['persons', 0, 'grades', 0, 'grade'],
 ['persons', 0, 'grades', 0, 'score'],
 ['persons', 0, 'grades', 1, 'date'],
 ['persons', 0, 'grades', 1, 'grade'],
 ['persons', 0, 'grades', 1, 'score'],
 ['persons', 0, 'grades', 0, 'score', 'x'],
 ['persons', 0, 'grades', 0, 'score', 'y'],
 ['persons', 0, 'grades', 1, 'score', 'x'],
 ['persons', 0, 'grades', 1, 'score', 'y']]
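
Keeping the list indexes in each path has a practical upside: the path can be replayed against the original data to fetch the value it points to, whereas an index-free path such as ['persons', 'id'] cannot (indexing the list with 'id' raises a TypeError). A minimal sketch of such a lookup follows. Note that get_by_path is a hypothetical helper name, not part of the answer above, and the dict is a trimmed-down stand-in for the question's data.

```python
from functools import reduce
from operator import getitem

def get_by_path(data, path):
    """Follow each key or list index in `path`, starting from `data`."""
    return reduce(getitem, path, data)

# Trimmed-down version of the question's data, for illustration only
d = {"persons": [{"id": "f4d322fa8f552",
                  "grades": [{"grade": "B"}, {"grade": "C"}]}]}

print(get_by_path(d, ['persons', 0, 'grades', 1, 'grade']))  # C
print(get_by_path(d, []) is d)  # True: the empty path is the root itself
```

This is also why the empty path [] appears first in the output: it refers to the root object.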
AChampion

You can use recursion with a generator function:

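def get_paths(d, current = []):
  for a, b in d.items():
    yield current+[a]   # path to this key
    if isinstance(b, dict):
      yield from get_paths(b, current+[a])
    elif isinstance(b, list):
      for i in b:
        if isinstance(i, dict):   # skip list items without keys (strings, numbers, ...)
          yield from get_paths(i, current+[a])

final_result = list(get_paths(d))
# List items share keys, so drop repeated paths while keeping first-seen order
new_result = [a for i, a in enumerate(final_result) if a not in final_result[:i]]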

Output:

[['persons'], ['persons', 'id'], ['persons', 'address'], ['persons', 'address', 'building'], ['persons', 'address', 'coord'], ['persons', 'address', 'street'], ['persons', 'address', 'zipcode'], ['persons', 'cuisine'], ['persons', 'grades'], ['persons', 'grades', 'date'], ['persons', 'grades', 'grade'], ['persons', 'grades', 'score'], ['persons', 'grades', 'score', 'x'], ['persons', 'grades', 'score', 'y'], ['persons', 'name']]
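
The list comprehension above rescans the earlier results for every path, which is quadratic in the number of paths. As a hedged alternative sketch, duplicates can also be dropped in one pass by converting each path to a tuple and relying on dict key order (guaranteed from Python 3.7). Here unique_paths is a hypothetical helper name and the sample list is made up for illustration.

```python
def unique_paths(paths):
    """Drop repeated paths, keeping the first occurrence of each."""
    # Lists are unhashable, so each path becomes a tuple to serve as a dict key;
    # dict.fromkeys keeps keys in insertion order.
    return [list(p) for p in dict.fromkeys(tuple(p) for p in paths)]

sample = [['persons'], ['persons', 'grades'], ['persons', 'grades', 'date'],
          ['persons', 'grades', 'date'], ['persons', 'name']]
print(unique_paths(sample))
# [['persons'], ['persons', 'grades'], ['persons', 'grades', 'date'], ['persons', 'name']]
```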
Ajax1234
  • Just have to be careful, you can only run this function once: https://stackoverflow.com/questions/1132941/least-astonishment-and-the-mutable-default-argument – AChampion Jul 17 '18 at 04:41