
I have a dictionary like:

{
   "checksum": "b884cbfb1a6697fa9b9eea9cb2054183",
   "roots": {
      "bookmark_bar": {
         "children": [ {
            "date_added": "12989159740428363",
            "id": "4",
            "name": "test2",
            "type": "url",
            "url": "chrome://bookmarks/#1"
         } ],
         "date_added": "12989159700896551",
         "date_modified": "12989159740428363",
         "id": "1",
         "name": "bookmark_bar",
         "type": "folder"
      },
      "other": {
         "children": [ {
            "date_added": "12989159740428363",
            "id": "4",
            "name": "test",
            "type": "url",
            "url": "chrome://bookmarks/#1"
         } ],
         "date_added": "12989159700896557",
         "date_modified": "0",
         "id": "2",
         "name": "aaa",
         "type": "folder"
      },
      "synced": {
         "children": [  ],
         "date_added": "12989159700896558",
         "date_modified": "0",
         "id": "3",
         "name": "bbb",
         "type": "folder"
      }
   },
   "version": 1
}

Everything starts at 'roots'. There are two types of data, URL and folder, and both are dictionaries. A folder must have the key 'children', whose value is a list that can hold more URLs and folders.

Now I want to traverse this nested dictionary to get the URLs in all sub-folders, so I wrote a function:

def traverse(dic):
    for i in dic:
        if i['type'] == 'folder':
            for j in traverse(i['children']):
                yield j
        elif i['type'] == 'url':
            yield i

and I can use it like this:

traverse(dictionary['roots']['bookmark_bar']['children'])

It works perfectly, but it only yields the dictionary for each URL, so I don't know where that URL sits in the tree. I want to get the path too. How can I do that?

dreftymac
比尔盖子
  • Could you please format the dictionary using indentation? And could you please remove everything from it that's not necessary to understand your question? –  Aug 13 '12 at 07:35
  • The dictionary is readable now. – 比尔盖子 Aug 13 '12 at 07:45
  • **See also:** https://stackoverflow.com/questions/7681301/search-for-a-key-in-a-nested-python-dictionary https://stackoverflow.com/a/16508328/42223 – dreftymac Oct 30 '17 at 19:56

2 Answers


I had a slightly different use case from yours: I needed to flatten a variable-depth JSON structure representing client settings into key-value pairs for storage in a database. I couldn't get jsbueno's answer to work, and since I also needed something that could handle cases where children are not explicitly listed or contained, I modified it to suit my needs:

def traverse(dic, path=None):
    if not path:
        path = []
    if isinstance(dic, dict):
        for x in dic.keys():
            local_path = path[:]
            local_path.append(x)
            for b in traverse(dic[x], local_path):
                yield b
    else:
        yield path, dic

The end result is that I can pass a JSON string like this (with variable depth) to my script, which converts it to nested dicts:

{
  "servers": {
    "uat": {
      "pkey": true,
      "user": "testval",
      "pass": true
    },
    "dev": {
      "pkey": true,
      "user": "testval",
      "pass": true
    }
  }
}

Running the generator above against it creates a list that pretty-prints like this:

([u'servers', u'uat', u'pkey'], True)
([u'servers', u'uat', u'user'], u'testval')
([u'servers', u'uat', u'pass'], True)
([u'servers', u'dev', u'pkey'], True)
([u'servers', u'dev', u'user'], u'testval')
([u'servers', u'dev', u'pass'], True)

Which, using something like:

from pprint import pprint

for x in traverse(outobj):
    pprint(('.'.join(x[0]), x[1]))

can then be transformed into the key-value pair format I want like so:

(u'servers.uat.pkey', True)
(u'servers.uat.user', u'testval')
(u'servers.uat.pass', True)
(u'servers.dev.pkey', True)
(u'servers.dev.user', u'testval')
(u'servers.dev.pass', True)
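
For anyone on Python 3, here is a minimal sketch of the whole round trip, from JSON string to dotted keys. It assumes the same generator as above, shortened with `yield from`; the `raw` string is a hypothetical, trimmed-down version of the settings example, not my real data.

```python
import json

def traverse(dic, path=None):
    # Same idea as the generator above: yield (path, value) for every leaf.
    if path is None:
        path = []
    if isinstance(dic, dict):
        for key in dic:
            # path + [key] builds a new list, so sibling branches never share state
            yield from traverse(dic[key], path + [key])
    else:
        yield path, dic

# Hypothetical input, trimmed from the settings example above.
raw = '{"servers": {"uat": {"pkey": true, "user": "testval"}}}'

# Join each path with dots to get flat key-value pairs.
flat = {'.'.join(path): value for path, value in traverse(json.loads(raw))}
print(flat)  # {'servers.uat.pkey': True, 'servers.uat.user': 'testval'}
```

Collecting into a dict only makes sense if the dotted keys are unique, which should hold as long as no key in the source contains a dot itself.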

I know I'm posting this long after the answer was accepted, but since the accepted answer didn't work for me, maybe this slightly more structure-agnostic version will help someone else!
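
One caveat: the generator above only descends into dicts, so a list such as 'children' in the question's bookmark data comes back as a single leaf. Below is a rough Python 3 sketch of a variant that also walks lists and records the list index in the path. The name `traverse_all` and the sample `bookmarks` dict are hypothetical, made up for illustration.

```python
def traverse_all(node, path=()):
    """Yield (path, value) for every leaf, descending into dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from traverse_all(value, path + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            # List positions become integer path components.
            yield from traverse_all(value, path + (index,))
    else:
        yield path, node

# Hypothetical sample, cut down from the bookmark data in the question.
bookmarks = {
    "roots": {
        "other": {
            "children": [{"name": "test", "type": "url",
                          "url": "chrome://bookmarks/#1"}],
            "name": "aaa",
            "type": "folder",
        }
    }
}

# Keep only the leaves stored under a "url" key.
urls = [(path, value) for path, value in traverse_all(bookmarks)
        if path and path[-1] == "url"]
print(urls)
# [(('roots', 'other', 'children', 0, 'url'), 'chrome://bookmarks/#1')]
```

Because the path mixes strings and integers, convert each component with `str()` before joining it into a dotted key.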

Ketzak
  • And it did help someone else!! Thanks a lot :) I couldn't find the right way to recursively save the entire path to each value. The local_path=path[:] was the key (pun intended :) – Romain Jun 26 '20 at 07:47

Not sure if I got what you want, but you might want to do this:

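def traverse(dic, path=None):
    if path is None:
        path = []
    for i in dic:
        # list.append() returns None, so build the new list with + instead.
        # Assumption: the 'name' of each item is what should go into the path.
        local_path = path + [i['name']]
        if i['type'] == 'folder':
            # The recursive call already yields (item, path) pairs; pass them on.
            for pair in traverse(i['children'], local_path):
                yield pair
        elif i['type'] == 'url':
            yield i, local_path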

Now your function yields each item together with the sequence of keys that leads to it.
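
As a self-contained illustration for Python 3, here is a rough sketch that pairs each URL with the names of the folders above it. The function name `urls_with_folders` and the sample `children` list (including the second URL) are hypothetical; it assumes every item has 'type' and 'name', and every folder has 'children', as in the question's data.

```python
def urls_with_folders(items, folders=()):
    """Yield (url_item, folder_names) for every URL at or below items."""
    for item in items:
        if item["type"] == "folder":
            # Descend, extending the folder trail with this folder's name.
            yield from urls_with_folders(item["children"],
                                         folders + (item["name"],))
        elif item["type"] == "url":
            yield item, folders

# Hypothetical sample: one URL at the top level, one nested in a folder.
children = [
    {"type": "url", "name": "test2", "url": "chrome://bookmarks/#1"},
    {"type": "folder", "name": "work", "children": [
        {"type": "url", "name": "docs", "url": "chrome://bookmarks/#2"},
    ]},
]

paths = ["/".join(folders + (item["name"],))
         for item, folders in urls_with_folders(children)]
print(paths)  # ['test2', 'work/docs']
```

Using a tuple for `folders` means each recursive call gets its own trail, so there is no need to copy a list before appending.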

jsbueno
  • I was thinking along these lines. This will still need work, as it will raise a `TypeError` at `i['type']` when `i` is not a dict (which is often :-) ) – azhrei Aug 14 '12 at 00:54
  • Great. It doesn't work perfectly, but it gave me an idea. Thanks! – 比尔盖子 Aug 14 '12 at 14:34