13

I have nested dictionaries:

{'key0': {'attrs': {'entity': 'p', 'hash': '34nj3h43b4n3', 'id': '4130'},
          u'key1': {'attrs': {'entity': 'r',
                              'hash': '34njasd3h43b4n3',
                              'id': '4130-1'},
                    u'key2': {'attrs': {'entity': 'c',
                                        'hash': '34njasd3h43bdsfsd4n3',
                                        'id': '4130-1-1'}}},
          u'key3': {'attrs': {'entity': 'r',
                              'hash': '34njasasasd3h43b4n3',
                              'id': '4130-2'},
                    u'key4': {'attrs': {'entity': 'c',
                                        'hash': '34njawersd3h43bdsfsd4n3',
                                        'id': '4130-2-1'}},
                    u'key5': {'attrs': {'entity': 'c',
                                        'hash': '34njawersd3h43bdsfsd4n3',
                                        'id': '4130-2-2'}}}},
 'someohterthing': 'someothervalue',
 'something': 'somevalue'}
                                                            
      

given an id - one of all the ids like 4130 to 4130-2-2.
whats the easiest way to navigate to the correct dictionary?

If the given id is 4130-2-1 then it should reach the dictionary with key=key5.

No XML approaches please.

Edit(1): The nesting is between 1 to 4 levels, but I know the nesting before I parse.

Edit(2): Fixed the code.

Edit(3): Fixed code again for string values of ids. Please excuse for the confusion created. This is final I hope :)

JV.
  • 2,658
  • 4
  • 24
  • 36
  • for '4130-2-1' you want 'key4', not 'key5' right? 'key5' appears to contain '4130-2-2'. – Josh Petitt Jun 26 '14 at 23:27
  • **See also:** https://stackoverflow.com/questions/7681301/search-for-a-key-in-a-nested-python-dictionary https://stackoverflow.com/a/16508328/42223 – dreftymac Oct 30 '17 at 19:55

7 Answers7

16

If you want to solve the problem in a general way, no matter how many level of nesting you have in your dict, then create a recursive function which will traverse the tree:

def traverse_tree(dictionary, id=None):
    for key, value in dictionary.items():
        if key == 'id':
            if value == id:
                print dictionary
        else:
             traverse_tree(value, id)
    return

>>> traverse_tree({1: {'id': 2}, 2: {'id': 3}}, id=2)
{'id': 2}
Mapad
  • 8,407
  • 5
  • 41
  • 40
  • I have voted you up, don't know how to select 2 answers otherwise I would have selected this one also. :) – JV. Dec 19 '08 at 13:18
15

Your structure is unpleasantly irregular. Here's a version with a Visitor function that traverses the attrs sub-dictionaries.

def walkDict( aDict, visitor, path=() ):
    for  k in aDict:
        if k == 'attrs':
            visitor( path, aDict[k] )
        elif type(aDict[k]) != dict:
            pass
        else:
            walkDict( aDict[k], visitor, path+(k,) )

def printMe( path, element ):
    print path, element

def filterFor( path, element ):
    if element['id'] == '4130-2-2':
        print path, element

You'd use it like this.

walkDict( myDict, filterFor )

This can be turned into a generator instead of a Visitor; it would yield path, aDict[k] instead of invoking the visitor function.

You'd use it in a for loop.

for path, attrDict in walkDictIter( aDict ):
    # process attrDict...
S.Lott
  • 384,516
  • 81
  • 508
  • 779
  • I have a huge collection of these, if you can suggest a better structure with arbitrary level support, ease of insertion and retrieval, that will be great. By the time you figure that structure I will try your solution. Thanks. – JV. Dec 19 '08 at 12:54
  • 3
    @JV: The internal "attrs" dictionaries are ill-advised. Those a candidates for being objects of some defined class, not just anonymous dictionaries. – S.Lott Dec 19 '08 at 13:04
9

This kind of problem is often better solved with proper class definitions, not generic dictionaries.

class ProperObject( object ):
    """A proper class definition for each "attr" dictionary."""
    def __init__( self, path, attrDict ):
        self.path= path
        self.__dict__.update( attrDict )
    def __str__( self ):
        return "path %r, entity %r, hash %r, id %r" % (
            self.path, self.entity, self.hash, self.id )

masterDict= {} 
def builder( path, element ):
    masterDict[path]= ProperObject( path, element )

# Use the Visitor to build ProperObjects for each "attr"
walkDict( myDict, builder )

# Now that we have a simple dictionary of Proper Objects, things are simple
for k,v in masterDict.items():
    if v.id == '4130-2-2':
        print v

Also, now that you have Proper Object definitions, you can do the following

# Create an "index" of your ProperObjects
import collections
byId= collections.defaultdict(list)
for k in masterDict:
    byId[masterDict[k].id].append( masterDict[k] )

# Look up a particular item in the index
print map( str, byId['4130-2-2'] )
S.Lott
  • 384,516
  • 81
  • 508
  • 779
  • If you do a lot of lookups, the cost to transform to Objects and then to an index on 'id' is amortized over the lookups. Building the objects is O(n). Building the index is O(n) and can be done as the objects are being built. Lookup on id is O(1). – S.Lott Dec 19 '08 at 20:56
5

This is an old question but still a top google result, so I'll update:

A friend and myself published a library to solve (very nearly) this exact problem. dpath-python (no relation to the perl dpath module which does similar things).

http://github.com/akesterson/dpath-python

All you would need to do is something like this:

$ easy_install dpath
>>> import dpath.util
>>> results = []
>>> for (path, value) in dpath.util.search(my_dictionary, "*/attrs/entity/4130*", yielded=True):
>>> ... parent = dpath.util.search("/".join(path.split("/")[:-2])
>>> ... results.append(parent)

... that would give you a list of all the dictionary objects that matched your search, i.e., all the objects that had (key = 4130*). The parent bit is a little janky, but it would work.

2

Since recursion is known to be limited in python (see What is the maximum recursion depth in Python, and how to increase it?) I would rather have a loop based answer to this question, so the answer can be adapted to any level of depth in the dictionary. For that, the function

def walkDict( aDict, visitor, path=() ):
    for  k in aDict:
        if k == 'attrs':
            visitor( path, aDict[k] )
        elif type(aDict[k]) != dict:
            pass
        else:
            walkDict( aDict[k], visitor, path+(k,) )

Can be replaced by:

def walkDictLoop(aDict, visitor, path=()):
    toProcess = [(aDict, path)]
    while toProcess:
        dictNode, pathNode = toProcess.pop(0)
        for k in dictNode:
            if k == 'attrs':
                visitor(pathNode, dictNode[k])
            if isinstance(dictNode[k], dict):
                toProcess.append( (dictNode[k], pathNode+(k,)) )
Community
  • 1
  • 1
eguaio
  • 3,754
  • 1
  • 24
  • 38
0

Well, if you have to do it only a few times, you can just use nested dict.iteritems() to find what you are looking for.

If you plan to do it several times, performances will quickly becomes an issue. In that case you could :

  • change the way you data is returned to you to something more suitable.

  • if you can't, convert the data once the fly to a dict between id and keys (using iteritems). Then use it.

Bite code
  • 578,959
  • 113
  • 301
  • 329
  • the idea when we created this structure was to access it through keys - like - key1, key2, etc. Now i stumbled upon a requirement to access thru ids. The second bullet point is a nice suggestion though, will try that. – JV. Dec 19 '08 at 12:23
0

I believe pydash will give you the most efficient way to achieve this.

For example:

data = {'a': {'b': {'c': [0, 0, {'d': [0, {1: 2}]}]}}, 'names': {'first': 'gus', 'second': 'parvez'}}

pydash.get(data, 'a.b.c.2.d.1.[1]')

# output: 2

Detail documentation you can find here: https://pydash.readthedocs.io/en/latest/quickstart.html

Parvez Khan
  • 537
  • 7
  • 15