I have a function that takes a key and traverses nested dicts to return the value regardless of its depth. However, I can only get the value to print, not return. I've read the other questions on this issue and have tried (1) implementing yield and (2) appending the value to a list and then returning the list.

def get_item(data,item_key):
    # data=dict, item_key=str
    if isinstance(data,dict):
        if item_key in data.keys():
            print data[item_key]
            return data[item_key]
        else:
            for key in data.keys():
                # recursion
                get_item(data[key],item_key)

item = get_item(data,'aws:RequestId')
print item

Sample data:

data = OrderedDict([(u'aws:UrlInfoResponse', OrderedDict([(u'@xmlns:aws', u'http://alexa.amazonaws.com/doc/2005-10-05/'), (u'aws:Response', OrderedDict([(u'@xmlns:aws', u'http://awis.amazonaws.com/doc/2005-07-11'), (u'aws:OperationRequest', OrderedDict([(u'aws:RequestId', u'4dbbf7ef-ae87-483b-5ff1-852c777be012')])), (u'aws:UrlInfoResult', OrderedDict([(u'aws:Alexa', OrderedDict([(u'aws:TrafficData', OrderedDict([(u'aws:DataUrl', OrderedDict([(u'@type', u'canonical'), ('#text', u'infowars.com/')])), (u'aws:Rank', u'1252')]))]))])), (u'aws:ResponseStatus', OrderedDict([(u'@xmlns:aws', u'http://alexa.amazonaws.com/doc/2005-10-05/'), (u'aws:StatusCode', u'Success')]))]))]))])

When I execute, the desired value prints, but does not return:

>>>52c7e94b-dc76-2dd6-1216-f147d991d6c7
>>>None

What is happening? Why isn't the function breaking and returning the value when it finds it?

Benjamin James

3 Answers

A simple fix: you have to return the value that the recursive call finds in a nested dict. You don't need an explicit else clause because the if branch returns. You also don't need to call .keys():

def get_item(data, item_key):
    if isinstance(data, dict):
        if item_key in data:
            return data[item_key]

        for key in data:
            found = get_item(data[key], item_key)
            if found is not None:  # a bare "if found:" would skip falsy matches such as 0 or ''
                return found
    return None  # Explicit vs Implicit

>>> get_item(data, 'aws:RequestId')
'4dbbf7ef-ae87-483b-5ff1-852c777be012'

One of the design principles of Python is EAFP (Easier to Ask for Forgiveness than Permission), which means that exceptions are used more commonly than in other languages. The above, rewritten in EAFP style:

def get_item(data, item_key):
    try:
        return data[item_key]
    except KeyError:
        for key in data:
            found = get_item(data[key], item_key)
            if found is not None:  # a bare "if found:" would skip falsy matches such as 0 or ''
                return found
    except (TypeError, IndexError):
        pass
    return None
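
One caveat worth illustrating: a truthiness test such as `if found:` treats a stored 0, '' or empty list as "not found", whereas comparing against None does not. The sketch below contrasts the two tests; the function names and the sample dict are made up for illustration and are not part of the answer's code.

```python
def find_truthy(data, item_key):
    """Depth-first search that only accepts truthy matches from deeper levels."""
    if isinstance(data, dict):
        if item_key in data:
            return data[item_key]
        for value in data.values():
            found = find_truthy(value, item_key)
            if found:  # a falsy match (0, '', [], ...) is skipped here
                return found
    return None


def find_not_none(data, item_key):
    """Same search, but any match other than None is accepted."""
    if isinstance(data, dict):
        if item_key in data:
            return data[item_key]
        for value in data.values():
            found = find_not_none(value, item_key)
            if found is not None:
                return found
    return None


# Hypothetical sample: the first match in traversal order is the falsy value 0.
sample = {"first": {"count": 0}, "second": {"count": 7}}
truthy_result = find_truthy(sample, "count")
not_none_result = find_not_none(sample, "count")
print(truthy_result)    # 7 (the 0 under "first" was skipped)
print(not_none_result)  # 0
```
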
AChampion

As other people commented, you need a return statement in the else block, too. You have two if blocks, so you would need two more return statements. Here is code that may do what you want; it collects every matching value into a list:

from collections import OrderedDict

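def get_item(data, item_key):
    # Collect every value stored under item_key, at any depth.
    result = []
    if isinstance(data, dict):
        for key in data:
            if key == item_key:
                print(data[item_key])  # parenthesized form runs on Python 2 and 3
                result.append(data[item_key])
            # recursion: extend the list with matches found deeper down
            result += get_item(data[key], item_key)
    return result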
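
Since the question mentions trying yield, the same collect-everything idea can also be sketched as a generator. This is only an illustration, not the answerer's code: `iter_items` and the sample dict are hypothetical names, and `yield from` assumes Python 3.3 or later (on Python 2 you would loop over the recursive call and re-yield each item).

```python
def iter_items(data, item_key):
    """Yield every value stored under item_key, at any depth of nested dicts."""
    if isinstance(data, dict):
        for key, value in data.items():
            if key == item_key:
                yield value
            # Recurse into the value and re-yield whatever is found deeper down.
            yield from iter_items(value, item_key)


# Hypothetical sample data with the same key at three levels.
sample = {"a": {"id": 1, "b": {"id": 2}}, "id": 0}
all_ids = list(iter_items(sample, "id"))
first_id = next(iter_items(sample, "id"), None)  # None if nothing matches
print(all_ids)   # [1, 2, 0]
print(first_id)  # 1
```

The key point is that a bare recursive call to a generator function does nothing on its own; its items have to be passed along with `yield from` (or an explicit loop), just as a plain recursive call's result has to be returned.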

Your else block needs to return the value if it finds it.
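
A stripped-down sketch of why the original prints the value but still ends up with None: the innermost call does return it, but each outer call discards that result and falls off the end of the function. The function names and the `chain` structure here are invented for illustration only.

```python
def last_value_broken(node):
    """Walk a chain of nested dicts, but drop the recursive result (the bug)."""
    if "next" not in node:
        return node["value"]
    else:
        last_value_broken(node["next"])  # result computed, then discarded


def last_value_fixed(node):
    """Same walk, passing the result back up through every call."""
    if "next" not in node:
        return node["value"]
    return last_value_fixed(node["next"])


chain = {"value": 1, "next": {"value": 2, "next": {"value": 3}}}
broken_result = last_value_broken(chain)
fixed_result = last_value_fixed(chain)
print(broken_result)  # None
print(fixed_result)   # 3
```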

I've made a few other minor changes to your code. You don't need to do

if item_key in data.keys():

Instead, you can simply do

if item_key in data:

Similarly, you don't need

for key in data.keys():

You can iterate directly over a dict (or any class derived from a dict) to iterate over its keys:

for key in data:

Here's my version of your code, which should run on Python 2.7 as well as Python 3.

from __future__ import print_function
from collections import OrderedDict

def get_item(data, item_key):
    if isinstance(data, dict):
        if item_key in data:
            return data[item_key]

        for val in data.values():
            v = get_item(val, item_key)
            if v is not None:
                return v

data = OrderedDict([(u'aws:UrlInfoResponse', 
    OrderedDict([(u'@xmlns:aws', u'http://alexa.amazonaws.com/doc/2005-10-05/'), (u'aws:Response', 
    OrderedDict([(u'@xmlns:aws', u'http://awis.amazonaws.com/doc/2005-07-11'), (u'aws:OperationRequest', 
    OrderedDict([(u'aws:RequestId', u'4dbbf7ef-ae87-483b-5ff1-852c777be012')])), (u'aws:UrlInfoResult', 
    OrderedDict([(u'aws:Alexa', 
    OrderedDict([(u'aws:TrafficData', 
    OrderedDict([(u'aws:DataUrl', 
    OrderedDict([(u'@type', u'canonical'), ('#text', u'infowars.com/')])), 
        (u'aws:Rank', u'1252')]))]))])), (u'aws:ResponseStatus', 
    OrderedDict([(u'@xmlns:aws', u'http://alexa.amazonaws.com/doc/2005-10-05/'), 
        (u'aws:StatusCode', u'Success')]))]))]))])

item = get_item(data, 'aws:RequestId')
print(item)

output

4dbbf7ef-ae87-483b-5ff1-852c777be012

Note that this function returns None if the isinstance(data, dict) test fails, or if the for loop fails to return. It's generally a good idea to ensure that every possible return path in a recursive function has an explicit return statement, as that makes it clearer what's happening, but IMHO it's ok to leave those returns implicit in this fairly simple function.
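
One consequence of returning None for "not found" is that a key whose stored value really is None looks the same as a missing key. If that matters, a private sentinel object is one possible way to tell the two apart. The sketch below is an assumption-laden variation, not part of the code above: `_MISSING`, `_search` and `get_item_or_default` are hypothetical names.

```python
_MISSING = object()  # hypothetical sentinel: can never equal a real stored value


def _search(data, item_key):
    """Return the first value found under item_key, or _MISSING if absent."""
    if isinstance(data, dict):
        if item_key in data:
            return data[item_key]
        for val in data.values():
            found = _search(val, item_key)
            if found is not _MISSING:
                return found
    return _MISSING  # every return path is explicit here


def get_item_or_default(data, item_key, default=None):
    """Public wrapper: translate the sentinel into a caller-chosen default."""
    found = _search(data, item_key)
    return default if found is _MISSING else found


stored_none = get_item_or_default({"a": {"k": None}}, "k", default="fallback")
absent = get_item_or_default({"a": {}}, "k", default="fallback")
print(stored_none)  # None
print(absent)       # fallback
```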

PM 2Ring