How to avoid KeyError from missing dict keys?

Question

I'm running a script that makes GET requests in a loop. I test for a successful response, then convert the response to a json object and do stuff. My code looks like this:

response = requests.get(url=url)
if response.status_code == 200:
    response_json = response.json()
    <do stuff>
else:
    <handle error>

Sometimes I get a successful response, but for whatever reason the dict comes back with some missing data and my script breaks from a KeyError. I want to make a bulletproof test to avoid errors and retrieve the value of 'high' and 'low' for response_json['Data']['Data'][1]. The structure of the dict is:

{'Data': {'Aggregated': False,
          'Data': [{'close': 0.8062,
                    'conversionType': 'multiply',
                    'high': 0.8084,
                    'low': 0.788,
                    'open': 0.8102,
                    'time': 1428465600,
                    'volumefrom': 145.38,
                    'volumeto': 117.2},
                   {'close': 0.8,
                    'conversionType': 'multiply',
                    'high': 0.8101,
                    'low': 0.8,
                    'open': 0.8062,
                    'time': 1428469200,
                    'volumefrom': 262.39,
                    'volumeto': 209.92}],
          'TimeFrom': 1428465600,
          'TimeTo': 1428469200},
 'HasWarning': False,
 'Message': '',
 'RateLimit': {},
 'Response': 'Success',
 'Type': 100}

And my current best attempt at a test is:

if 'Data' in response_json:
    if 'Data' in 'Data':
        if len(response_json['Data']['Data']) > 1:
            if 'low' and 'high' in response_json['Data']['Data'][1]:
                if 'low' != None and 'high' != None:
                    <do stuff>    
else:
    print("Error:", url)

I believe this covers all bases, but it seems unwieldy. Is there a better way to test for the presence of keys and valid values in this situation, and/or to keep my script running without breaking?

Also wondering if I need an else statement after every nested conditional test, or if Python will default to the else at the bottom if one of the conditions comes back False?

`if 'low' and 'high' in response_json['Data']['Data'][1]` this is also always true - for a different reason: https://stackoverflow.com/questions/6159313/can-python-test-the-membership-of-multiple-values-in-a-list — rdas, Oct 13 '20 at 17:39
How can you be certain it's always true for if `'Data' in 'Data':`? — McQuestion, Oct 13 '20 at 17:39
`if 'low' != None and 'high' != None:` this is also always true - obviously — rdas, Oct 13 '20 at 17:39
Also, `if 'low' and 'high' in response_json['Data']['Data'][1]:` is not testing the presence of `'low'` at all, for roughly the same reasons it doesn't work [here](https://stackoverflow.com/q/15112125/364696). — ShadowRanger, Oct 13 '20 at 17:40
The string `'Data'` is always a substring of the string `'Data'` which is what you're checking in that statement — rdas, Oct 13 '20 at 17:40
I don't see how it's 'always true' that `if 'low' != None and 'high' != None:` ... couldn't one of these values come back `None`, if there's some bug in the API? Same for `'Data' in 'Data'`, could it not come back `None` for the same reason? — McQuestion, Oct 13 '20 at 17:43
@ShadowRanger I need both `'low'` and `'high'`, if either is missing my subsequent code won't work. I understand that if one isn't there, they likely both won't be there, however I still don't understand how I can be certain of that, I can envision a situation where the API did not record one of those values and either left the key out, or recorded the value as `None`... is this not possible? Remember I said a "bulletproof" test. — McQuestion, Oct 13 '20 at 17:53
@McQuestion: Yes, I get that. But `'low' and 'high' in response_json['Data']['Data'][1]` is equivalent to asking if `'low'` is truthy and `'high'` is in `response_json['Data']['Data'][1]`. It's not checking if `'low'` is in anything, just whether it's truthy (which it is, not being the empty string). — ShadowRanger, Oct 13 '20 at 17:58
@rdas: `if 'low' and 'high' in response_json['Data']['Data'][1]` isn't always true; the `'low'` part of the test is always true, but the second part is in fact checked (because they used `and`, not `or`). — ShadowRanger, Oct 13 '20 at 17:59
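
The precedence point made in the comments above can be checked directly. This is a small sketch; the dict `row` is made up for illustration and deliberately has 'high' but lacks 'low':

```python
row = {'high': 0.8101}  # hypothetical entry: 'low' is deliberately missing

# Parsed as: 'low' and ('high' in row). The non-empty string 'low' is truthy,
# so only the membership of 'high' is actually tested.
flawed = bool('low' and 'high' in row)

# Test each key separately, or all of them at once.
correct = 'low' in row and 'high' in row
also_correct = all(key in row for key in ('low', 'high'))

print(flawed, correct, also_correct)  # True False False

# 'Data' in 'Data' is a substring test on two string literals, and
# 'low' != None compares a string literal with None; both are always True.
print('Data' in 'Data', 'low' != None)  # True True
```

To test the actual values rather than the literals, the comparisons would have to be written against `row['low']` and `row['high']`.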

Baltasarq · Answer 1 · 2020-10-15T07:41:22.693

3

Dictionaries support both the [] operator and the get(k) method. As you already know, the [] operator raises KeyError when the key k is not found, while get(k) simply returns None.

d = {'a':1, 'b': 2, 'c':3}
print(str.format("c: '{}'", d.get('c')))    # c: '3'
print(str.format("d: '{}'", d.get('d')))    # d: 'None'

As @sleepyhead also comments, you can pass a default return value along with the key, such as:

print(str.format("d: '{}'", d.get('d', -1))) # d: '-1'

Of course, this is only helpful if you can choose a default that lies outside the domain of valid values.
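
Applied to the nested response in the question, chained get calls with empty-container defaults can stand in for the stacked membership tests. What follows is a minimal sketch rather than a definitive implementation: the function name `get_high_low` and the sample payloads are assumptions made for illustration, and it assumes that whenever a 'Data' key is present it holds a dict (outer level) or a list of dicts (inner level).

```python
def get_high_low(payload, index=1):
    """Return (high, low) from payload['Data']['Data'][index], or None if anything is missing."""
    # Each .get() falls back to an empty container, so the chain never raises KeyError.
    rows = payload.get('Data', {}).get('Data', [])
    if len(rows) <= index:
        return None
    row = rows[index]
    high, low = row.get('high'), row.get('low')
    # Compare with None explicitly so that a legitimate value of 0 is not rejected.
    if high is None or low is None:
        return None
    return high, low


# Hypothetical payloads, trimmed down from the structure shown in the question.
good = {'Data': {'Data': [{'high': 0.8084, 'low': 0.788},
                          {'high': 0.8101, 'low': 0.8}]}}
print(get_high_low(good))                               # (0.8101, 0.8)
print(get_high_low({'Data': {}}))                       # None
print(get_high_low({'Data': {'Data': [{'high': 1}]}}))  # None
```

Returning None for every failure keeps the calling loop simple: one `if result is None` test decides whether to skip the URL.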

edited Oct 15 '20 at 07:41

answered Oct 13 '20 at 17:40

Baltasarq


1

Also, you can provide a default value for get, like so: `d.get('d', 'something')` – sleepyhead Oct 13 '20 at 21:18

score 1 · Answer 2 · answered Oct 13 '20 at 17:46

You can use a try-except block to make sure the script keeps running even when it encounters an error.

You could for example structure it like this:

response = requests.get(url=url)
if response.status_code == 200:
    response_json = response.json()
    try:
        <stuff>
    except KeyError:
        <handle error>  # or just use pass to skip the faulty data
else:
    <handle error>
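
As a concrete sketch of the pattern above for the payload in the question: indexing a list that is too short raises IndexError, and subscripting None raises TypeError, so catching only KeyError may not be enough here. The function name `extract_high_low` is hypothetical.

```python
def extract_high_low(response_json):
    """Return (high, low) for the second entry, or None if the payload is malformed."""
    try:
        row = response_json['Data']['Data'][1]
        high, low = row['high'], row['low']
    except (KeyError, IndexError, TypeError):
        # KeyError: a key is missing; IndexError: fewer than two entries;
        # TypeError: None (or another non-subscriptable value) where a dict or list was expected.
        return None
    # The keys exist, but the API could still have stored None as a value.
    if high is None or low is None:
        return None
    return high, low


print(extract_high_low({'Data': {'Data': []}}))  # None
print(extract_high_low({'Data': None}))          # None
```

Keeping the try body down to the lookups themselves avoids accidentally swallowing a KeyError raised by unrelated code further along.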

score 1 · Accepted Answer · answered Oct 13 '20 at 18:04

I could not find an XPath-style way to look up a nested key in JSON, so the stacked if statements are needed.

You also need to fix your conditions so that they test the actual keys and values rather than string literals.

Try this code:

response_json = {'Data': {'Aggregated': False,
          'Data': [{'close': 0.8062,
                    'conversionType': 'multiply',
                    'high': 0.8084,
                    'low': 0.788,
                    'open': 0.8102,
                    'time': 1428465600,
                    'volumefrom': 145.38,
                    'volumeto': 117.2},
                   {'close': 0.8,
                    'conversionType': 'multiply',
                    'high': 0.8101,
                    'low': 0.8,
                    'open': 0.8062,
                    'time': 1428469200,
                    'volumefrom': 262.39,
                    'volumeto': 209.92}],
          'TimeFrom': 1428465600,
          'TimeTo': 1428469200},
 'HasWarning': False,
 'Message': '',
 'RateLimit': {},
 'Response': 'Success',
 'Type': 100}


low = high = None  # default values
if 'Data' in response_json:
    if 'Data' in response_json['Data']:
        if len(response_json['Data']['Data']) > 1:
            entry = response_json['Data']['Data'][1]
            if 'low' in entry and 'high' in entry:
                # compare with None so that a legitimate value of 0 is not discarded
                if entry['low'] is not None and entry['high'] is not None:
                    low = entry['low']
                    high = entry['high']

if low is not None and high is not None:   # actually only need to check one
    print('<do stuff>', 'low', low)
    print('<do stuff>', 'high', high)
else:
    print("High/Low not found")
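
The standard library offers no path-style lookup for nested JSON, but a few lines can emulate one and replace the stacked tests. This is a minimal sketch; the helper name `dig` and its `default` parameter are hypothetical and not part of any library, and the `sample` payload is a trimmed-down copy of the structure above.

```python
def dig(obj, *path, default=None):
    """Follow a sequence of dict keys and list indexes, returning default on any miss."""
    for step in path:
        try:
            obj = obj[step]
        except (KeyError, IndexError, TypeError):
            # Missing key, index out of range, or a non-subscriptable value on the way down.
            return default
    return obj


sample = {'Data': {'Data': [{'high': 0.8084, 'low': 0.788},
                            {'high': 0.8101, 'low': 0.8}]}}

low = dig(sample, 'Data', 'Data', 1, 'low')
high = dig(sample, 'Data', 'Data', 1, 'high')
print(low, high)                              # 0.8 0.8101
print(dig(sample, 'Data', 'Data', 5, 'low'))  # None
```

A value that is present but stored as None is indistinguishable from a missing one with the default used here, which matches what the question asks for.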
