148

Is there a more readable way to check if a key buried in a dict exists without checking each level independently?

Let's say I need to get this value buried in an object (example taken from Wikidata):

x = s['mainsnak']['datavalue']['value']['numeric-id']

To make sure that this does not end with a runtime error, it is necessary to check every level, like so:

if 'mainsnak' in s and 'datavalue' in s['mainsnak'] and 'value' in s['mainsnak']['datavalue'] and 'numeric-id' in s['mainsnak']['datavalue']['value']:
    x = s['mainsnak']['datavalue']['value']['numeric-id']

The other way I can think of is to wrap this in a try/except construct, which also feels rather awkward for such a simple task.

I am looking for something like:

x = exists(s['mainsnak']['datavalue']['value']['numeric-id'])

which returns True if all levels exist.

martineau
loomi

19 Answers

219

To be brief: with Python, you must trust that it is easier to ask for forgiveness than permission.

try:
    x = s['mainsnak']['datavalue']['value']['numeric-id']
except KeyError:
    pass

The answer

Here is how I deal with nested dict keys:

def keys_exists(element, *keys):
    '''
    Check if *keys (nested) exists in `element` (dict).
    '''
    if not isinstance(element, dict):
        raise AttributeError('keys_exists() expects dict as first argument.')
    if len(keys) == 0:
        raise AttributeError('keys_exists() expects at least two arguments, one given.')

    _element = element
    for key in keys:
        try:
            _element = _element[key]
        except KeyError:
            return False
    return True

Example:

data = {
    "spam": {
        "egg": {
            "bacon": "Well..",
            "sausages": "Spam egg sausages and spam",
            "spam": "does not have much spam in it"
        }
    }
}

print('spam (exists): {}'.format(keys_exists(data, "spam")))
print('spam > bacon (does not exist): {}'.format(keys_exists(data, "spam", "bacon")))
print('spam > egg (exists): {}'.format(keys_exists(data, "spam", "egg")))
print('spam > egg > bacon (exists): {}'.format(keys_exists(data, "spam", "egg", "bacon")))

Output:

spam (exists): True
spam > bacon (does not exist): False
spam > egg (exists): True
spam > egg > bacon (exists): True

It loops over the given element, testing each key in the given order.

I prefer this to all the variable.get('key', {}) methods I found because it follows EAFP.

The function expects to be called like keys_exists(dict_element_to_test, 'key_level_0', 'key_level_1', 'key_level_n', ...). At least two arguments are required, the element and one key, but you can add as many keys as you want.

If you already have the keys in a list, you can unpack it like this:

expected_keys = ['spam', 'egg', 'bacon']
keys_exists(data, *expected_keys)
Arount
  • Yes, as mentioned this is a valid solution. But imagine a function which is accessing like 10 times such a variable, all the `try except` statements will leave quite a bloat. – loomi Apr 19 '17 at 09:15
  • @loomi You can make a small function with this `try-except` logic and simply call it each time – Chris_Rands Apr 19 '17 at 09:17
  • @loomi wrap it in a function. – juanpa.arrivillaga Apr 19 '17 at 09:18
  • @Chris_Rands Ha! That seems to be tricky, as I can't actually pass the variable to the function ... exists(s['missing-key']) throws already an error! – loomi Apr 19 '17 at 09:29
  • @loomi I updated my answer, I think what you need is a way to check **nested** keys in a dict **dynamically**. If this is what you really want, I advise you to change the title and maybe the content of your question to make it more accurate (for future users). – Arount Apr 19 '17 at 11:56
  • @loomi PS: You can not do stuff like `my_function(element['foo']['bar'])` because you are evaluating `element['foo']['bar']` before going into your function `my_function`. – Arount Apr 19 '17 at 12:04
  • 1
    "In two words, with Python you must trust it is easier to ask for forgiveness than permission" uses a lot more than two words. – user2357112 Jul 13 '17 at 23:40
  • Using exceptions as a means for normal execution path flow control seems to be a bad idea. – Kevin Postlewaite Jul 17 '19 at 20:41
  • @KevinPostlewaite really not, that's how Python is designed and I make it obvious at the very first line: "_easier to ask for forgiveness than permission_", that's the Python "normal" execution path – Arount Jul 28 '19 at 19:28
  • 3
    Great answer, but one thing should be changed: `if type(element) is not dict` to `if not isinstance(element, dict)`. This way it will work for types like OrderedDict as well. – Fonic Jul 29 '19 at 08:42
  • @Arount You're welcome, always good to make a great answer even better :) – Fonic Aug 06 '19 at 19:48
  • In my case, I needed to catch not only `KeyError` but also `TypeError`. @Arount's example will throw an unexpected TypeError with the dict `{"mainsnak": {"datavalue": {"value": "a"}}}`. – Roeniss Aug 18 '22 at 20:20
  • A good example, but I would recommend that this function raise `TypeError` instead of `AttributeError` on unexpected arguments – spagh-eddie Feb 13 '23 at 21:23
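The comments above point out that `exists(s['missing-key'])` cannot work, because the subscript expression is evaluated before the function is ever called. One possible way around this is to pass a zero-argument callable, so the lookup only runs inside the `try` block. This is a minimal sketch: `lookup_succeeds` is a hypothetical helper name, and catching `TypeError` and `IndexError` as well is an assumption based on the comment about non-dict intermediate values.

```python
def lookup_succeeds(getter):
    """Return True if calling `getter()` completes without a lookup error.

    `lookup_succeeds` is a hypothetical helper name, not part of any library.
    """
    try:
        getter()  # the nested lookup is only evaluated here, inside the try
    except (KeyError, TypeError, IndexError):
        return False
    return True


s = {"mainsnak": {"datavalue": {"value": {"numeric-id": 42}}}}

# All levels exist.
print(lookup_succeeds(lambda: s["mainsnak"]["datavalue"]["value"]["numeric-id"]))  # True
# An intermediate key is missing (KeyError).
print(lookup_succeeds(lambda: s["mainsnak"]["missing"]["value"]))  # False
# Subscripting the int 42 raises TypeError.
print(lookup_succeeds(lambda: s["mainsnak"]["datavalue"]["value"]["numeric-id"]["x"]))  # False
```

The trade-off is that any `TypeError` raised inside the lambda is swallowed, including one caused by an unrelated bug, so it is probably best to keep the lambda body to the bare lookup.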
30

You could use .get with defaults:

s.get('mainsnak', {}).get('datavalue', {}).get('value', {}).get('numeric-id')

but this is almost certainly less clear than using try/except.

Daniel Roseman
  • 1
    And whatever you give the last `get` as the default value, it could just happen to be the value of `s['mainsnak']['datavalue']['value']['numeric-id']`. – timgeb Apr 19 '17 at 09:16
  • 13
    I've been using this construct a lot and just got shot in the foot by it. Be cautious when using the example above, because if the "getted" element actually exists and is not a dict (or an object on which you can call `get`) (None in my case), this will end up with `'NoneType' object has no attribute 'get'` or whatever type you have there. – darkless Sep 11 '19 at 09:00
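The pitfall described in the last comment can be reproduced in a few lines. The sample `data` dict below is made up for illustration, and the `or {}` workaround is just one possible mitigation, not something the answer proposes.

```python
data = {"mainsnak": {"datavalue": None}}  # the key exists, but its value is None

# The default `{}` is only used when the key is missing, so `None` leaks through
# and the next `.get` call fails.
try:
    data.get("mainsnak", {}).get("datavalue", {}).get("value")
    outcome = "no error"
except AttributeError as exc:
    outcome = f"AttributeError: {exc}"
print(outcome)

# One possible workaround: fall back to `{}` whenever an intermediate value is falsy.
value = ((data.get("mainsnak") or {}).get("datavalue") or {}).get("value")
print(value)  # None
```

Note that the workaround still fails when an intermediate value is a truthy non-dict such as a non-empty string, and a final `None` remains ambiguous between "missing" and "stored as None".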
16

Python 3.8 +

dictionary = {
    "main_key": {
        "sub_key": "value",
    },
}

if sub_key_value := dictionary.get("main_key", {}).get("sub_key"):
    print(f"The key 'sub_key' exists in dictionary['main_key'] and its value is {sub_key_value}")
else:
    print("Key 'sub_key' doesn't exist or its value is falsy")

Extra

A little but important clarification.

In the previous code block, we verify not only that a key exists in a dictionary but also that its value is truthy. Most of the time, this is what people are really looking for, and I think this is what the OP really wants. However, it is not really the most "correct" answer: if the key exists but its value is falsy (for example False, 0 or an empty string), the above code block will tell us that the key does not exist, which is not true.

So here is a more correct answer:

dictionary = {
    "main_key": {
        "sub_key": False,
    },
}

if "sub_key" in dictionary.get("main_key", {}):
    print(f"The key 'sub_key' exists in dictionary['main_key'] and its value is {dictionary['main_key']['sub_key']}")
else:
    print("Key 'sub_key' doesn't exist")
Lucas Vazquez
  • SyntaxError: invalid syntax at if key_exists := dictionary.get("key_1", {}).get("key_2"): – aysh Aug 20 '20 at 01:43
  • @aysh It's Python **3.8** example – Lucas Vazquez Sep 03 '20 at 15:59
  • 1
    What if `dictionary['main_key']['sub_key'] == False`? You need to explicitly check against the sentinel returned by `get` when the key does not exist, not just assume that `None` is the only falsey value. – chepner Jul 28 '21 at 00:12
  • @chepner Yeah, that's a really good point. I modified my answer. – Lucas Vazquez Jul 28 '21 at 14:05
  • Is it possible to add types for the key value? for example: `if sub_key_value := dictionary.get("main_key", {}).get("sub_key") -> List[str]:` – alexwatever Sep 16 '22 at 07:09
  • Hello @alexwatever A few hours ago I added a comment and I just deleted it to comment on something else. I leave you a pastebin that has 3 different ways to add types to this sentence. Enjoy!!: https://pastebin.com/EFVQMyUu – Lucas Vazquez Sep 17 '22 at 02:29
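The suggestion in the comments to check explicitly against a sentinel can be combined with the walrus operator. This is a small sketch, assuming Python 3.8+; `_MISSING` is a hypothetical name for the sentinel object.

```python
_MISSING = object()  # unique sentinel; no stored value can be identical to it

dictionary = {"main_key": {"sub_key": False}}

# Compare by identity with the sentinel instead of relying on truthiness.
if (value := dictionary.get("main_key", {}).get("sub_key", _MISSING)) is not _MISSING:
    message = f"'sub_key' exists and its value is {value!r}"
else:
    message = "'sub_key' does not exist"

print(message)  # 'sub_key' exists and its value is False
```

Because the comparison uses `is not`, stored values such as `False`, `0`, `None` or `""` are all reported as existing.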
14

Try/except seems to be the most Pythonic way to do that.
The following recursive function should work (it returns None if one of the keys was not found in the dict; note that it consumes the `chain` list passed in):

def exists(obj, chain):
    _key = chain.pop(0)
    if _key in obj:
        return exists(obj[_key], chain) if chain else obj[_key]

myDict ={
    'mainsnak': {
        'datavalue': {
            'value': {
                'numeric-id': 1
            }
        }
    }
}

result = exists(myDict, ['mainsnak', 'datavalue', 'value', 'numeric-id'])
print(result)
>>> 1
Maurice Meyer
  • 1
    How would you do it for arrays, like if 'value' was an array of 'numeric-ids' result = exists(myDict, ['mainsnak', 'datavalue', 'value[0]', 'numeric-id']) ? – Dss Aug 13 '19 at 20:13
  • @Maurice Meyer : What if 'mainsnak2' , 'mainsnak3' and so on exists (like 'mainsnak', inner dictionary remains same). In that case, can we check if 'datavalue exists' in all 'mainsnak','mainsnak2' & 'mainsnak3' ? – StackGuru Jul 10 '20 at 15:48
  • 1
    doesn't work if `numeric-id` is `None`, we won't be sure if the value is `None` or key is missing. https://stackoverflow.com/a/43491315/86258 is better – srs May 14 '21 at 21:01
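Two points raised in the comments, list elements in the path and a stored `None` being indistinguishable from a missing key, could be addressed along the following lines. This is only a sketch under assumptions: `lookup` is a hypothetical name, list positions are given as plain integers rather than the `'value[0]'` string syntax from the comment, and the function returns a `(found, value)` tuple.

```python
from collections.abc import Mapping, Sequence


def lookup(obj, path):
    """Walk `path` through nested dicts and lists.

    Returns a `(found, value)` tuple so that a stored `None` can be told
    apart from a missing key. `lookup` is a hypothetical name.
    """
    current = obj
    for key in path:  # iterate without mutating the caller's list
        if isinstance(current, Mapping):
            if key not in current:
                return False, None
            current = current[key]
        elif isinstance(current, Sequence) and not isinstance(current, (str, bytes)):
            # list-like level: only integer positions within range count as present
            if not isinstance(key, int) or not (-len(current) <= key < len(current)):
                return False, None
            current = current[key]
        else:
            return False, None
    return True, current


my_dict = {"mainsnak": {"datavalue": {"value": [{"numeric-id": None}]}}}
path = ["mainsnak", "datavalue", "value", 0, "numeric-id"]

print(lookup(my_dict, path))                                    # (True, None)
print(lookup(my_dict, ["mainsnak", "datavalue", "value", 1]))   # (False, None)
print(len(path))                                                # 5, the path is left intact
```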
11

I suggest you use python-benedict, a solid Python dict subclass with full keypath support and many utility methods.

You just need to cast your existing dict:

from benedict import benedict
s = benedict(s)

Now your dict has full keypath support and you can check if the key exists in the pythonic way, using the in operator:

if 'mainsnak.datavalue.value.numeric-id' in s:
    # do stuff

Here is the library repository and documentation: https://github.com/fabiocaccamo/python-benedict

Note: I am the author of this project

Martijn Pieters
Fabio Caccamo
  • It's a great library but causes often name conflicts with BeneDict. I have to search for an alternative as it simply was unusable in my environment. – Andreas Oct 25 '21 at 21:35
  • This module is registered on PyPI as `python-benedict`. Probably your IDE assumes that the name of the package to install matches the name of the module you are importing, but this is wrong. I suggest you take full control of what you do and install requirements manually :) – Fabio Caccamo Oct 26 '21 at 08:11
  • @FabioCaccamo Thanks for this. If possible, can you please name some advantages/disadvantages of your repository over `pydash` recommended by @Alexander here? (mostly in terms of performance/memory) – Michel Gokan Khan Feb 26 '22 at 10:27
  • @MichelGokanKhan frankly I don't know/use `pydash`, so I can't say, but if you try both let me know! – Fabio Caccamo Feb 28 '22 at 08:37
6

You can use pydash to check whether a path exists: http://pydash.readthedocs.io/en/latest/api.html#pydash.objects.has

Or to get the value (you can even set a default to return if it doesn't exist): http://pydash.readthedocs.io/en/latest/api.html#pydash.objects.get

Here is an example:

>>> from pydash import get
>>> get({'a': {'b': {'c': [1, 2, 3, 4]}}}, 'a.b.c[1]')
2
Alexander
4

The try/except way is the cleanest, no contest. However, it also counts as an exception in my IDE, which halts execution while debugging.

Furthermore, I do not like using exceptions as in-method control statements, which is essentially what is happening with the try/except.

Here is a short solution which does not use recursion, and supports a default value:

def chained_dict_lookup(lookup_dict, keys, default=None):
    _current_level = lookup_dict
    for key in keys:
        if key in _current_level:
            _current_level = _current_level[key]
        else:
            return default
    return _current_level
Houen
  • I like this solution :) ... Just a note here. at some point `current_level[key]` can point to a value and not an inner dict. So anyone using this, take care to check that `current_level` is not string, or a float or something. – Jordan Simba Sep 25 '20 at 21:11
4

The accepted answer is a good one, but here is another approach. It's a little less typing and a little easier on the eyes (in my opinion) if you end up having to do this a lot. It also doesn't require any additional package dependencies like some of the other answers. I have not compared performance.

import functools

def haskey(d, path):
    try:
        functools.reduce(lambda x, y: x[y], path.split("."), d)
        return True
    except KeyError:
        return False

# Throwing in this approach for nested get for the heck of it...
def getkey(d, path, *default):
    try:
        return functools.reduce(lambda x, y: x[y], path.split("."), d)
    except KeyError:
        if default:
            return default[0]
        raise

Usage:

data = {
    "spam": {
        "egg": {
            "bacon": "Well..",
            "sausages": "Spam egg sausages and spam",
            "spam": "does not have much spam in it",
        }
    }
}

(Pdb) haskey(data, "spam")
True
(Pdb) haskey(data, "spamw")
False
(Pdb) haskey(data, "spam.egg")
True
(Pdb) haskey(data, "spam.egg3")
False
(Pdb) haskey(data, "spam.egg.bacon")
True

Original inspiration from the answers to this question.

EDIT: a comment pointed out that this only works with string keys. A more generic approach would be to accept an iterable path param:

def haskey(d, path):
    try:
        functools.reduce(lambda x, y: x[y], path, d)
        return True
    except KeyError:
        return False

(Pdb) haskey(data, ["spam", "egg"])
True
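As a variation on the `getkey` function above, the sketch below takes an iterable path, uses `operator.getitem` in place of the lambda, and also treats `TypeError` and `IndexError` as "not found". The `_MISSING` sentinel and the choice of exceptions are assumptions made for illustration, not part of the original answer.

```python
import functools
import operator

_MISSING = object()  # sentinel meaning "no default supplied"


def getkey(d, path, default=_MISSING):
    """Fetch a nested value for an iterable `path`, or `default` if the walk fails."""
    try:
        return functools.reduce(operator.getitem, path, d)
    except (KeyError, IndexError, TypeError):
        # TypeError covers e.g. indexing a str with a str key or subscripting None
        if default is _MISSING:
            raise
        return default


data = {"spam": {"egg": {"bacon": "Well.."}}}

print(getkey(data, ["spam", "egg", "bacon"]))                       # Well..
print(getkey(data, ["spam", "egg", "bacon", "pizza"], default=None))  # None
print(getkey(data, ["spam", "toast"], "n/a"))                       # n/a
```

Using a sentinel rather than `*default` lets callers pass `default=None` explicitly while still getting the original exception when no default is given.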
totalhack
3

The selected answer works well on the happy path, but there are a couple of obvious issues. If you were to search for ["spam", "egg", "bacon", "pizza"], it would throw a TypeError from trying to index "Well.." with the string "pizza". Likewise, if you replaced "pizza" with 2, it would silently take index 2 of "Well.." (the character "l") instead of treating the key as missing.

Selected Answer Output Issues:

data = {
    "spam": {
        "egg": {
            "bacon": "Well..",
            "sausages": "Spam egg sausages and spam",
            "spam": "does not have much spam in it"
        }
    }
}

print(keys_exists(data, "spam", "egg", "bacon", "pizza"))
>> TypeError: string indices must be integers

print(keys_exists(data, "spam", "egg", "bacon", 2))
>> True  (index 2 of "Well.." is "l", so no error is raised)

I also feel that try/except can be a crutch that we rely on too quickly. Since I believe we already need to check the type, we might as well remove the try/except.

Solution:

# The default acts as a sentinel; "INVALID" is an assumed placeholder matching the output below.
def dict_value_or_default(element, keys=(), default="INVALID"):
    '''
    Check if keys (nested) exists in `element` (dict).
    Returns value if last key exists, else returns default value
    '''
    if not isinstance(element, dict):
        return default

    _element = element
    for key in keys:
        # Necessary to ensure _element is not a different indexable type (list, string, etc).  
        # get() would have the same issue if that method name was implemented by a different object
        if not isinstance(_element, dict) or key not in _element:
            return default

        _element = _element[key]
        
    return _element 

Output:

print(dict_value_or_default(data, ["spam", "egg", "bacon", "pizza"]))
>> INVALID

print(dict_value_or_default(data, ["spam", "egg", "bacon", 2]))
>> INVALID

print(dict_value_or_default(data, ["spam", "egg", "bacon"]))
>> Well..
Matthew Pautzke
2

I had the same problem, and a recent Python library popped up:
https://pypi.org/project/dictor/
https://github.com/perfecto25/dictor

So in your case:

from dictor import dictor

x = dictor(s, 'mainsnak.datavalue.value.numeric-id')

Personal note:
I don't like the name 'dictor', since it doesn't hint at what it actually does. So I'm using it like:

from dictor import dictor as extract
x = extract(s, 'mainsnak.datavalue.value.numeric-id')

I couldn't come up with a better name than extract. Feel free to comment if you come up with a more viable name. safe_get and robust_get didn't feel right for my case.

darkless
2

Another way:

def does_nested_key_exists(dictionary, nested_key):
    exists = nested_key in dictionary
    if not exists:
        for key, value in dictionary.items():
            if isinstance(value, dict):
                exists = exists or does_nested_key_exists(value, nested_key)
    return exists
  • what is does_nested_key_exists(value, nested_key) here – aysh Aug 20 '20 at 03:08
  • @aysh, in case you're still curious, this is a recursive call to `does_nested_key_exists`. Because `if not exists` evaluated to True for the parent dictionary, we want to check all child dictionaries (i.e. all `value`s in `dictionary` that are instances of dict), **so we start this function again**. Passing `value` as the first argument means this time the function will get the `value` sub-dictionary in its `dictionary` parameter. This continues down through all nested dictionaries, so if `nested_key` exists in any of them, the original call to `does_nested_key_exists` will eventually return True. – Cat Apr 12 '21 at 09:09
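The recursive search explained above answers "does this key appear anywhere?" but not "where?". A possible extension, sketched here with the hypothetical name `find_key_paths`, yields the full path to every occurrence and also descends into lists. The sample data is made up.

```python
def find_key_paths(obj, target, _prefix=()):
    """Yield the path (a tuple of keys/indices) to every occurrence of `target`.

    `find_key_paths` is a hypothetical name; unlike the function above it also
    descends into lists and reports where the key was found.
    """
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == target:
                yield _prefix + (key,)
            # keep searching below this key, whether or not it matched
            yield from find_key_paths(value, target, _prefix + (key,))
    elif isinstance(obj, list):
        for index, item in enumerate(obj):
            yield from find_key_paths(item, target, _prefix + (index,))


data = {"a": {"b": [{"numeric-id": 1}, {"c": {"numeric-id": 2}}]}}

paths = list(find_key_paths(data, "numeric-id"))
print(paths)
# [('a', 'b', 0, 'numeric-id'), ('a', 'b', 1, 'c', 'numeric-id')]
print(any(find_key_paths(data, "missing")))  # False
```

Since it is a generator, `any(find_key_paths(...))` stops at the first match, which gives the same yes/no answer as the original function.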
2

Here's my small snippet based on @Arount's answer:

def exist(obj, *keys: str) -> bool:
    _obj = obj
    try:
        for key in keys:
            _obj = _obj[key]
    except (KeyError, TypeError):
        return False
    return True

if __name__ == '__main__':
    obj = {"mainsnak": {"datavalue": {"value": "A"}}}
    answer = exist(obj, "mainsnak", "datavalue", "value", "B")
    print(answer)

I added TypeError because that error is raised when _obj is a str, int, None, etc.

Roeniss
1

I wrote a data parsing library called dataknead for cases like this, basically because I got frustrated by the JSON the Wikidata API returns as well.

With that library you could do something like this:

from dataknead import Knead

numid = Knead(s).query("mainsnak/datavalue/value/numeric-id").data()

if numid:
    # Do something with `numeric-id`
Husky
1

Using dict.get with defaults is concise and appears to execute faster than using consecutive membership checks.

Try it yourself:

import timeit

timeit.timeit("'x' in {'a': {'x': {'y'}}}.get('a', {})")
# 0.2874350370002503

timeit.timeit("'a' in {'a': {'x': {'y'}}} and 'x' in {'a': {'x': {'y'}}}['a']")
# 0.3466246419993695
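The statements timed above rebuild the dict literal on every run (twice in the second statement), which may account for part of the measured difference. A variant that builds the dict once in `setup` could look like the sketch below. The `number=100_000` value is an arbitrary choice, and no timings are quoted because they depend on the machine and Python version.

```python
import timeit

setup = "d = {'a': {'x': {'y'}}}"  # build the dict once, outside the timed statement

get_time = timeit.timeit("'x' in d.get('a', {})", setup=setup, number=100_000)
and_time = timeit.timeit("'a' in d and 'x' in d['a']", setup=setup, number=100_000)

print(f".get with default: {get_time:.4f}s")
print(f"chained 'in':      {and_time:.4f}s")
```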

Owen Brown
1

I have written a handy library for this purpose.

It iterates over the AST of the dict and checks whether a particular key is present.

Check it out: https://github.com/Agent-Hellboy/trace-dkey

1

A bit ugly, but the simplest way to achieve this in a one-liner

d = {
    'mainsnak': {
        'datavalue': {
            'value': {
                'numeric-id': {}
            }
        }
    }
}

d.get('mainsnak',{}).get('datavalue',{}).get('value',{}).get('numeric-id')

jord
0

If you can tolerate testing a string representation of the object path, then this approach might work for you:

def exists(expr):
    # eval() runs arbitrary code, so only pass trusted strings
    try:
        eval(expr)
        return True
    except Exception:
        return False

exists("lst['sublist']['item']")
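If the path string may come from an untrusted source, a similar check can be sketched without `eval()`. The version below is an illustration under assumptions: `path_exists` is a hypothetical helper, the root object is passed in explicitly instead of being named in the string, and only subscripts containing Python literals with no `]` inside them are understood.

```python
import ast
import re


def path_exists(root, path):
    """Check a path such as "['sublist']['item']" against `root` without eval().

    `path_exists` is a hypothetical helper; it only understands subscripts
    whose contents are Python literals (strings, ints, ...).
    """
    current = root
    for token in re.findall(r"\[([^\]]+)\]", path):
        key = ast.literal_eval(token)  # parses literals only, never runs code
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return False
    return True


lst = {"sublist": {"item": [10, 20]}}

print(path_exists(lst, "['sublist']['item']"))     # True
print(path_exists(lst, "['sublist']['item'][1]"))  # True
print(path_exists(lst, "['sublist']['other']"))    # False
```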
geotheory
0

One can try this to check whether a key, nested key, or value is in a nested dict. Note that it is a substring test on the dumped text, so it can also match partial keys or values:

import yaml

# d is the nested dictionary, something is the string to look for
if something in yaml.dump(d, default_flow_style=False):
    print(something, "is in", d)
else:
    print(something, "is not in", d)
Qas
-1

There are many great answers; here is my humble take on it. I added a check for arrays of dictionaries as well. Please note that I am not checking the arguments' validity. I used part of Arount's code above. I added this answer because I have a use case that requires checking arrays or dictionaries in my data. Here is the code:

def keys_exists(element, *keys):
    '''
    Check if *keys (nested) exists in `element` (dict).
    '''
    
    retval=False
    if isinstance(element,dict):
        for key,value in element.items():
            for akey in keys:
                if element.get(akey) is not None:
                    return True
            if isinstance(value,dict) or isinstance(value,list):
                retval= keys_exists(value, *keys)
            
    elif isinstance(element, list):
        for val in element:
            if isinstance(val,dict) or isinstance(val,list):
                retval=keys_exists(val, *keys)

    return retval