Find Dict in List of Dicts Based on Incomplete Information About Wanted Dict

Question

Assume a list of dicts, e.g., the following:

a = {'key': 5705435, 'key2': 1, 'data': 'data'}
b = {'key': 2345435, 'key2': 1, 'data': 'data'}
c = {'key': 9155435, 'key2': 2, 'data': 'data'}
data = [a,b,c]

I want to get a dict from that list that matches a given key (e.g. return dict with key == 2345435 or return dict where key == 9155435 and key2 == 2). Obviously I can iterate through the list and compare the key attributes like below.

def get_dict_by_key(data): 
    for el in data:
        if el['key'] == 2345435:
            return el
    return None

Is there another way to do this (without explicitly iterating through the list)? Are there predefined methods, like list.index(element), to which I can pass incomplete information about the dict I'm looking for?

there is no secret ingredient.. :) https://stackoverflow.com/questions/8653516/python-list-of-dictionaries-search — iamklaus, Nov 22 '18 at 12:34
@SarthakNegi, There is an (IMO, critical) difference between that target question and this one, which is that this one has a (potentially unique) `key`, which could enable restructuring data / Pandas to perform many queries in O(1) time. — jpp, Nov 22 '18 at 12:43

score 1 · Accepted Answer · answered Nov 22 '18 at 12:35

No, there is no predefined method for this with a list of dictionaries: you have to iterate manually. You can do so via next and a generator expression, passing None as the default to return when nothing matches:

mydict = next((x for x in data if x['key'] == 2345435), None)

But your function is sufficient, and will likely be more efficient. Note that the explicit return None is redundant: a function that reaches its end without hitting a return statement returns None implicitly.

The O(1) lookups you are looking for are possible by restructuring your data, e.g. as a dictionary of dictionaries indexed by key, or by using a custom class.
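As a minimal sketch of the restructuring idea, assuming every dict has a unique 'key' value (the question does not confirm this), a dictionary keyed on that value can be built once and then queried repeatedly. The name by_key is only illustrative:

```python
a = {'key': 5705435, 'key2': 1, 'data': 'data'}
b = {'key': 2345435, 'key2': 1, 'data': 'data'}
c = {'key': 9155435, 'key2': 2, 'data': 'data'}
data = [a, b, c]

# Build the index once, which costs O(n).
# Assumption: 'key' is unique. With duplicates, a later dict
# silently overwrites an earlier one under the same key.
by_key = {d['key']: d for d in data}

# Each lookup is now O(1) on average.
match = by_key.get(2345435)   # the dict b
missing = by_key.get(0)       # None, since no dict has key == 0
```

The index stores references to the original dicts, not copies, so it is only worth building when you expect more than a handful of lookups against the same list.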

If you can use a 3rd party library, Pandas accepts a list of dictionaries directly:

import pandas as pd

df = pd.DataFrame(data).set_index('key')

This gives:

print(df)

         data  key2
key                
5705435  data     1
2345435  data     1
9155435  data     2

Accessing an index label gives a series mapping:

print(df.loc[2345435])

data    data
key2       1
Name: 2345435, dtype: object

score 1 · Answer 2 · answered Nov 22 '18 at 12:35

You could write a generator expression.

>>> next((d for d  in data if d.get('key') == 2345435), None)
{'data': 'data', 'key': 2345435, 'key2': 1}

As far as I am aware, there is no special filtering function in the standard library that does this job. At some level there must be a loop, because you want to perform an operation on (potentially) every item in the list.

score 1 · Answer 3 · answered Nov 22 '18 at 12:49

You can use the builtin all inside a list comprehension. Store your desired keys and values in another dictionary:

a = {'key': 5705435, 'key2': 1, 'data': 'data'}
b = {'key': 2345435, 'key2': 1, 'data': 'data'}
c = {'key': 9155435, 'key2': 2, 'data': 'data'}
data = [a,b,c]

criteria = { 'key' : 9155435, 'key2' : 2} 

result = [x for x in data if all([value == x[key] for key,value in criteria.items()])]

This returns a list containing only dictionary c, whereas the following criteria match more than one dictionary:

# returns a & b
criteria = { 'key2' : 1} 
result = [x for x in data if all([value == x[key] for key,value in criteria.items()])]

# returns a, b & c
criteria = { 'data': 'data' }
result = [x for x in data if all([value == x[key] for key,value in criteria.items()])]
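If only the first match is needed, the criteria idea can be combined with next so the search stops early. The following is a sketch rather than a standard-library function: find_first and _MISSING are hypothetical names, and the sentinel is one possible way to treat a dict that lacks a requested key as a non-match instead of raising KeyError.

```python
a = {'key': 5705435, 'key2': 1, 'data': 'data'}
b = {'key': 2345435, 'key2': 1, 'data': 'data'}
c = {'key': 9155435, 'key2': 2, 'data': 'data'}
data = [a, b, c]

# Unique sentinel so that a missing key never compares equal to a wanted value.
_MISSING = object()

def find_first(dicts, **criteria):
    """Return the first dict matching every key/value in criteria, else None."""
    return next(
        (d for d in dicts
         if all(d.get(k, _MISSING) == v for k, v in criteria.items())),
        None,
    )

first = find_first(data, key=9155435, key2=2)   # the dict c
partial = find_first(data, key2=1)              # the dict a (first of a and b)
nothing = find_first(data, key=1)               # None
```

This still iterates over the list internally; it only hides the loop behind a reusable call and avoids building the full result list.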
