Extract all keys from a list of dictionaries

Question

I'm trying to get a list of all keys in a list of dictionaries in order to fill out the fieldnames argument for csv.DictWriter.

Previously, I had something like this:

[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5},
{"name": "Pam", "age": 7}
]

and I was using fieldnames = list[0].keys() to take the first dictionary in the list and extract its keys.

Now I have something like this where one of the dictionaries has more key:value pairs than the others (could be any of the results). The new keys are added dynamically based on information coming from an API so they may or may not occur in each dictionary and I don't know in advance how many new keys there will be.

[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5, "height":4},
{"name": "Pam", "age": 7}
]

I can't just use fieldnames = list[1].keys() since it isn't necessarily the second element that will have extra keys.

A simple solution would be to find the dictionary with the greatest number of keys and use it for the fieldnames, but that won't work if you have an example like this:

[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5, "height":4},
{"name": "Pam", "age": 7, "weight":90}
]

where both the second and third dictionaries have 3 keys, but the end result should really be the list ["name", "age", "height", "weight"].

Hugh Bothwell · Accepted Answer · 2012-07-09T16:48:19.920

98

all_keys = set().union(*(d.keys() for d in mylist))

Edit: have to unpack the list. Now fixed.
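For the use case in the question, here is a minimal sketch of feeding the result to csv.DictWriter. It assumes Python 3, writes to an in-memory io.StringIO buffer instead of a real file, and sorts the keys only to get a reproducible column order (a set has no defined order); the variable names fieldnames, buffer and csv_text are illustrative.

```python
import csv
import io

mylist = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5, "height": 4},
    {"name": "Pam", "age": 7, "weight": 90},
]

# Union of the keys of every dictionary in the list.
all_keys = set().union(*(d.keys() for d in mylist))

# Assumption: alphabetical column order is acceptable for the CSV.
fieldnames = sorted(all_keys)

# StringIO stands in for a file opened with newline="".
buffer = io.StringIO()
# restval is the value written for keys a row does not have.
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(mylist)

csv_text = buffer.getvalue()
print(csv_text)
```

Rows that lack a key get an empty cell, because DictWriter fills missing fields with restval.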

edited Jul 09 '12 at 16:48

answered Jul 09 '12 at 16:38



This solution works perfectly, but it seems to produce a list of keys that have a different order than the list of dictionaries they were extracted from. Any idea how to keep the indexing? Thank you! – Momchill Mar 18 '22 at 16:31
@Momchill order is not guaranteed because he is using a set. I will post a snippet below for you that uses a list. – mareoraft Jan 11 '23 at 17:46

dawg · Answer 2 · 2012-07-09T18:04:08.203

Your data:

>>> LoD
[{'age': 10, 'name': 'Tom'}, 
 {'age': 5, 'name': 'Mark', 'height': 4}, 
 {'age': 7, 'name': 'Pam', 'weight': 90}]

This set comprehension will do it:

>>> {k for d in LoD for k in d.keys()}
{'age', 'name', 'weight', 'height'}

It works this way. First, create a list of lists of the dict keys:

>>> [list(d.keys()) for d in LoD]
[['age', 'name'], ['age', 'name', 'height'], ['age', 'name', 'weight']]

Then create a flattened version of this list of lists:

>>> [i for s in [d.keys() for d in LoD] for i in s]
['age', 'name', 'age', 'name', 'height', 'age', 'name', 'weight']

And create a set to eliminate duplicates:

>>> set([i for s in [d.keys() for d in LoD] for i in s])
{'age', 'name', 'weight', 'height'}

Which can be simplified to:

{k for d in LoD for k in d.keys()}

score 5 · Answer 3 · answered Nov 02 '18 at 03:09

from itertools import chain

lis = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5, "height":4},
    {"name": "Pam", "age": 7, "weight":90}
]

# without qualification a dict iterates over its keys
# and set takes any iterable in its constructor
headers_as_set = set(chain.from_iterable(lis))

# you asked for a list
headers = list(
    set(chain.from_iterable(lis))
)

score 4 · Answer 4 · answered Jul 09 '12 at 16:45


>>> lis=[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5, "height":4},
{"name": "Pam", "age": 7, "weight":90}
]
>>> {z for y in (x.keys() for x in lis) for z in y}
set(['age', 'name', 'weight', 'height'])

answered Jul 09 '12 at 16:45

Ashwini Chaudhary


score 3 · Answer 5 · edited Jul 09 '12 at 17:38

Borrowing lis from @AshwiniChaudhary's answer, here is an explanation of how you could solve your problem.

>>> lis=[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5, "height":4},
{"name": "Pam", "age": 7, "weight":90}
]

Iterating directly over a dict yields its keys, so you don't have to call keys() to get them. That saves a function call per dictionary and, in Python 2, where keys() builds a new list, a list construction as well (in Python 3, keys() returns a lightweight view instead).

>>> {k for d in lis for k in d}
set(['age', 'name', 'weight', 'height'])

or use itertools.chain:

>>> from itertools import chain
>>> {k for k in chain(*lis)}
set(['age', 'name', 'weight', 'height'])

score 2 · Answer 6 · edited Jul 09 '12 at 17:36


The following example will extract the keys:

# dictionaries is assumed to be the list of dicts from the question
set_ = set()
for dict_ in dictionaries:
    set_.update(dict_.keys())
print(set_)

edited Jul 09 '12 at 17:36

octopusgrabbus


answered Jul 09 '12 at 16:41

user1277476


score 0 · Answer 7 · answered Jan 11 '23 at 17:52

If order matters to you, read on...

Input your data:

>>> list_of_dicts = [{'age': 10, 'name': 'Tom'},{'age': 5, 'name': 'Mark', 'height': 4}, {'age': 7, 'name': 'Pam', 'weight': 90}]

Define your function:

>>> def get_all_keys_in_order(list_of_dicts):
        ordered_keys = []
        for dict_ in list_of_dicts:
            for key in dict_:
                if key not in ordered_keys:
                    ordered_keys.append(key)
        return ordered_keys

Run your function to get output:

>>> get_all_keys_in_order(list_of_dicts)
['age', 'name', 'height', 'weight']

@Momchill I think this solves your problem. Please note that this algorithm is slower than the set solution, because every `key not in ordered_keys` check scans the whole list. That could be a problem if you are working with big data, but for small data it is fine. – mareoraft Jan 11 '23 at 17:55
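If the list-membership check becomes a bottleneck, one possible alternative is to deduplicate with a dict instead of a list. This is a sketch, not part of the answer above: it assumes Python 3.7 or later, where dicts preserve insertion order, and the function name get_all_keys_in_order_fast is hypothetical.

```python
from itertools import chain

list_of_dicts = [
    {"age": 10, "name": "Tom"},
    {"age": 5, "name": "Mark", "height": 4},
    {"age": 7, "name": "Pam", "weight": 90},
]


def get_all_keys_in_order_fast(list_of_dicts):
    # dict.fromkeys keeps the first occurrence of each key, in order,
    # and each duplicate check is a hash lookup rather than a list scan.
    return list(dict.fromkeys(chain.from_iterable(list_of_dicts)))


ordered_keys = get_all_keys_in_order_fast(list_of_dicts)
print(ordered_keys)  # ['age', 'name', 'height', 'weight']
```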
