249

Is there a built-in/quick way to use a list of keys to a dictionary to get a list of corresponding items?

For instance I have:

>>> mydict = {'one': 1, 'two': 2, 'three': 3}
>>> mykeys = ['three', 'one']

How can I use mykeys to get the corresponding values in the dictionary as a list?

>>> mydict.WHAT_GOES_HERE(mykeys)
[3, 1]
martineau
FazJaxton

13 Answers

280

A list comprehension seems to be a good way to do this:

>>> [mydict[x] for x in mykeys]
[3, 1]
juanpa.arrivillaga
FazJaxton
126

A couple of ways other than a list comprehension:

  • Build list and throw exception if key not found: map(mydict.__getitem__, mykeys)
  • Build list with None if key not found: map(mydict.get, mykeys)

Alternatively, operator.itemgetter returns the values as a tuple:

from operator import itemgetter
myvalues = itemgetter(*mykeys)(mydict)
# use `list(...)` if list is required

Note: in Python 3, map returns an iterator rather than a list. Use list(map(...)) for a list.
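To make the differences between these variants concrete, here is a small sketch using the question's `mydict`. The key `'ten'` is a made-up missing key, used only for illustration. It also shows one caveat of `itemgetter`: when given exactly one key it returns the bare value rather than a one-element tuple.

```python
from operator import itemgetter

mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one']

# map is lazy in Python 3, so wrap it in list() to get a list.
strict = list(map(mydict.__getitem__, mykeys))      # [3, 1]
lenient = list(map(mydict.get, ['three', 'ten']))   # [3, None]

# With two or more keys, itemgetter returns a tuple.
as_tuple = itemgetter(*mykeys)(mydict)              # (3, 1)

# Caveat: with exactly one key, itemgetter returns the bare value.
single = itemgetter('three')(mydict)                # 3

# The strict variant raises KeyError for a missing key.
try:
    list(map(mydict.__getitem__, ['ten']))
    raised = False
except KeyError:
    raised = True
```

If `mykeys` can contain exactly one key (or be empty, in which case `itemgetter()` itself raises `TypeError`), the list comprehension is probably the safer choice.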

jpp
Jon Clements
  • [Don't call `mydict.__getitem__()` directly](/q/28084799/4518341), instead use a generator expression: `(mydict[key] for key in mykeys)`. Or for `list(map(...))`, a list comprehension: `[mydict[key] for key in mykeys]`. – wjandrea Nov 20 '21 at 21:30
69

A little speed comparison:

Python 2.7.11 |Anaconda 2.4.1 (64-bit)| (default, Dec  7 2015, 14:10:42) [MSC v.1500 64 bit (AMD64)] on win32
In[1]: l = [0,1,2,3,2,3,1,2,0]
In[2]: m = {0:10, 1:11, 2:12, 3:13}
In[3]: %timeit [m[_] for _ in l]  # list comprehension
1000000 loops, best of 3: 762 ns per loop
In[4]: %timeit map(lambda _: m[_], l)  # using 'map'
1000000 loops, best of 3: 1.66 µs per loop
In[5]: %timeit list(m[_] for _ in l)  # a generator expression passed to a list constructor.
1000000 loops, best of 3: 1.65 µs per loop
In[6]: %timeit map(m.__getitem__, l)
The slowest run took 4.01 times longer than the fastest. This could mean that an intermediate result is being cached 
1000000 loops, best of 3: 853 ns per loop
In[7]: %timeit map(m.get, l)
1000000 loops, best of 3: 908 ns per loop
In[33]: from operator import itemgetter
In[34]: %timeit list(itemgetter(*l)(m))
The slowest run took 9.26 times longer than the fastest. This could mean that an intermediate result is being cached 
1000000 loops, best of 3: 739 ns per loop

So list comprehension and itemgetter are the fastest ways to do this.

Update

For large random lists and maps I got somewhat different results:

Python 2.7.11 |Anaconda 2.4.1 (64-bit)| (default, Dec  7 2015, 14:10:42) [MSC v.1500 64 bit (AMD64)] on win32
In[2]: import numpy.random as nprnd
l = nprnd.randint(1000, size=10000)
m = dict([(_, nprnd.rand()) for _ in range(1000)])
from operator import itemgetter
import operator
f = operator.itemgetter(*l)

%timeit f(m)
1000 loops, best of 3: 1.14 ms per loop

%timeit list(itemgetter(*l)(m))
1000 loops, best of 3: 1.68 ms per loop

%timeit [m[_] for _ in l]  # list comprehension
100 loops, best of 3: 2 ms per loop

%timeit map(m.__getitem__, l)
100 loops, best of 3: 2.05 ms per loop

%timeit list(m[_] for _ in l)  # a generator expression passed to a list constructor.
100 loops, best of 3: 2.19 ms per loop

%timeit map(m.get, l)
100 loops, best of 3: 2.53 ms per loop

%timeit map(lambda _: m[_], l)
100 loops, best of 3: 2.9 ms per loop

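So in this case the clear winner is f = operator.itemgetter(*l); f(m), and the clear loser is map(lambda _: m[_], l). Note that the f(m) timing excludes the cost of building the itemgetter and returns a tuple rather than a list.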

Update for Python 3.6.4

import numpy.random as nprnd
l = nprnd.randint(1000, size=10000)
m = dict([(_, nprnd.rand()) for _ in range(1000)])
from operator import itemgetter
import operator
f = operator.itemgetter(*l)

%timeit f(m)
1.66 ms ± 74.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit list(itemgetter(*l)(m))
2.1 ms ± 93.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit [m[_] for _ in l]  # list comprehension
2.58 ms ± 88.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit list(map(m.__getitem__, l))
2.36 ms ± 60.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit list(m[_] for _ in l)  # a generator expression passed to a list constructor.
2.98 ms ± 142 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit list(map(m.get, l))
2.7 ms ± 284 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit list(map(lambda _: m[_], l))
3.14 ms ± 62.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

So the results for Python 3.6.4 are almost the same.
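For anyone who wants to rerun this comparison without IPython or NumPy, here is a rough sketch using only the standard library's `timeit` and `random`. The data sizes mirror the ones above. The `candidates` dictionary and the `benchmark` helper are names invented for this sketch. Each candidate is wrapped in a lambda, which adds the same small call overhead to all of them. Absolute numbers will differ by machine and Python version.

```python
import random
import timeit
from operator import itemgetter

random.seed(0)  # fixed seed so the generated data is reproducible
m = {i: random.random() for i in range(1000)}
l = [random.randrange(1000) for _ in range(10000)]

candidates = {
    'list comprehension': lambda: [m[k] for k in l],
    'list(map(m.__getitem__, l))': lambda: list(map(m.__getitem__, l)),
    'list(itemgetter(*l)(m))': lambda: list(itemgetter(*l)(m)),
    'list(map(m.get, l))': lambda: list(map(m.get, l)),
}

# Every variant should produce the same list before speeds are compared.
expected = [m[k] for k in l]
results = {name: fn() for name, fn in candidates.items()}


def benchmark(number=20, repeat=3):
    """Return the best time per call, in milliseconds, for each candidate."""
    timings = {}
    for name, fn in candidates.items():
        best = min(timeit.repeat(fn, number=number, repeat=repeat))
        timings[name] = best / number * 1000
    return timings


if __name__ == '__main__':
    for name, ms in sorted(benchmark().items(), key=lambda kv: kv[1]):
        print('{:<32} {:.3f} ms'.format(name, ms))
```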

wjandrea
Sklavit
18

Here are three ways.

Raising KeyError when key is not found:

result = [mapping[k] for k in iterable]

Default values for missing keys.

result = [mapping.get(k, default_value) for k in iterable]

Skipping missing keys.

result = [mapping[k] for k in iterable if k in mapping]
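A quick side-by-side of the three behaviours, as a sketch. The sample data, the missing key `'ten'`, and the default of `0` are assumptions chosen for illustration.

```python
mapping = {'one': 1, 'two': 2, 'three': 3}
iterable = ['three', 'ten', 'one']   # 'ten' is deliberately missing
default_value = 0                    # hypothetical default

# Default values for missing keys.
with_defaults = [mapping.get(k, default_value) for k in iterable]

# Skipping missing keys.
skipping = [mapping[k] for k in iterable if k in mapping]

# Raising KeyError when a key is not found.
try:
    strict = [mapping[k] for k in iterable]
except KeyError as exc:
    strict = None
    missing_key = exc.args[0]
```

Here `with_defaults` is `[3, 0, 1]`, `skipping` is `[3, 1]`, and the strict version fails on `'ten'`.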
OdraEncoded
  • `found_keys = mapping.keys() & iterable` gives `TypeError: unsupported operand type(s) for &: 'list' and 'list'` on Python 2.7; `found_keys = [key for key in mapping.keys() if key in iterable]` works best – NotGaeL Jul 18 '18 at 08:58
10

Try this:

mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one','ten']
newList=[mydict[k] for k in mykeys if k in mydict]
print(newList)
[3, 1]
Vikram Singh Chandel
  • 1
    The `"if k in mydict"` part makes it a bit too permissive - would fail silently if the list is wider, but correct, than keys in the dict (narrower, but incorrect). – mirekphd Dec 02 '20 at 09:55
8

Try this:

mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one'] # if there are many keys, use a set

[mydict[k] for k in mykeys]
=> [3, 1]
Óscar López
  • @PeterDeGlopper you're confused. `items()` is preferred, it doesn't have to make an additional lookup, there's no `len(mydict)*len(mykeys)` operation here! (notice that I'm using a set) – Óscar López Aug 26 '13 at 21:53
  • @ÓscarLópez Yes there is, you're inspecting every element of the dictionary. iteritems doesn't yield them until you need them, so it avoids constructing an intermediary list, but you still run 'k in mykeys' (order len(mykeys), since it's a list) for every k in mydict. Completely unnecessarily, compared to the simpler list comprehension that just runs over mykeys. – Peter DeGlopper Aug 26 '13 at 21:57
  • @inspectorG4dget @PeterDeGlopper the membership operation over `mykeys` is amortized constant time, I'm using a set, not a list – Óscar López Aug 26 '13 at 21:58
  • 2
    Converting the OP's list to a set at least makes it linear, but it's still linear on the wrong data structure as well as losing order. Consider the case of a 10k dictionary and 2 keys in mykeys. Your solution makes 10k set membership tests, compared to two dictionary lookups for the simple list comprehension. In general it seems safe to assume that the number of keys will be smaller than the number of dictionary elements - and if it's not, your approach will omit repeated elements. – Peter DeGlopper Aug 26 '13 at 22:00
6
new_dict = {x: v for x, v in mydict.items() if x in mykeys}
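Note that this builds a dict rather than a list, follows the dictionary's order rather than the order of `mykeys`, and tests every dictionary key against `mykeys`. A sketch of a variant that iterates over the requested keys instead, assuming Python 3.7+ (where dicts preserve insertion order) and that every key is present:

```python
mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one']

# Iterating over the dictionary keeps the dictionary's own order.
by_dict_order = {k: v for k, v in mydict.items() if k in mykeys}

# Iterating over the requested keys keeps their order and
# performs only len(mykeys) lookups.
by_key_order = {k: mydict[k] for k in mykeys}

# The values as a list, in the order of mykeys.
values = list(by_key_order.values())
```

Either way, duplicate entries in `mykeys` collapse into one, since a dict cannot hold the same key twice.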
Pavel Minenkov
0

Pandas does this very elegantly, though of course list comprehensions will always be more technically Pythonic. I don't have time to put in a speed comparison right now (I'll come back later and put it in):

import pandas as pd
mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one']
temp_df = pd.DataFrame([mydict])  # DataFrame.append was removed in pandas 2.0
# You can export DataFrames to a number of formats, using a list here.
temp_df[mykeys].values[0]
# Returns: array([3, 1])

# If you want a dict then use this instead:
# temp_df[mykeys].to_dict(orient='records')[0]
# Returns: {'three': 3, 'one': 1}
abby sobh
0

If you want to make sure that the keys exist in your dictionary before you try to access them, you can use the set type. Convert the dictionary keys and your requested keys to sets. Then use the issubset() method.

mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one']
assert set(mykeys).issubset(set(mydict.keys()))
result = [mydict[key] for key in mykeys]
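Since `assert` statements are skipped when Python runs with `-O`, one possible variation is to raise explicitly and report which keys are missing. This is only a sketch. `values_for` is a hypothetical helper name, and the sorted error message assumes the keys are mutually comparable (for example, all strings).

```python
mydict = {'one': 1, 'two': 2, 'three': 3}


def values_for(mapping, keys):
    """Return mapping[k] for each k in keys, or raise KeyError naming all missing keys."""
    missing = set(keys) - set(mapping)
    if missing:
        raise KeyError('missing keys: {}'.format(sorted(missing)))
    return [mapping[k] for k in keys]


result = values_for(mydict, ['three', 'one'])
```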
swimfar2
0

A couple new answers.

(1) if your dict-like object needs to be constructed, I find it useful to do it this way, so that it is constructed just once:

train_data, train_labels, test_data, test_labels = [
    d[k] for d in [numpy.load('classify.npz')]
    for k in 'train_data train_labels test_data test_labels'.split()]

(2) Python 3.10 has mapping (dict) pattern matching, which allows the following sort of pattern.

match numpy.load('classify.npz'):
    case {
        'train_data': train_data,
        'train_labels': train_labels,
        'test_data': test_data,
        'test_labels': test_labels}: pass
    case _: raise KeyError()
David Bau
-1

Following closure of Python: efficient way to create a list from dict values with a given order

Retrieving the values without building the list:

from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

try:
    from collections.abc import Sequence  # Python 3.3+
except ImportError:
    from collections import Sequence  # Python 2


class DictListProxy(Sequence):
    def __init__(self, klist, kdict, *args, **kwargs):
        super(DictListProxy, self).__init__(*args, **kwargs)
        self.klist = klist
        self.kdict = kdict

    def __len__(self):
        return len(self.klist)

    def __getitem__(self, key):
        return self.kdict[self.klist[key]]


myDict = {'age': 'value1', 'size': 'value2', 'weigth': 'value3'}
order_list = ['age', 'weigth', 'size']

dlp = DictListProxy(order_list, myDict)

print(','.join(dlp))
print()
print(dlp[1])

The output:

value1,value3,value2

value3

Which matches the order given by the list.
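For Python 3 only, the same proxy can be sketched a little more compactly. The example below also illustrates that the proxy is a live view, so later changes to the dictionary show through. The sample data is adapted from the snippet above, with the key spelled `'weight'`.

```python
from collections.abc import Sequence


class DictListProxy(Sequence):
    """Read-only, list-like view of kdict's values in the order given by klist."""

    def __init__(self, klist, kdict):
        self.klist = klist
        self.kdict = kdict

    def __len__(self):
        return len(self.klist)

    def __getitem__(self, index):
        # Integer indexes only; a slice would yield an unhashable list of keys.
        return self.kdict[self.klist[index]]


my_dict = {'age': 'value1', 'size': 'value2', 'weight': 'value3'}
order_list = ['age', 'weight', 'size']
dlp = DictListProxy(order_list, my_dict)

joined = ','.join(dlp)        # iteration comes from the Sequence mixin
my_dict['size'] = 'changed'   # update the underlying dict afterwards
last = dlp[-1]                # the proxy reflects the update
```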

Community
mementum
-2
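from functools import reduce  # built in on Python 2, must be imported on Python 3
reduce(lambda x, y: (x.append(mydict[y]) or x) if y in mydict else x, mykeys, [])

In case there are keys that are not in the dict. Testing `y in mydict` rather than `mydict.get(y)` keeps values that are falsy, such as 0.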

Paul Rooney
yupbank
-2

If you find yourself doing this a lot, you might want to subclass dict so that it takes a list of keys and returns a list of values.

>>> d = MyDict(mydict)
>>> d[mykeys]
[3, 1]

Here's a demo implementation.

class MyDict(dict):
    def __getitem__(self, key):
        getitem = super().__getitem__
        if isinstance(key, list):
            return [getitem(x) for x in key]
        else:
            return getitem(key)

Subclassing dict well requires some more work, plus you'd probably want to implement .get(), .__setitem__(), and .__delitem__(), among others.
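As one illustration of that extra work, here is a sketch that also makes `.get()` accept a list of keys. It is an assumption about how one might extend the demo, not a complete dict subclass. Lists are unhashable, so a list can never be an ordinary key and the `isinstance` check is unambiguous.

```python
class MyDict(dict):
    def __getitem__(self, key):
        getitem = super().__getitem__
        if isinstance(key, list):
            # A list of keys yields a list of values; missing keys raise KeyError.
            return [getitem(x) for x in key]
        return getitem(key)

    def get(self, key, default=None):
        get = super().get
        if isinstance(key, list):
            # Missing keys are filled with the default instead of raising.
            return [get(x, default) for x in key]
        return get(key, default)


mydict = {'one': 1, 'two': 2, 'three': 3}
d = MyDict(mydict)
values = d[['three', 'one']]
padded = d.get(['three', 'ten'])
```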

wjandrea