54

I have a Python list, say a = [0,1,2,3,4,5,6]. I also have a list of indices, say b = [0,2,4,5]. How can I get the list of elements of a with indices in b?

a06e
  • 18,594
  • 33
  • 93
  • 169

8 Answers8

70

You can use list comprehension to get that list:

c = [a[index] for index in b]
print c

This is equivalent to:

c= []
for index in b:
    c.append(a[index])
print c

Output:

[0,2,4,5]

Note:

Remember that some_list[index] is the notation used to access to an element of a list in a specific index.

Christian Tapia
  • 33,620
  • 7
  • 56
  • 73
34

Something different...

>>> a = range(7)
>>> b = [0,2,4,5]
>>> import operator
>>> operator.itemgetter(*b)(a)
(0, 2, 4, 5)

The itemgetter function takes one or more keys as arguments, and returns a function which will return the items at the given keys in its argument. So in the above, we create a function which will return the items at index 0, index 2, index 4, and index 5, then apply that function to a.

It appears to be quite a bit faster than the equivalent list comprehension

In [1]: import operator

In [2]: a = range(7)

In [3]: b = [0,2,4,5]

In [4]: %timeit operator.itemgetter(*b)(a)
1000000 loops, best of 3: 388 ns per loop

In [5]: %timeit [ a[i] for i in b ]
1000000 loops, best of 3: 415 ns per loop

In [6]: f = operator.itemgetter(*b)

In [7]: %timeit f(a)
10000000 loops, best of 3: 183 ns per loop

As for why itemgetter is faster, the comprehension has to execute extra Python byte codes.

In [3]: def f(a,b): return [a[i] for i in b]

In [4]: def g(a,b): return operator.itemgetter(*b)(a)

In [5]: dis.dis(f)
  1           0 BUILD_LIST               0
              3 LOAD_FAST                1 (b)
              6 GET_ITER
        >>    7 FOR_ITER                16 (to 26)
             10 STORE_FAST               2 (i)
             13 LOAD_FAST                0 (a)
             16 LOAD_FAST                2 (i)
             19 BINARY_SUBSCR
             20 LIST_APPEND              2
             23 JUMP_ABSOLUTE            7
        >>   26 RETURN_VALUE

While itemgetter is a single call implemented in C:

In [6]: dis.dis(g)
  1           0 LOAD_GLOBAL              0 (operator)
              3 LOAD_ATTR                1 (itemgetter)
              6 LOAD_FAST                1 (b)
              9 CALL_FUNCTION_VAR        0
             12 LOAD_FAST                0 (a)
             15 CALL_FUNCTION            1
             18 RETURN_VALUE
chepner
  • 497,756
  • 71
  • 530
  • 681
  • 1
    Likely the fastest solution, too. – martineau Mar 14 '14 at 19:27
  • I hadn't thought of that. It does appear to be quite a bit faster; I'll post the test I did in `ipython`. – chepner Mar 14 '14 at 19:37
  • It's also fairly generic in the sense that it could be used to extract a sequence of values from a dictionary given the keys (which is what I've done most often with it). – martineau Mar 14 '14 at 19:43
  • Slices, even: `itemgetter(slice(2,5))(a)` -> `[2, 3, 4]`. I've added a link to the (2.x) documentation for the function. – chepner Mar 14 '14 at 19:48
  • *Why* is this faster than the more pythonic (IMHO) list comprehension? – a06e Apr 19 '16 at 11:54
  • 1
    The list comprehension has more overhead, as the iteration is set up and executed in Python. `operator.itemgetter` does its work in C. – chepner Apr 19 '16 at 13:36
  • This looks like a nice solution, but unfortunately (due to an inconsistency in `itemgetter`) breaks if the indexing list (here `b`) contains a single element. – Bubaya Nov 04 '21 at 11:45
  • It's not really an inconsistency with `itemgetter`; it's perfectly consistent with how `__getitem__` typically works with `slice` objects. The real problem is expecting a sequence-based solution to work with a "scalar" index. Use `itemgetter(*[b])` if `b` is a single item: convert the single item to a singleton sequence first. – chepner Nov 04 '21 at 11:48
10

If you are a fan of functional programming, you could use map and list.__getitem__:

>>> a = [0,1,2,3,4,5,6]
>>> b = [0,2,4,5]
>>> map(a.__getitem__, b)
[0, 2, 4, 5]
>>>

The list comprehension approach is more canonical in Python though...

5

Many of the proposed solutions will produce a KeyError if b contains an index not present in a. The following will skip invalid indices if that is desired.

>>> b = [0,2,4,5]
>>> a = [0,1,2,3,4,5,6]
>>> [x for i,x in enumerate(a) if i in b]
[0, 2, 4, 5]
>>> b = [0,2,4,500]
>>> [x for i,x in enumerate(a) if i in b]
[0, 2, 4]

enumerate produces tuples of index,value pairs. Since we have both the item and its index, we can check for the presence of the index in b

Brian
  • 3,091
  • 16
  • 29
4

Using List Comprehension ,this should work -

li = [a[i] for i in b]

Testing this -

>>> a = [0,10,20,30,40,50,60]
>>> b = [0,2,4,5]
>>> li = [a[i] for i in b]
>>> li
[0, 20, 40, 50]
Kamehameha
  • 5,423
  • 1
  • 23
  • 28
4

A bit of speed comparison for all mentioned methods and others from Python dictionary: Get list of values for list of keys:

Python 2.7.11 |Anaconda 2.4.1 (64-bit)| (default, Jan 19 2016, 12:08:31) [MSC v.1500 64 bit (AMD64)] on win32

In[2]: import numpy.random as nprnd
idx = nprnd.randint(1000, size=10000)
l = nprnd.rand(1000).tolist()
from operator import itemgetter
import operator
f = operator.itemgetter(*idx)
%timeit f(l)
%timeit list(itemgetter(*idx)(l))
%timeit [l[_] for _ in idx]  # list comprehension
%timeit map(l.__getitem__, idx)
%timeit list(l[_] for _ in idx)  # a generator expression passed to a list constructor.
%timeit map(lambda _: l[_], idx)  # using 'map'
%timeit [x for i, x in enumerate(l) if i in idx]
%timeit filter(lambda x: l.index(x) in idx, l)  # UPDATE @Kundor: work only for list with unique elements
10000 loops, best of 3: 175 µs per loop
1000 loops, best of 3: 707 µs per loop
1000 loops, best of 3: 978 µs per loop
1000 loops, best of 3: 1.03 ms per loop
1000 loops, best of 3: 1.18 ms per loop
1000 loops, best of 3: 1.86 ms per loop
100 loops, best of 3: 12.3 ms per loop
10 loops, best of 3: 21.2 ms per loop

So the fastest is f = operator.itemgetter(*idx); f(l)

Community
  • 1
  • 1
Sklavit
  • 2,225
  • 23
  • 29
  • The filter line doesn't do the right thing. E.g. if `l` is `[1,2,3,2,1,2,3,2]`, and `idx` is `[0,1,4,5]`, then the filter method will give you `[1, 2, 2, 1, 2, 2]` while all the other methods will give (correctly) `[1,2,1,2]`. Also, for consistency, you should wrap the `map` calls in `list()`. – Nick Matteo May 01 '17 at 17:38
  • @kundor Yes, regarding `filter` you are right in case of not unique values in list. – Sklavit May 03 '17 at 13:33
  • @Kundor, as for wrapping in `list` - it is not necessary due to this is Python 2.7. – Sklavit May 03 '17 at 13:35
  • Then why wrap `filter` in `list`? – Nick Matteo May 03 '17 at 15:22
4

Using numpy.asarray. Numpy allows getting subarray of array by list of indices.

>>> import numpy as np
>>> a = [0,10,20,30,40,50,60]
>>> b = [0,2,4,5]
>>> res = np.asarray(a)[b].tolist()
>>> res
[0, 20, 40, 50]
Temak
  • 2,929
  • 36
  • 49
1

Yet another alternative for better performance if that is important to you- it is by no means the most Pythonic but I am pretty sure it is the most efficient:

>>> list(filter(lambda x: a.index(x) in b, a))
[0, 2, 4, 5]

Note: You do not need to convert to a list in Python 2. However you do in Python 3 onwards (if any future visitors may have a similar issue).

anon582847382
  • 19,907
  • 5
  • 54
  • 57
  • Since the OP is using Python 2.7, you do not need to place `filter` in `list`. That is only in Python 3.x. –  Mar 14 '14 at 18:35
  • @iCodez Thank you, I have extended my answer. I converted it to a `list` to test my solution (I am using Python 3)- however I figured I would leave my solution as that as it will not cause an error in Python 2, whilst being applicable to a wider Python 3 audience. – anon582847382 Mar 14 '14 at 19:04