
I have a regular list called a, and a NumPy array of indices b.
(No, it is not possible for me to convert a to a NumPy array.)

Is there any way for me to get the same effect as `a[b]` efficiently? To be clear, this implies that I don't want to extract every individual int in `b` due to the performance implications.

(Yes, this is a bottleneck in my code. That's why I'm using NumPy arrays to begin with.)
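For concreteness, a minimal reproduction of the situation (the values here are made up; only the types matter):

```python
import numpy as np

a = ['x', None, 'y', 'z']   # a plain Python list that must stay a list
b = np.array([2, 0, 3])     # NumPy array of indices into a

# a[b] raises TypeError: only integer scalar arrays can be converted to an index.
# The desired result is what a[b] would mean under fancy indexing:
result = [a[i] for i in b]  # ['y', 'x', 'z']
```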

user541686
  • I don't know how much faster (if at all) `operator.itemgetter()` would be. – Ignacio Vazquez-Abrams Jul 31 '16 at 00:18
  • What is your plan for (what would be) `a[b]`? It's hard to imagine a use for it that won't "extract an individual `int` for every integer in `b`"... eventually. If you're concerned about wasting space having a list and a sublist hanging around simultaneously, it seems that you could iterate (or whatever) over `b` at the time of need, instead of the (would be) `a[b]`. – jedwards Jul 31 '16 at 00:33
  • @jedwards: My sentence was a little ambiguous (fixed), but what I was saying was that I'm trying to avoid extracting the individual elements of `b` (I don't need them and it slows my code down). I'm using the extracted elements of `a` afterward (e.g. looking at which ones are `None`, etc... it's not really relevant), but that hardly implies I need to extract `b`'s elements manually in the process. – user541686 Jul 31 '16 at 00:43
  • `itemgetter` creates a class instance with a callable. I think its speed improvement comes from a simpler interpretation or calling stack. It does not use any special C code (that I can see). I get, at best, a 2x speed improvement. – hpaulj Jul 31 '16 at 00:59
  • For me (Python3, Numpy 1.10.4) -- `itemgetter` seems the fastest, a list comprehension over `b` being ~56% slower, and a comprehension over `np.nditer(b)` being ~134% slower. – jedwards Jul 31 '16 at 01:00
  • `nditer` isn't a speed help, even when used for real numpy cases. It's a stepping stone toward compiled C code (e.g. via Cython). – hpaulj Jul 31 '16 at 01:26

2 Answers

import operator
import numpy as np

a = list(range(1000000))
b = np.random.randint(0, len(a), 10000)

%timeit np.array(a)[b]
10 loops, best of 3: 84.8 ms per loop

%timeit [a[x] for x in b]
100 loops, best of 3: 2.93 ms per loop

%timeit operator.itemgetter(*b)(a)
1000 loops, best of 3: 1.86 ms per loop

%timeit np.take(a, b)
10 loops, best of 3: 91.3 ms per loop

I had high hopes for numpy.take(), but it is far from optimal here. I tried some Numba solutions as well, and they yielded similar times, around 92 ms.

So a simple list comprehension is not far from the best here, but operator.itemgetter() wins, at least for input sizes at these orders of magnitude.
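A quick sketch of the itemgetter approach, including one edge case worth knowing about (the values here are arbitrary):

```python
import operator
import numpy as np

a = list(range(100))
b = np.array([5, 0, 42])

# itemgetter(*b) builds a single callable that fetches all the indices
# in one call, returning a tuple of the selected elements
picked = list(operator.itemgetter(*b)(a))  # [5, 0, 42]

# Edge case: with exactly one index, itemgetter returns the bare element
# rather than a 1-tuple, so generic code needs a length check on b
single = operator.itemgetter(*np.array([7]))(a)  # 7, not (7,)
```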

John Zwinck

Write a Cython function:

import cython
from cpython cimport PyList_New, PyList_SET_ITEM, Py_INCREF

@cython.wraparound(False)
@cython.boundscheck(False)
def take(list alist, Py_ssize_t[:] arr):
    cdef:
        Py_ssize_t i, idx, n = arr.shape[0]
        list res = PyList_New(n)
        object obj

    for i in range(n):
        idx = arr[i]
        obj = alist[idx]
        Py_INCREF(obj)               # PyList_SET_ITEM steals a reference
        PyList_SET_ITEM(res, i, obj)

    return res
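The function above lives in a `.pyx` module and must be compiled before use. One minimal way to do that (assuming Cython is installed and the code is saved as `take_list.pyx`; the filename and the use of pyximport are my assumptions, not part of the answer) is:

```python
# Build sketch, not part of the original answer: compiles take_list.pyx
# on first import. Assumes Cython is installed and take_list.pyx exists.
import pyximport
pyximport.install(language_level=3)

from take_list import take  # the Cython take() defined above
```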

The result of %timeit:

import numpy as np

al = list(range(10000))
aa = np.array(al)

ba = np.random.randint(0, len(al), 10000)
bl = ba.tolist()

%timeit [al[i] for i in bl]
%timeit np.take(aa, ba)
%timeit take(al, ba)

1000 loops, best of 3: 1.68 ms per loop
10000 loops, best of 3: 51.4 µs per loop
1000 loops, best of 3: 254 µs per loop

numpy.take() is the fastest if both arguments are ndarray objects. The Cython version is about 6x faster than the list comprehension (254 µs vs. 1.68 ms).
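For reference, the Cython `take()` is semantically just a preallocate-and-fill loop; a pure-Python equivalent (my sketch, useful as a correctness check, though of course much slower) would be:

```python
import numpy as np

def take_py(alist, arr):
    # same semantics as the Cython take(): preallocate, then fill by index
    res = [None] * len(arr)
    for i, idx in enumerate(arr):
        res[i] = alist[idx]
    return res

al = list(range(10000))
ba = np.random.randint(0, len(al), 10000)
assert take_py(al, ba) == [al[i] for i in ba]
```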

HYRY
  • Nice. Now I wonder why my Numba attempts were so slow. If you know Numba, can you try to make a Numba function that's faster than the list comprehension? I couldn't. – John Zwinck Jul 31 '16 at 01:49
  • @JohnZwinck, Numba makes an internal copy of the list: http://numba.pydata.org/numba-doc/dev/reference/pysupported.html#list – HYRY Aug 01 '16 at 11:05