
I have a length-N list of floats called vals and a length-N list of 0s and 1s called bits. I want to extract two lists: the elements of vals that correspond to the 0s in bits, and the remaining elements that correspond to the 1s in bits. I'm currently doing:

valsbits = zip(vals,bits)
els0 = [v for v,b in valsbits if b == 0]
els1 = [v for v,b in valsbits if b == 1]

but there must be a better way. Also, I'm doing this for many different bits vectors, so there may be a clever way to do this entire operation.

Big Dogg
    What's wrong with what you have? – arshajii Sep 10 '13 at 19:11
  • I get the feeling it's not the most efficient solution. – Big Dogg Sep 10 '13 at 19:12
  • 2
    You should measure its performance to be certain that it is inefficient before taking steps to change it. However, nothing about this strikes me as inefficient at all. – arshajii Sep 10 '13 at 19:14
  • I have. One thing that strikes me as inefficient is that once I know which elements of vals correspond to the 0s in bits, I know which elements correspond to the 1s, so I shouldn't need to do the whole O(N) walk again. – Big Dogg Sep 10 '13 at 19:18
  • O(N) + O(N) = O(N). You don't save anything complexity-wise by only iterating once. And as mentioned, you really should benchmark if performance is the issue. – Robert Jørgensgaard Engdahl Sep 10 '13 at 20:39
  • In worst case analysis yes, but in reality I'd like to only do N operations if I can avoid doing 2*N operations. – Big Dogg Sep 10 '13 at 21:07

4 Answers


You can use itertools.compress, which yields the elements of a sequence that correspond to truthy values in the selector.

However, selecting the elements for the zeros requires a second pass over bits with each value inverted, which ends up as:

from itertools import compress
from operator import not_

true_values = list(compress(sequence, bits))
false_values = list(compress(sequence, map(not_, bits)))
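As a concrete illustration (the sample vals/bits data here is made up):

```python
from itertools import compress
from operator import not_

vals = [0.5, 1.5, 2.5, 3.5]
bits = [1, 0, 0, 1]

# compress keeps the elements whose selector entry is truthy
ones = list(compress(vals, bits))              # elements matching the 1s
zeros = list(compress(vals, map(not_, bits)))  # elements matching the 0s
print(ones)   # [0.5, 3.5]
print(zeros)  # [1.5, 2.5]
```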

I believe using a simple for loop would be easier and faster, since it does a single iteration:

true_values = []
false_values = []
for bit, val in zip(bits, values):
    if bit:
        true_values.append(val)
    else:
        false_values.append(val)
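Since the question mentions doing this for many different bits vectors, the loop is easy to wrap in a small helper (a sketch; the name partition_by_bits is made up):

```python
def partition_by_bits(vals, bits):
    """Split vals into (zeros, ones) according to the 0/1 entries of bits."""
    zeros, ones = [], []
    for bit, val in zip(bits, vals):
        if bit:
            ones.append(val)
        else:
            zeros.append(val)
    return zeros, ones

# one call per bits vector
zeros, ones = partition_by_bits([0.5, 1.5, 2.5], [1, 0, 1])
print(zeros, ones)  # [1.5] [0.5, 2.5]
```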

Just as a curiosity, here are some micro-benchmarks of the various solutions:

In [12]: import random

In [13]: value = 'a' * 17000

In [14]: selectors = [random.randint(0, 1) for _ in range(17000)]

In [15]: %%timeit
    ...: true_values = [v for v,b in zip(value, selectors) if b == 1]
    ...: false_values = [v for v,b in zip(value, selectors) if b == 0]
    ...: 
100 loops, best of 3: 2.56 ms per loop

In [16]: %%timeit
    ...: true_values = []
    ...: false_values = []
    ...: for bit,val in zip(selectors, value):
    ...:     if bit:
    ...:         true_values.append(val)
    ...:     else:
    ...:         false_values.append(val)
    ...: 
1000 loops, best of 3: 1.87 ms per loop

In [17]: %%timeit
    ...: res = {}
    ...: for val, bit in zip(value, selectors):
    ...:     res.setdefault(bit, []).append(val)
    ...: true_values, false_values = res.get(1, []), res.get(0, [])
    ...: 
100 loops, best of 3: 3.73 ms per loop

In [18]: from collections import defaultdict

In [19]: %%timeit
    ...: res = defaultdict(list)
    ...: for val, bit in zip(value, selectors):
    ...:     res[bit].append(val)
    ...: true_values, false_values = res.get(1, []), res.get(0, [])
    ...: 
100 loops, best of 3: 2.05 ms per loop

In [26]: %%timeit  # after conversion to numpy arrays
    ...: true_values = values[selectors == 1]
    ...: false_values = values[selectors == 0]
    ...: 
1000 loops, best of 3: 344 us per loop
In [31]: %%timeit
    ...: res = [[], []]
    ...: for val, bit in zip(value, selectors):
    ...:     res[bit].append(val)
    ...: true_values, false_values = res
    ...: 
100 loops, best of 3: 2.09 ms per loop
In [34]: from operator import not_

In [35]: %%timeit
    ...: true_values = list(compress(value, selectors))
    ...: false_values = list(compress(value, map(not_, selectors)))
    ...: 
1000 loops, best of 3: 1.44 ms per loop

Obviously numpy is much faster than the rest, assuming that you can replace the Python lists with numpy arrays.
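If the data starts out as plain lists, the conversion is a single np.asarray call per list (a sketch with made-up data):

```python
import numpy as np

vals = [0.5, 1.5, 2.5, 3.5]
bits = [1, 0, 0, 1]

# convert once, then use boolean-mask indexing
values = np.asarray(vals)
selectors = np.asarray(bits)

true_values = values[selectors == 1]   # elements matching the 1s
false_values = values[selectors == 0]  # elements matching the 0s
print(true_values)   # [0.5 3.5]
print(false_values)  # [1.5 2.5]
```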

It seems like itertools.compress is the fastest non-third-party solution, at 1.44 ms. The second fastest is the plain for loop with an if-else, at 1.87 ms, with the other solutions taking slightly more than 2 ms.

Increasing the number of elements, the only change I see is that Jon Clements's defaultdict(list) solution and newtower's [[], []] solution become marginally faster than the plain for+if-else (about 2% faster at 500,000 elements). compress is still 30% faster than the others, and numpy is still about 4x faster than compress.

Does this difference matter to you? If not (and do profile to check whether this is a bottleneck at all!), I'd simply use the most readable solution, which is pretty much subjective and up to you.


A last remark on the timings I've obtained:

Even though both compress and your double list comprehension iterate over the list twice, one is the fastest non-third-party solution and the other is the slowest. Here you can see the difference between a "python-level" (explicit) loop and a "C-level" (implicit) loop.

itertools.compress is implemented in C, which allows it to iterate without much of the interpreter overhead. As you can see, this makes a huge difference.

You can see this even more in the numpy solution, which also performs two iterations instead of one. In this case, not only is the loop at the C level, but it also completely avoids calling Python APIs to iterate over the arrays, since numpy has its own C data types.

This is pretty much a rule of thumb in CPython: to improve performance, try to replace explicit loops with implicit ones, using built-in functions or functions defined in C extensions.

Guido van Rossum is well aware of this; see his An Optimization Anecdote.

You can find another such example in this SO question (disclaimer: the accepted answer is mine; it exploits bisection search and string equality, i.e. the C-level built-in, to obtain a solution faster than a pure-Python linear search).

Bakuriu
  • `compress()` is only half a solution. It won't return the `false_values`. – Sven Marnach Sep 10 '13 at 19:20
  • @SvenMarnach I believe you did *not* read the code I wrote that uses `compress`. I just double checked on ipython and it *does* return the elements that correspond to `0`s and `1`s correctly. – Bakuriu Sep 10 '13 at 19:22
  • I commented on the first version of your answer. I'd suggest `map(operator.not_, bits)` instead of the list comprehension to improve performance. – Sven Marnach Sep 10 '13 at 19:54
  • Thanks a lot, this is what I was looking for. I did the benchmarking before I asked the question (hence the question) and using numpy arrays yielded a huge speedup. Thanks for all the benchmark comparisons and the pointer to ``compress``. – Big Dogg Sep 10 '13 at 22:40

You can try using numpy arrays for better performance:

els0 = vals[bits == 0]
els1 = vals[bits == 1]
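Since the question mentions repeating this for many different bits vectors, one possible extension (my sketch, not part of the original answer) is to keep vals as a single array and loop over the masks; each comparison and indexing step still runs in C:

```python
import numpy as np

vals = np.array([0.5, 1.5, 2.5, 3.5])
# each row of all_bits is one bits vector
all_bits = np.array([[0, 1, 0, 1],
                     [1, 1, 0, 0]])

# one (els0, els1) pair per bits vector
results = [(vals[row == 0], vals[row == 1]) for row in all_bits]
print(results[0][0])  # [0.5 2.5]
print(results[0][1])  # [1.5 3.5]
```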
Lev Levitsky

You could use the following (or just use a collections.defaultdict(list)):

res = {}
for val, bit in zip(vals, bits):
    res.setdefault(bit, []).append(val)
zeros, ones = res.get(0, []), res.get(1, [])

It only scans the list once, and it generalizes to grouping on more than just 0/1 values, but it does require auxiliary storage for the new lists.
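The collections.defaultdict(list) variant mentioned above would look like this (same idea, made-up sample data):

```python
from collections import defaultdict

vals = [0.5, 1.5, 2.5, 3.5]
bits = [0, 1, 0, 1]

# each value lands in the bucket keyed by its bit
res = defaultdict(list)
for val, bit in zip(vals, bits):
    res[bit].append(val)

zeros, ones = res[0], res[1]
print(zeros)  # [0.5, 2.5]
print(ones)   # [1.5, 3.5]
```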

Jon Clements
[val for idx, val in enumerate(values) if bits[idx]]

will get you the subset of values matching the 1s. To get a list with the two subsets (0 and 1s), you could write

true_vals, false_vals = [[val for idx, val in enumerate(values) if bits[idx]==bit] for bit in [1, 0]]

or

true_vals, false_vals = [[val[0] for val in zip(values, bits) if val[1]==bit] for bit in [1, 0]]
Steve K
  • What about a nested list comprehension like in my second example? No one else seems to avoid doing the same thing twice (that is, writing one line per bit value). – Steve K Sep 10 '13 at 19:35