Complement of list comprehension in python

Question

I'm wondering if there is not a way to compute the complement of a list comprehension in Python. Something like:

evens = [i in range(10) if i % 2 == 0]
odds  = [i in range(10) if i % 2 != 0]

is there a way to get both evens and odds in one call? For a very large list, or a more expensive if statement, I think this would save a lot of time.

I would recommend looking into sets: https://docs.python.org/2/library/sets.html — Nicholas Flees, May 13 '14 at 20:50
I'm not sure what you mean by complement. Do you mean a way to divide a list into two lists based on a condition? — David Robinson, May 13 '14 at 20:52
@khachik I'd like to separate the true cases from the false ones, I'd like them to be in separate containers in the end. — Mike, May 13 '14 at 20:52
Sets come equipped with a `difference` method via which the complement of the set is easily accessible. — Nicholas Flees, May 13 '14 at 20:54
I think the best way to solve this problem is just to not use a list comprehension. It would probably scale better (what if you have a condition with more than two return values) and would be less roundabout and consequently probably more efficient. — Waleed Khan, May 13 '14 at 20:55
@WaleedKhan, agree, just use one loop to fill both lists if performance is an issue. — khachik, May 13 '14 at 20:56
Possible duplicate: [Python: split a list based on a condition](http://stackoverflow.com/q/949098/1199226) — itsjeyd, May 13 '14 at 20:56
I was going to post http://pastebin.com/Eqsg8AjV, but question is already locked, unfortunately. My answer doesn't really fit linked question, but fit this one. I leave it here in case someone is interested generator/itertools based approach. — m.wasowski, May 13 '14 at 21:51

merlin2011 · Accepted Answer · 2014-05-13T21:51:12.100

4

I believe this question has been asked before, but I am not finding the link currently.

If you are trying to get more than one predicate and you only want to iterate once over the original generator, then you will have to use a simple for loop.

evens = []
odds = []

for i in xrange(10):
   if i % 2 == 0: evens.append(i)
   else: odds.append(i)

As @dawg pointed out, the logic inside the loop can be made more concise using clever indexing.

for i in xrange(10):
   (evens,odds)[i%2].append(i)

edited May 13 '14 at 21:51

answered May 13 '14 at 20:54

merlin2011

71,677
44
195
329

This does seem to be the most obvious way to get there, no sure why I didn't see it before, thanks. – Mike May 13 '14 at 20:56
1

How about `(evens,odds)[i%2].append(i)` instead of your if/else – dawg May 13 '14 at 21:29
@dawg, That is more concise, but arguably less readable. I will append it though. – merlin2011 May 13 '14 at 21:47

score 1 · Answer 2 · answered May 13 '14 at 20:56

1

itertools.groupby is what I'd use.

In [1]: import itertools as it

In [2]: key = lambda i: i%2 == 0

In [3]: l = list(range(10))

In [4]: l.sort(key=key)

In [5]: [list(i[1]) for i in it.groupby(l, key=key)]
Out[5]: [[1, 3, 5, 7, 9], [0, 2, 4, 6, 8]]

answered May 13 '14 at 20:56

utdemir

26,532
10
62
81

Note that if you're concerned about the `if` condition being expensive (as the OP is), this will have far more `if` evaluations than the original one would. (Sorting requires asymptotically about `n log (n)` comparisons, and then another `n` evaluations have to be evaluated again for the groupby, compared to a total of `2n` comparisons for the original). – David Robinson May 13 '14 at 21:00
@DavidRobinson, AFAIK Python's built-in `sort` function calls `key` argument once for each input; so, it's just `n` conditions. And Python's `timsort` is pretty fast for long ascending/descending runs. – utdemir May 13 '14 at 21:13

score 0 · Answer 3 · answered May 13 '14 at 20:53

0

I would do one of the following:

evens = [i in range(10) if i % 2 == 0]
odds  = [i in range(10) if i not in evens]

Or with better performances:

evens = [i in range(10) if i % 2 == 0]
evens_set = set(evens)
odds = [i in range(10) if i not in evens_set]

Working with set is better in performance, as the not in query costs O(1) instead of O(n) in lists

answered May 13 '14 at 20:53

SomethingSomething

11,491
17
68
126

Sets would be nice here, but I'm often working with unhashable types, like a list of dicts. – Mike May 13 '14 at 20:55
If the list items are not hashable, you'll need to pay the O(n) complexity for each query on the other list, or make some workaround with hashing – SomethingSomething May 13 '14 at 20:59

score 0 · Answer 4 · answered May 13 '14 at 20:58

In short, you can get both True and False cases in one call, but you'd still need to split them into two lists. You could do

range_10 = range(10)
odds = range_10[1::2]
evens = range_10[::2]

but the benefit of that would be negligible. (In fact, you'd be creating three lists instead of two). You'd only want to do that if the cost of range(10) was so high that it would offset creating two lists.

Using slicing like I did should be slightly faster than using a test and explicitly appending.

Complement of list comprehension in python

4 Answers4