Keep in mind this is Python, not C, and the way it works is not always intuitive. That means you have to verify your assumptions. I've done that here using IPython's built-in %timeit and %%timeit magics.
a = [1, 2, 2, 3, 4, 5]
%timeit b = [x for x in a if x > 2]
1000000 loops, best of 3: 362 ns per loop
%timeit c = [x for x in a if not (x > 2)]
1000000 loops, best of 3: 371 ns per loop
So just under 800 ns for both, but we're iterating over the list twice. Surely we don't need to do that? How about a couple of the methods above? Let's start with classify, which is pretty straightforward and only walks the list once:
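classify was introduced earlier in this post; in case you're reading this section on its own, here's a minimal single-pass sketch of it, reconstructed from how it's called below (not necessarily the exact original):

```python
def classify(iterable, predicate):
    """Split iterable into (matches, non-matches) in a single pass."""
    matches, rest = [], []
    for item in iterable:
        if predicate(item):
            matches.append(item)
        else:
            rest.append(item)
    return matches, rest
```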
%timeit b, c = classify(a, lambda x: x > 2)
1000000 loops, best of 3: 1.89 µs per loop
Wow, even though it only walks over the list once, it takes more than twice as long as the simple two-comprehension solution above, which walks it twice. Let's try a slight variation on the other solution proposed above:
%%timeit
b, c = [], []
for x in a:
    b.append(x) if x > 2 else c.append(x)
1000000 loops, best of 3: 1.2 µs per loop
Better, but it's still slower than our 'naive'/'inefficient' implementation. Maybe a slightly different formulation would do better:
%%timeit
b, c = [], []
for x in a:
    if x > 2:
        b.append(x)
    else:
        c.append(x)
1000000 loops, best of 3: 983 ns per loop
Hmm, that's almost the same, just a little faster. It still doesn't beat the naive implementation. Let's get really clever and see if we can make it even faster:
%%timeit
b, c = [], []
for x in a:
    (c, b)[x > 2].append(x)
1000000 loops, best of 3: 1.28 µs per loop
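That indexing trick works because bool is a subclass of int in Python: False is 0 and True is 1, so the result of a comparison can select an element of a tuple directly. A small standalone illustration (using neutral names low/high for clarity):

```python
low, high = [], []
for x in [1, 2, 2, 3, 4, 5]:
    # x > 2 is False -> index 0 -> low; True -> index 1 -> high
    (low, high)[x > 2].append(x)

print(low)   # [1, 2, 2]
print(high)  # [3, 4, 5]
```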
So, what we saw is that despite our desire not to iterate over the list twice, we don't seem to be able to improve on just doing two list comprehensions. List comprehensions are implemented 'under the hood' in Python, and so for many things they are going to be faster than anything you come up with in pure Python.
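One way to see part of why (a sketch; the exact bytecode varies by Python version): a comprehension compiles down to a dedicated LIST_APPEND opcode, while the explicit loop has to look up and call the append method on every iteration. You can check with the standard dis module:

```python
import dis

def comp(a):
    # same shape as the comprehensions timed above
    return [x for x in a if x > 2]

# Look for the LIST_APPEND opcode in the disassembly; an explicit-loop
# version instead shows an attribute lookup plus a method call per element.
dis.dis(comp)
```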
Now, the x > 2 comparison is cheap - no function calls, simple math. There will be a point where a more expensive test makes the single-pass approaches worthwhile. Let's create a more expensive comparison function - we'll calculate the factorial (and do it inefficiently):
def factorial(x):
    if x in (0, 1):
        return 1
    else:
        return x * factorial(x - 1)
%timeit b = [x for x in a if factorial(x) > 6]
100000 loops, best of 3: 3.47 µs per loop
%timeit c = [x for x in a if not factorial(x) > 6]
100000 loops, best of 3: 3.53 µs per loop
So, obviously the times went up a good bit - we're now at about 7 µs for the two comprehensions combined.
Let's try some of our other examples now:
%timeit b, c = classify(a, lambda x: factorial(x) > 6)
100000 loops, best of 3: 5.05 µs per loop
%%timeit
b, c = [], []
for x in a:
    if factorial(x) > 6:
        b.append(x)
    else:
        c.append(x)
100000 loops, best of 3: 4.01 µs per loop
%%timeit
b, c = [], []
for x in a:
    (c, b)[factorial(x) > 6].append(x)
100000 loops, best of 3: 4.59 µs per loop
The lesson from all of this: when dealing with Python and efficiency, it's usually a good idea to just try it out in a console. Very often the naive implementation is reasonably performant and the easiest to read, and a quick test will tell you whether trying to optimize is actually worth it. You can often make your code less readable AND slower if you're not careful.
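If you're not in IPython, you can reproduce this kind of measurement with the standard library's timeit module (the number of executions and repeats here are just illustrative choices):

```python
import timeit

setup = "a = [1, 2, 2, 3, 4, 5]"
stmt = "b = [x for x in a if x > 2]; c = [x for x in a if not (x > 2)]"

# timeit.repeat returns total seconds for each run of `number` executions;
# take the best (minimum), as %timeit does
best = min(timeit.repeat(stmt, setup=setup, number=100000, repeat=3))
print("%.0f ns per loop" % (best / 100000 * 1e9))
```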
Addendum: Someone commented that we need longer lists, because we're measuring function call overhead more than performance. They have a good point, but the times show about the same relationship on my machine:
In [16]: a = range(100000)
In [17]: random.shuffle(a)
In [18]: %timeit b = [x for x in a if x > 50000]
100 loops, best of 3: 5.2 ms per loop
In [19]: %timeit c = [x for x in a if not x > 50000]
100 loops, best of 3: 5.18 ms per loop
In [20]: %timeit c = [x for x in a if x <= 50000]
100 loops, best of 3: 5.35 ms per loop
In [21]: %%timeit
   ....: b, c = [], []
   ....: for x in a:
   ....:     if x > 50000:
   ....:         b.append(x)
   ....:     else:
   ....:         c.append(x)
   ....:
100 loops, best of 3: 12.7 ms per loop
Note that if I change the comparison to x > 2 (instead of x > 50000), the explicit loop speeds up to about 11.4 ms. But that's still no faster than the naive implementation's two passes (about 10.4 ms combined). That said, the explicit loop is probably the one I'd prefer: it's still easy to read, and it's not slower (or not significantly so). And code tends to get more complex over time, so when you later add another comparison, or change the test to a function call, it's easier to do, and performance will suffer less in this version than in the naive one.
But again: this isn't to say 'use list comprehensions' (though they are very often a good idea) - the point is to verify your assumptions.