is there a way to capture misses on a list comprehension?

Question

Based on a simple list comprehension :

yay = [ i for i in a if a[i] ]
nay = [ i for i in a if not a[i] ]

I am wondering if there is a way to assign both the yay and nay values at once (i.e. the hits and misses on the conditional)?

Something that would look like this

( yay , nay ) = ...

I was curious about this for both readability and speed. (I was a bit surprised to see that two list comprehensions are about 5% faster than a single for-loop that appends to either list.)

update:

The original example was to grab a listing of "true" and "false" valued keys in a dict...

a = {i: i >= 50 for i in range(100)}

yay = [k for k, v in a.items() if v]
nay = [k for k, v in a.items() if not v]

Should that be `[ i for i in a if i ]`? what you have there seems horribly convoluted :) -- An iterable of indices which you then use to index itself... phew! — mgilson, Feb 11 '13 at 19:08
@mgilson Yeah, I'm not really sure what the aim was, I actually presumed it was intended to be a [i for i in a if f(i)] type of thing, where it was just implying an arbitrary function. — Gareth Latty, Feb 11 '13 at 19:09
sorry, i cut the top of the example off. `a` is a dict of true/false values. — Jonathan Vanasco, Feb 11 '13 at 19:10
@JonathanVanasco Then doing `a[i]` doesn't really make sense. — Gareth Latty, Feb 11 '13 at 19:10
@Lattyware -- I think he's looking for all of the keys in a dict which have True/False values associated with them -- `i` was not a good choice for a variable name in that case as it implies an integer ;-) — mgilson, Feb 11 '13 at 19:11
@mgilson I got that, my comment was meant to imply doing `[k for k, v in a.items() if v]` would be preferable. — Gareth Latty, Feb 11 '13 at 19:13
Your list of True and False can be used with `itertools.compress` directly if I understand your problem — JBernardo, Feb 11 '13 at 19:17
@Lattyware -- Yeah, I see where you're going with this now :). — mgilson, Feb 11 '13 at 19:21

Duncan · Answer 1 · 2013-02-12T16:06:36.680

7

The usual solution here is not to get all hung up on the idea of using a list comprehension. Just use a for loop:

yay, nay = [], []
for i in a:
    if somecondition(i):
        yay.append(i)
    else:
        nay.append(i)

If you find yourself doing this a lot then simply move the code out into a function:

def yesno(seq, cond):
    yay, nay = [], []
    for i in seq:
        if cond(i):
            yay.append(i)
        else:
            nay.append(i)
    return yay, nay

yay, nay = yesno(a, lambda x: a[x])

The comments suggest this is slower than the list comprehensions. Passing the condition as a lambda inevitably adds function-call overhead on every item, and I don't think you can do much about that, but some of the performance hit probably comes from looking up the append method on each iteration, and that can be improved:

def yesno(seq, cond):
    yay, nay = [], []
    yes, no = yay.append, nay.append
    for i in seq:
        if cond(i):
            yes(i)
        else:
            no(i)
    return yay, nay

I don't know if that makes much of a difference, but it might be interesting to time it.
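Timing it is straightforward with `timeit`. The sketch below is written for Python 3 and inlines the condition (no lambda), so it only isolates the cost of the `append` lookup against the two list comprehensions from the question. The function names `two_comprehensions`, `loop_plain`, and `loop_bound` are made up for illustration, and the numbers depend on the machine and interpreter, so no results are claimed here.

```python
import timeit

# Sample data matching the question's update: keys 0..99, True for keys >= 50.
a = {i: i >= 50 for i in range(100)}

def two_comprehensions(d):
    yay = [k for k, v in d.items() if v]
    nay = [k for k, v in d.items() if not v]
    return yay, nay

def loop_plain(d):
    yay, nay = [], []
    for k, v in d.items():
        if v:
            yay.append(k)  # attribute lookup on every iteration
        else:
            nay.append(k)
    return yay, nay

def loop_bound(d):
    yay, nay = [], []
    yes, no = yay.append, nay.append  # look up the bound methods once
    for k, v in d.items():
        if v:
            yes(k)
        else:
            no(k)
    return yay, nay

if __name__ == "__main__":
    for fn in (two_comprehensions, loop_plain, loop_bound):
        seconds = timeit.timeit(lambda: fn(a), number=10000)
        print("%-20s %.4f s" % (fn.__name__, seconds))
```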

In the comments @martineau suggests using a generator and consuming it with any(). I'll include that here, but I would replace any with the itertools recipe to consume an iterator:

import collections
from itertools import islice

def consume(iterator, n=None):
    "Advance the iterator n-steps ahead. If n is None, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        collections.deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)

and then you can write:

yay, nay = [], []
consume((yay if a[i] else nay).append(i) for i in a)

edited Feb 12 '13 at 16:06

answered Feb 11 '13 at 19:10



+1, Not sure why you got a down-vote, this is actually a good solution here. It only does one loop over the list instead of two, which could make it better in some cases, and it's inherently pretty readable. – Gareth Latty Feb 11 '13 at 19:14
+1. I used a `defaultdict` variant of this to answer a similar question: `partition(a, a.get)` would do it -- see [here](http://stackoverflow.com/questions/12720151/simple-way-to-group-items-into-buckets). – DSM Feb 11 '13 at 19:15
Oh, as a side note, for the function variant, I'd make it a generator, rather than building lists. – Gareth Latty Feb 11 '13 at 19:16
Thanks @Lattyware. Maybe the downvote was because the question asked for a list comprehension as an answer, but the other answers don't use a list comprehension either – Duncan Feb 11 '13 at 19:16
I thought about generator, but you want both lists. You could do a function returning two generators but then you need to use `tee` or equivalent to buffer the results. I think here it pays not to be too clever. – Duncan Feb 11 '13 at 19:18
The question does not ask for a list-comprehension. But I think the OP already tried this solution and noted that it is slower than two list-comprehensions: " I was a bit surprised to see two list comprehensions are about 5% faster than a single for-loop that appends to either list" – Bakuriu Feb 11 '13 at 19:18
i think the lambda approach is along the lines of what i was looking for. I have a handful of these comparisons in my code. the list comprehensions have been the most readable, which is why i've gone with them so far. – Jonathan Vanasco Feb 11 '13 at 19:18
As OP mentioned, using LC might have a performance edge over using loop and append. – Abhijit Feb 11 '13 at 19:19
@Abhijit Not if using the LC means doing two loops over the data, when this only does one. It depends if there is more overhead from the repeated `list.append()`s, or from doing two separate loops over the data. The more data, the more likely this is a better solution. – Gareth Latty Feb 11 '13 at 19:23
I've updated my answer to suggest pulling the `x.append` lookup out of the list. I haven't timed it though so I don't know how much effect it would have. – Duncan Feb 11 '13 at 19:24
Speed wasn't a concern, just a curiosity - on an older machine it takes 10k benches to account for a .03 difference in timing. That sort of speed discrepancy is beyond irrelevant. – Jonathan Vanasco Feb 11 '13 at 19:27
@JonathanVanasco If speed isn't a concern, then definitely just go for your original version (or, the slightly better version as suggested in the comments) - it's clear and simple. – Gareth Latty Feb 11 '13 at 19:28
yeah i am. i was wondering if there were other ways , and if they would be more readable. clearly there are, and clearly they're not. – Jonathan Vanasco Feb 11 '13 at 19:29
Jonathan Vanasco: something like `[(yay if somecondition(i) else nay).append(i) for i in a]` is shorter and arguably more readable. So your original example could be rewritten: `yay, nay = [], []; [(yay if a[i] else nay).append(i) for i in a]`. – martineau Feb 11 '13 at 20:28
@martineau, but that builds 3 lists and one of those is simply discarded. Not nice. – Duncan Feb 12 '13 at 08:41

@Duncan: Building the throw-away list can be avoided with `yay, nay = [], []; any((yay if a[i] else nay).append(i) for i in a)`. – martineau Feb 12 '13 at 15:45
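For comparison, the lazy variant discussed in the comments above (two generators buffered by `tee`) could look roughly like the sketch below. It follows the shape of the `partition` recipe in the `itertools` documentation rather than anything posted in this thread; the name `partition` and the sample dict are assumptions for illustration, and the code targets Python 3.

```python
from itertools import filterfalse, tee

def partition(pred, iterable):
    """Return (misses, hits) as two lazy iterators over the same input."""
    # tee buffers items until both branches have consumed them.
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)

a = {i: i >= 50 for i in range(100)}

# a.get maps each key to its True/False value, so it serves as the predicate.
nay_iter, yay_iter = partition(a.get, a)
yay, nay = list(yay_iter), list(nay_iter)
```

Note that the predicate is called twice per item here, once in each branch, which is part of why the plain loop is hard to beat.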

Abhijit · Answer 2 · 2013-02-11T19:24:09.503

4

I would still say your way of doing it is more readable and should be the suggested approach, but if you are looking for an alternative, here is a solution using itertools:

>>> from itertools import compress, imap
>>> from operator import not_
>>> yay, nay = compress(a,a.values()), compress(a, imap(not_,a.values()))
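The snippet above is Python 2. A Python 3 rendering might look like the following sketch, where the built-in `map` replaces `imap`. Note that `compress` returns lazy iterators, so `list()` is needed to get actual lists; the sample dict is taken from the question's update.

```python
from itertools import compress
from operator import not_

a = {i: i >= 50 for i in range(100)}

# Iterating a dict yields its keys in the same order as a.values(),
# so the values can act directly as the selector for the keys.
yay = list(compress(a, a.values()))
nay = list(compress(a, map(not_, a.values())))
```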

edited Feb 11 '13 at 19:24

answered Feb 11 '13 at 19:17



You need to make that `compress(a, a.values())`, and I'd argue the second would be more readable as `compress(a, (not i for i in a.values()))`. – Gareth Latty Feb 11 '13 at 19:20
@Lattyware: Yes Typo corrected :-). Well regarding your suggested alternative, I am bit skeptical but you are free to edit this answer, with your proposal. – Abhijit Feb 11 '13 at 19:25

Gareth Latty · Answer 3 · 2013-02-11T19:25:24.973

0

This could be done with something like this:

yay, nay = zip(*[(k, None) if v else (None, k) for k, v in a.items()])
yay, nay = filter(None, yay), filter(None, nay)

As to if it would be faster... maybe for huge lists. Otherwise, it's probably not going to matter.

Naturally, filter(None, ...) drops every falsy item, not just None, so if your keys can be falsy (0, '', None), you will need to swap None out for another sentinel and filter with an identity check instead.
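A sentinel-based variant might look like the sketch below (Python 3). The name `MISSING` and the sample dict are assumptions chosen to show falsy keys surviving; a list comprehension with `is not` stands in for `filter()` because the check has to be by identity.

```python
# A unique placeholder object that cannot collide with any real key.
MISSING = object()

a = {None: True, "x": False, 0: True, "y": False}

pairs = [(k, MISSING) if v else (MISSING, k) for k, v in a.items()]
yay_raw, nay_raw = zip(*pairs)

# Identity checks keep falsy keys such as None and 0.
yay = [k for k in yay_raw if k is not MISSING]
nay = [k for k in nay_raw if k is not MISSING]
```

One caveat: if `a` is empty, `zip(*pairs)` produces nothing and the two-name unpacking raises `ValueError`, so that case would need its own guard.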

edited Feb 11 '13 at 19:25

answered Feb 11 '13 at 19:05


Sure, it's different, but IMHO it's much less clear than the original code. – Geoff Reedy Feb 11 '13 at 19:10
@GeoffReedy It's marginally less readable, yes, and I'd agree that in general, degraded performance but more readable is a better idea. For extremely large lists in a performance sensitive environment though, this might be better. – Gareth Latty Feb 11 '13 at 19:12

kojiro · Answer 4 · 2013-02-11T19:13:20.667

You might be able to use a dict comprehension, but I'm reasonably sure you can't* use a list comprehension to do what you ask. Assuming the data is or can be sorted**, I would probably use itertools.groupby.

results = itertools.groupby(sorted_a, bool)

*Qualification: OK, Lattyware's answer shows that you can, but it also generates a tuple with a None value for each member of the iterable. IMO that's a lot of waste. While I confess I didn't even consider that, I'm not ashamed that I didn't.

**Sorted: It needs to be sorted by the same key as it's grouped by.
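Applied to the dict from the question, the `groupby` idea could be sketched as follows (Python 3). Using `a.get` as both the sort key and the grouping key is an assumption made here to fit the True/False-valued dict; the one-liner above groups by the truthiness of the items themselves.

```python
from itertools import groupby

a = {i: i >= 50 for i in range(100)}

# groupby only merges adjacent items, so sort by the same key it groups by.
# False sorts before True, and sorted() is stable within each group.
sorted_keys = sorted(a, key=a.get)
groups = {flag: list(keys) for flag, keys in groupby(sorted_keys, key=a.get)}

# .get with a default covers the case where one group is empty.
yay = groups.get(True, [])
nay = groups.get(False, [])
```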

asermax · Answer 5 · 2013-02-11T23:11:27.147

0

It's not pretty, but you could do something along these lines:

nay = []
yay = [foo for foo in foos if condition or nay.append(foo)]

This takes advantage of the short-circuit on the or operator.
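A runnable version of the same trick, with a made-up condition (`is_even` is hypothetical and stands in for whatever test applies to each item):

```python
def is_even(n):  # hypothetical condition, for illustration only
    return n % 2 == 0

foos = [1, 2, 3, 4, 5]

nay = []
# When is_even(foo) is True, `or` short-circuits and foo is kept in yay.
# Otherwise nay.append(foo) runs; it returns None (falsy), so foo is
# filtered out of yay while still being recorded in nay.
yay = [foo for foo in foos if is_even(foo) or nay.append(foo)]
```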

edited Feb 11 '13 at 23:11

answered Feb 11 '13 at 22:23


eyquem · Answer 6 · 2013-02-12T19:31:31.907

EDIT

Oh well, I wrote a solution that was pretty much the same as one of Duncan's, so I deleted it. What remains is what I consider the best solution, combining one of Duncan's solutions with martineau's proposal. Using any() seems much preferable to list() or a list comprehension like the one I had written; any() is a very good idea, and IMO better than the complication of importing consume().

def disting(L):
    dust, right = [], []
    dustapp = dust.append
    rightapp = right.append
    any(rightapp(x) if x else dustapp(x) for x in L)
    return right, dust

for seq in ((10,None,'a',0,None,45,'qada',False,True,0,456),
            [False,0,None,104,True,str,'',88,'AA',__name__]):
    yay,nay = disting(seq)     
    print 'seq == %r\nyay == %r\nnay == %r' % (seq,yay,nay)
    print '---------------------------------------'

result

seq == (10, None, 'a', 0, None, 45, 'qada', False, True, 0, 456)
yay == [10, 'a', 45, 'qada', True, 456]
nay == [None, 0, None, False, 0]
---------------------------------------
seq == [False, 0, None, 104, True, <type 'str'>, '', 88, 'AA', '__main__']
yay == [104, True, <type 'str'>, 88, 'AA', '__main__']
nay == [False, 0, None, '']
---------------------------------------

By the way, using any() works because rightapp(x) and dustapp(x) return None. If either returned True or any other truthy value, the iteration inside any() would stop!
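That early exit is easy to demonstrate. In the sketch below (Python 3), `record` is a hypothetical appender that returns a truthy value, unlike `list.append`:

```python
hits = []

def record(x):
    """Hypothetical appender that, unlike list.append, returns a truthy value."""
    hits.append(x)
    return True

# any() stops at the first truthy result, so only the first item is processed.
any(record(x) for x in [1, 2, 3])

safe = []
# list.append returns None, so any() runs through the whole generator.
any(safe.append(x) for x in [1, 2, 3])
```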
