What you're trying to do isn't exactly impossible, it's just complicated, and probably wasteful.
If you want to partition an iterable into two iterables, and the source is a list or other reusable iterable, you're probably better off just doing it in two passes, as in your question.
Even if the source is an iterator, if the output you want is a pair of lists rather than a pair of lazy iterators, either use Martijn's answer or do two passes over list(iterator).
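For the eager case, a minimal sketch of the two-pass approach might look like this (the source name and the sample data are made up for illustration):

```python
# Hypothetical one-shot source: a generator can only be consumed once.
source = (x for x in [1, 2, -1, 3, 4, -2])

# Materialize it once so it can be scanned twice.
items = list(source)

positives = [x for x in items if x >= 0]
negatives = [x for x in items if x < 0]

print(positives)  # [1, 2, 3, 4]
print(negatives)  # [-1, -2]
```

The cost is one full copy of the data, which is usually fine when you wanted lists anyway.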
But if you really need to lazily partition an arbitrary iterable into two iterables, there's no way to do that without some kind of intermediate storage.
Let's say you partition [1, 2, -1, 3, 4, -2] into positives and negatives. Now you try to next(negatives). That ought to give you -1, right? But it can't do that without consuming the 1 and the 2. Which means when you try to next(positives), you're going to get 3 instead of 1. So, the 1 and 2 need to get stored somewhere.
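You can see the problem directly with a naive attempt that filters one shared iterator twice. This is only a sketch of the failure mode, not something you'd want to use:

```python
# Naive attempt: two generator expressions filtering one shared iterator.
it = iter([1, 2, -1, 3, 4, -2])
positives = (x for x in it if x >= 0)
negatives = (x for x in it if x < 0)

first_negative = next(negatives)  # skips past 1 and 2 to reach -1
first_positive = next(positives)  # 1 and 2 are already gone

print(first_negative)  # -1
print(first_positive)  # 3, not 1
```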
Most of the cleverness you need is wrapped up inside itertools.tee. If you just make positives and negatives into two teed copies of the same iterator, then filter them both, you're done.
In fact, this is one of the recipes in the itertools docs:
from itertools import filterfalse, tee

def partition(pred, iterable):
    'Use a predicate to partition entries into false entries and true entries'
    # partition(is_odd, range(10)) --> 0 2 4 6 8   and  1 3 5 7 9
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)
(If you can't understand that, it's probably worth writing it out explicitly, with either two generator functions sharing an iterator and a tee via a closure, or two methods of a class sharing them via self. It should be a couple dozen lines of code that doesn't require anything tricky.)
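Here is one way the explicit version could look, as a rough sketch. It uses a pair of deques as the shared storage rather than tee, and the names partition_explicit, gen, falseq and trueq are made up for this example; it is not how tee itself is implemented.

```python
from collections import deque

def partition_explicit(pred, iterable):
    """Lazily split iterable into (false entries, true entries)."""
    it = iter(iterable)
    falseq, trueq = deque(), deque()  # values waiting for the other side

    def gen(own, other, wanted):
        while True:
            # Serve anything the other generator already set aside for us.
            while own:
                yield own.popleft()
            # Otherwise pull one fresh value from the shared iterator.
            try:
                value = next(it)
            except StopIteration:
                return
            if bool(pred(value)) == wanted:
                yield value
            else:
                other.append(value)

    return gen(falseq, trueq, False), gen(trueq, falseq, True)

negatives, positives = partition_explicit(lambda x: x >= 0, [1, 2, -1, 3, 4, -2])
first_negative = next(negatives)  # buffers 1 and 2 in trueq along the way
rest_positives = list(positives)  # drains the buffer, then the source
rest_negatives = list(negatives)

print(first_negative)  # -1
print(rest_positives)  # [1, 2, 3, 4]
print(rest_negatives)  # [-2]
```

Each generator first empties its own queue, and only then reads from the shared iterator, which is what keeps the values in their original order.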
And you can even get partition as an import from a third-party library like more_itertools.
Now, you can use this in a one-liner:
lst = [1, 2, -1, 3, 4, -2]
# partition takes the predicate first and returns (false entries, true entries)
negatives, positives = partition(lambda x: x >= 0, lst)
… and you've got an iterator over all the positive values, and an iterator over all of the negative values. They look like they're completely independent, but together they only do a single pass over lst, so it works even if you assign lst to a generator expression or a file or something instead of a list.
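A quick way to convince yourself of the single pass is to record what actually gets read from a one-shot source. In this sketch, source and pulled are hypothetical names used only for the demonstration:

```python
from itertools import filterfalse, tee

def partition(pred, iterable):
    # Same recipe as in the itertools docs: (false entries, true entries).
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)

pulled = []  # records every value actually read from the source

def source():
    # Hypothetical one-shot source standing in for a file or generator.
    for x in [1, 2, -1, 3, 4, -2]:
        pulled.append(x)
        yield x

negatives, positives = partition(lambda x: x >= 0, source())
pos_list = list(positives)
neg_list = list(negatives)

print(pos_list)  # [1, 2, 3, 4]
print(neg_list)  # [-1, -2]
print(pulled)    # [1, 2, -1, 3, 4, -2], each value read exactly once
```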
So, why isn't there some kind of shortcut syntax for this? Because it would be pretty misleading.
A comprehension takes no extra storage. That's the reason generator expressions are so great—they can transform a lazy iterator into another lazy iterator without storing anything.
But this takes O(N) storage. Imagine all of the numbers are positive, but you try to iterate negatives first. What happens? Every number has to be read from the source and buffered for positives before negatives can report that it's empty. In fact, that O(N) could even be infinite (e.g., try it on itertools.count()).
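The buffering is invisible, but you can observe its effect. In this sketch (the sample data is made up), draining the empty side exhausts the source, yet the other side can still produce every value, so they must have been stored somewhere in between:

```python
from itertools import filterfalse, tee

def partition(pred, iterable):
    # Same recipe as in the itertools docs: (false entries, true entries).
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)

source = (x for x in range(1, 6))  # all positive, one-shot
negatives, positives = partition(lambda x: x >= 0, source)

neg_list = list(negatives)     # finds nothing, but drains the source
leftover = next(source, None)  # None: the source is already exhausted
pos_list = list(positives)     # still yields everything, from tee's buffer

print(neg_list)  # []
print(leftover)  # None
print(pos_list)  # [1, 2, 3, 4, 5]
```

With an infinite source like itertools.count(), that first list(negatives) call would never return, and the buffer would keep growing until memory ran out.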
That's fine for something like itertools.tee, a function stuck in a module that most novices don't even know about, and which has nice docs that can explain what it does and make the costs clear. But doing it with syntactic sugar that made it look just like a normal comprehension would be a different story.