In the following code, I assume I have two generators yielding sorted and comparable values, and I want to make a generator that yields "synchronized" pairs from the two. By synchronized I mean yielding from both when they yield the same value, advancing only the "delayed" one otherwise (pairing what it yields with None).
from itertools import repeat

def generate_pairs(g1, g2):
    try:
        n1 = next(g1)
    except StopIteration:
        yield from zip(repeat(None), g2)
        # A
        # raise StopIteration
    try:
        n2 = next(g2)
    except StopIteration:
        yield from zip(g1, repeat(None))
        # A
        # raise StopIteration
    while True:
        if n1 > n2:
            yield (None, n2)
            try:
                n2 = next(g2)
            except StopIteration:
                yield (n1, None)
                yield from zip(g1, repeat(None))
                # B
                # raise StopIteration
        elif n1 < n2:
            yield (n1, None)
            try:
                n1 = next(g1)
            except StopIteration:
                yield (None, n2)
                yield from zip(repeat(None), g2)
                # B
                # raise StopIteration
        else:
            yield (n1, n2)
            try:
                n1 = next(g1)
            except StopIteration:
                yield from zip(repeat(None), g2)
                # C
                # raise StopIteration
            try:
                n2 = next(g2)
            except StopIteration:
                yield from zip(g1, repeat(None))
                # C
                # raise StopIteration
Where should I explicitly raise StopIteration?
With the code as shown (all the raise statements commented out), testing with two already synchronized generators suggests that raising is required in case C.
pairs = generate_pairs((n1 for n1 in [1, 2, 3]), (n2 for n2 in [1, 2, 3]))
The above can go on yielding the last pair (3, 3) forever:
from cytoolz import take
list(take(10, pairs))
Output:
[(1, 1),
(2, 2),
(3, 3),
(3, 3),
(3, 3),
(3, 3),
(3, 3),
(3, 3),
(3, 3),
(3, 3)]
In B too, it seems a manual StopIteration should be raised:
pairs = generate_pairs((n1 for n1 in [1, 3]), (n2 for n2 in [1, 2]))
list(take(10, pairs))
Output:
[(1, 1),
(None, 2),
(3, None),
(None, 2),
(3, None),
(None, 2),
(3, None),
(None, 2),
(3, None),
(None, 2)]
And from the test below, it seems to me that some way of ending the generator is required at A too:
pairs = generate_pairs((_ for _ in []), (n2 for n2 in [1, 2, 3]))
list(take(10, pairs))
Output:
UnboundLocalError Traceback (most recent call last)
<ipython-input-96-61eb4df81d52> in <module>
----> 1 list(take(10, pairs))
<string> in generate_pairs(g1, g2)
UnboundLocalError: local variable 'n1' referenced before assignment
However, if I uncomment all the raise StopIteration statements in the code, I need to handle the resulting exceptions manually. They are not automatically handled in for loops, for instance.
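To make the symptom concrete, here is a minimal sketch, assuming Python 3.7 or later, where PEP 479 is enabled by default: a StopIteration raised inside a generator body is converted into a RuntimeError, which a for loop or list() does not swallow. The names stops_with_raise, stops_with_return and collect are hypothetical helpers introduced only for this illustration.

```python
def stops_with_raise():
    yield 1
    # Under PEP 479 (default since Python 3.7) this StopIteration is
    # converted into a RuntimeError instead of ending the generator.
    raise StopIteration


def stops_with_return():
    yield 1
    # A plain return ends the generator; the consumer sees a normal
    # StopIteration raised by the generator machinery itself.
    return


def collect(gen_func):
    """Exhaust the generator, reporting a RuntimeError instead of crashing."""
    try:
        return list(gen_func())
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"


print(collect(stops_with_return))  # [1]
print(collect(stops_with_raise))   # RuntimeError: generator raised StopIteration
```

On older interpreters without `from __future__ import generator_stop`, the raise would have ended the generator silently, so the behaviour depends on the Python version in use.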
I would just like my generator of pairs to stop generating things once both input generators have been exhausted, without drama. What did I get wrong?
Edit
It seems that using return instead of raise StopIteration fixes my code nicely. I'm still interested in some explanations, though.
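For reference, a sketch of what the return-based version might look like. It is not a definitive implementation: the name generate_pairs_with_return is hypothetical, the iter() calls are an addition so that plain lists work too, and it also yields the pending value before draining the other generator in cases A and C, which by my reading the code above skips (for inputs such as [1, 2] and [1], the 2 would otherwise be lost).

```python
from itertools import repeat


def generate_pairs_with_return(g1, g2):
    # Accept any iterable, not only generators (an addition for convenience).
    g1, g2 = iter(g1), iter(g2)
    try:
        n1 = next(g1)
    except StopIteration:
        # A: g1 is empty, so everything left in g2 is unpaired.
        yield from zip(repeat(None), g2)
        return
    try:
        n2 = next(g2)
    except StopIteration:
        # A: g2 is empty; n1 was already consumed, so emit it first.
        yield (n1, None)
        yield from zip(g1, repeat(None))
        return
    while True:
        if n1 > n2:
            yield (None, n2)
            try:
                n2 = next(g2)
            except StopIteration:
                # B: g2 is exhausted; flush the pending n1, then drain g1.
                yield (n1, None)
                yield from zip(g1, repeat(None))
                return
        elif n1 < n2:
            yield (n1, None)
            try:
                n1 = next(g1)
            except StopIteration:
                # B: g1 is exhausted; flush the pending n2, then drain g2.
                yield (None, n2)
                yield from zip(repeat(None), g2)
                return
        else:
            yield (n1, n2)
            try:
                n1 = next(g1)
            except StopIteration:
                # C: g1 is exhausted; n2 was already paired, so just drain g2.
                yield from zip(repeat(None), g2)
                return
            try:
                n2 = next(g2)
            except StopIteration:
                # C: g2 is exhausted; the fresh n1 is still unpaired.
                yield (n1, None)
                yield from zip(g1, repeat(None))
                return


print(list(generate_pairs_with_return([1, 2, 3], [1, 2, 3])))
# [(1, 1), (2, 2), (3, 3)]
print(list(generate_pairs_with_return([1, 3], [1, 2])))
# [(1, 1), (None, 2), (3, None)]
print(list(generate_pairs_with_return([], [1, 2, 3])))
# [(None, 1), (None, 2), (None, 3)]
```

Each return sits right after the remaining input has been drained, so the while loop is never re-entered with stale n1 or n2 values.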