TL;DR: Is what I'm trying to do too complicated for a yield-based generator?
I have a Python application where I need to repeat an expensive test on a list of objects, one at a time, and then mangle those that pass. I expect several objects to pass, but I do not want to build a list of all those that pass, because mangle alters the state of some of the other objects. There is no requirement to test in any particular order. Then rinse and repeat until some stop condition is met.
My first simple implementation was this, which is logically correct:
while not stop_condition:
    for object in object_list:
        if test(object):
            mangle(object)
            break
    else:
        handle_no_tests_passed()
Unfortunately, the inner loop (for object in object_list) always restarts at the beginning of the list, where the objects probably haven't changed, while there are objects at the end of the list ready to test. Picking them at random would be slightly better, but I would rather carry on where I left off in the previous for/in call. I still want each for/in call to terminate once it has traversed the entire list.
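To make the restart behaviour concrete, here is a minimal sketch using a plain list of integers as a stand-in for the real objects (the names items and seen are illustrative only, not part of the application). Each new for/in over a list builds a fresh iterator, so breaking out and looping again starts from index 0:

```python
items = list(range(5))   # stand-in for object_list
seen = []                # records what each pass visits

for _ in range(2):       # two separate for/in calls
    for x in items:
        seen.append(x)
        if x >= 1:       # abandon this pass early
            break

print(seen)  # [0, 1, 0, 1]
```

The second pass visits 0 and 1 again rather than continuing from 2.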
This sounded like a job for yield, but I tied my brain in knots failing to make it do what I wanted. I can use it in the simple cases, iterating over a range or returning filtered records from some source, but I couldn't find out how to make it save state and restart reading from its source.
I can often do things the long, wordy way with classes, but fail to understand how to use the alleged simplifications like yield. Here is a class-based solution that does exactly what I want:
class CyclicSource:
    def __init__(self, source):
        self.source = source
        self.pointer = 0
    def __iter__(self):
        # reset how many we've done, but not where we are
        self.done_this_call = 0
        return self
    def __next__(self):
        # stop after one full pass; checking before indexing also copes with an empty source
        if self.done_this_call >= len(self.source):
            raise StopIteration
        ret_val = self.source[self.pointer]
        self.done_this_call += 1
        self.pointer += 1
        self.pointer %= len(self.source)
        return ret_val
source = list(range(5))
q = CyclicSource(source)
print('calling once, aborted early')
count = 0
for i in q:
    count += 1
    print(i)
    if count >= 2:
        break
else:
    print('ran off first for/in')
print('calling again')
for i in q:
    print(i)
else:
    print('ran off second for/in')
which demonstrates the desired behaviour:
calling once, aborted early
0
1
calling again
2
3
4
0
1
ran off second for/in
Finally, the question. Is it possible to do what I want with the simplified generator syntax using yield, or does maintaining state between successive for/in calls require the full class syntax?
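For comparison, one possible shape of a yield-based version is sketched below. It is not a confirmed answer, just an illustration under one assumption: that the position can live in a closure variable, with a fresh generator created for each for/in call. The names cyclic_source and one_pass are hypothetical.

```python
def cyclic_source(source):
    """Return a function that starts a new pass from the saved position."""
    pointer = 0  # survives between passes because it lives in the closure

    def one_pass():
        nonlocal pointer
        # yield at most len(source) items per pass
        for _ in range(len(source)):
            item = source[pointer]
            # advance before yielding, so an early break keeps the new position
            pointer = (pointer + 1) % len(source)
            yield item

    return one_pass


q = cyclic_source(list(range(5)))

print('calling once, aborted early')
for count, i in enumerate(q(), start=1):
    print(i)
    if count >= 2:
        break

print('calling again')
for i in q():
    print(i)
else:
    print('ran off second for/in')
```

Note the difference in use: each loop iterates over q() rather than q, because a generator object cannot be restarted once it is exhausted or abandoned. With the five-element source this prints 0 and 1 for the first loop, then 2, 3, 4, 0, 1 for the second.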