16

I have a generator function that goes something like this:

def mygenerator():
    next_value = compute_first_value() # Costly operation
    while next_value != terminating_value:
        yield next_value
        next_value = compute_next_value()

I would like the initialization step (before the while loop) to run as soon as the function is called, rather than only when the generator is first used. What is a good way to do this?

I want to do this because the generator will be running in a separate thread (or process, or whatever multiprocessing uses), and I won't be using the returned values for a short while. The initialization is somewhat costly, so I would like it to run while I'm getting ready to use the values.
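
For illustration, here is a minimal sketch of the lazy behavior described above. The helper `compute_first_value`, the `calls` list, and the terminating value `3` are stand-ins invented for the demo, not the real code; the point is only that nothing in the generator body runs until the first `next()`.

```python
calls = []  # records when the stand-in initialization actually runs

def compute_first_value():
    # Hypothetical stand-in for the costly initialization.
    calls.append("init")
    return 0

def mygenerator():
    next_value = compute_first_value()
    while next_value != 3:      # 3 plays the role of terminating_value
        yield next_value
        next_value += 1         # stand-in for compute_next_value()

gen = mygenerator()
called_after_creation = list(calls)    # still empty: the body has not started
first = next(gen)
called_after_first_next = list(calls)  # the initialization has now run
```

Calling `mygenerator()` only creates the generator object; the costly step is deferred until the first value is requested.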

Ryan C. Thompson

5 Answers

17

I needed something similar. This is what I landed on. Push the generator function into an inner function and return the result of calling it.

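def mygenerator():
    # Runs immediately when mygenerator() is called.
    next_value = compute_first_value()

    def generator():
        # next_value is rebound below, so it must be declared nonlocal
        # (Python 3); otherwise the loop raises UnboundLocalError.
        nonlocal next_value
        while next_value != terminating_value:
            yield next_value
            next_value = compute_next(next_value)

    return generator()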
Jason A
  • for what it's worth this pattern is used in stdlib a few times, see https://github.com/python/cpython/blob/d5650a1738fe34f6e1db4af5f4c4edb7cae90a36/Lib/concurrent/futures/_base.py#L591 for an example with the map function for thread pool executors – GPhys Sep 28 '21 at 21:03
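
A self-contained sketch of this inner-generator pattern, for anyone who wants to run it. The helpers `compute_first_value` and `compute_next`, the `events` list, and the value of `terminating_value` are all placeholders assumed for the demo. Because the inner function rebinds `next_value`, Python 3 needs the `nonlocal` declaration.

```python
events = []            # records when the stand-in initialization runs
terminating_value = 3  # assumed sentinel for the demo

def compute_first_value():
    # Hypothetical stand-in for the costly first computation.
    events.append("init")
    return 0

def compute_next(value):
    # Hypothetical stand-in for the update step.
    return value + 1

def mygenerator():
    next_value = compute_first_value()  # runs at call time, not at first next()

    def generator():
        nonlocal next_value  # the loop rebinds the enclosing variable
        while next_value != terminating_value:
            yield next_value
            next_value = compute_next(next_value)

    return generator()

gen = mygenerator()
ran_eagerly = (events == ["init"])  # True before any value is requested
values = list(gen)
```

Since `mygenerator` itself contains no `yield`, it is an ordinary function: its body executes on the call, and only the inner function is lazy.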
15
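class mygenerator(object):
    def __init__(self):
        # Runs eagerly, as soon as mygenerator() is called.
        self.next_value = self.compute_first_value()
    def __iter__(self):
        return self
    def __next__(self):
        if self.next_value == self.terminating_value:
            raise StopIteration()
        value = self.next_value
        # Advance the state so the following call returns a new value.
        self.next_value = self.compute_next_value()
        return value
    next = __next__  # Python 2 spelling of the same method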
Andrey Sboev
  • This is the "right" way. The others are hacks. They may be fun on their own to create and use, shorter to write, but hard to read and mind-boggling to maintain. – jsbueno Apr 20 '11 at 03:54
  • This is the right answer (making a generator class), but I think it's missing a few things. `next_value` and a few others should probably be instance variables (i.e. `self.next_value`), and the update step (`compute_next_value()`) is missing. – Ryan C. Thompson May 30 '18 at 16:09
9

You can create a "preprimed" iterator fairly easily by using itertools.chain:

from itertools import chain

def primed(iterable):
    """Preprimes an iterator so the first value is calculated immediately
       but not returned until the first iteration
    """
    itr = iter(iterable)
    try:
        first = next(itr)  # itr.next() in Python 2
    except StopIteration:
        return itr
    return chain([first], itr)

>>> def g():
...     for i in range(5):
...         print("Next called")
...         yield i
...
>>> x = primed(g())
Next called
>>> for i in x: print(i)
...
0
Next called
1
Next called
2
Next called
3
Next called
4
ncoghlan
5

I suppose you can yield None after that first statement is completed, then in your calling code:

gen = mygenerator()
next(gen) # toss the None
do_something(gen)
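
A minimal sketch of what the generator side might look like under this approach. The `events` list, the starting value `0`, and the terminating value `3` are assumptions made up for the demo.

```python
events = []  # records when the stand-in initialization runs

def mygenerator():
    events.append("init")  # stand-in for the costly initialization
    next_value = 0
    yield None             # placeholder: marks that initialization is done
    while next_value != 3: # 3 plays the role of terminating_value
        yield next_value
        next_value += 1

gen = mygenerator()
placeholder = next(gen)    # forces the initialization; toss the None
initialized = (events == ["init"])
values = list(gen)
```

The drawback is that every caller has to remember the extra `next(gen)`, and `None` cannot then be used as a real value without ambiguity.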
Steve Howard
0

For my use case I used a modified version of @ncoghlan's answer, wrapped in a factory function so it can decorate the generating function:

import collections.abc, functools, itertools

def primed_generator(generating_function):
    @functools.wraps(generating_function)
    def get_first_right_away_wrapper(*args,**kw):
        "call the generator function, prime before returning"
        gen = generating_function(*args,**kw)
        # collections.Iterator was removed in Python 3.10; use collections.abc
        assert isinstance(gen,collections.abc.Iterator)
        first_value = next(gen)
        return itertools.chain((first_value,),gen)
    return get_first_right_away_wrapper

Then just decorate the function:

@primed_generator
def mygenerator():
    next_value = compute_first_value() # Costly operation
    while next_value != terminating_value:
        yield next_value
        next_value = compute_next_value()

and the first value will be calculated immediately, and the result is transparent.
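
One caveat: if the decorated generator yields nothing at all, the `next(gen)` call raises `StopIteration` inside the wrapper. A possible variant that tolerates empty generators is sketched below; the names `primed_generator_safe`, `countdown`, and `events` are hypothetical and exist only for this example.

```python
import functools
import itertools

def primed_generator_safe(generating_function):
    """Like a priming decorator, but tolerates generators that yield nothing."""
    @functools.wraps(generating_function)
    def wrapper(*args, **kw):
        gen = generating_function(*args, **kw)
        try:
            first_value = next(gen)  # runs the body up to the first yield
        except StopIteration:
            return gen               # already exhausted; iterates as empty
        return itertools.chain((first_value,), gen)
    return wrapper

events = []  # records when the generator body starts running

@primed_generator_safe
def countdown(n):
    events.append("started")
    while n > 0:
        yield n
        n -= 1

primed = countdown(3)
started_eagerly = (events == ["started"])  # body ran before any iteration
values = list(primed)
empty = list(countdown(0))                 # no exception for an empty generator
```

Note that the returned object is an `itertools.chain`, not a generator, so generator-only methods such as `send()` are not available on it.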

Tadhg McDonald-Jensen