3

Please see the below snippet, run with Python 3.10:

from collections.abc import Generator

DUMP_DATA = 5, 6, 7

class DumpData(Exception):
    """Exception used to indicate to yield from DUMP_DATA."""

def sample_gen() -> Generator[int | None, int, None]:
    out_value: int | None = None
    while True:
        try:
            in_value = yield out_value
        except DumpData:
            yield len(DUMP_DATA)
            yield from DUMP_DATA
            out_value = None
            continue
        out_value = in_value

My question pertains to the DumpData path where there is a yield from. After that yield from, there needs to be a next(g) call, to bring the generator back to the main yield statement so we can send:

def main() -> None:
    g = sample_gen()
    next(g)  # Initialize
    assert g.send(1) == 1
    assert g.send(2) == 2

    # Okay let's dump the data
    num_data = g.throw(DumpData)
    data = tuple(next(g) for _ in range(num_data))
    assert data == DUMP_DATA

    # How can one avoid this `next` call, before it works again?
    next(g)
    assert g.send(3) == 3

How can this extra next call be avoided?

Intrastellar Explorer

3 Answers

3

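When you yield from a tuple directly, sample_gen delegates to the built-in tuple_iterator. After the last item has been yielded, the generator is still suspended inside the yield from, so one more resumption is needed for the iterator to terminate and for sample_gen to continue. tuple_iterator does not have a send method (unlike generators in general), and the yield from expression evaluates to None when it finishes.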
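One way to observe where the generator is suspended is the `gi_yieldfrom` attribute of generator objects, which refers to the iterator currently being delegated to (or None). The sketch below uses a stripped-down generator, `delegating_gen`, which is a hypothetical name and not part of the question's code:

```python
DUMP_DATA = 5, 6, 7


def delegating_gen():
    # Hypothetical, minimal stand-in for the DumpData branch of sample_gen.
    yield from DUMP_DATA
    yield "back in the outer generator"


g = delegating_gen()
items = [next(g) for _ in DUMP_DATA]

# All items were received, yet the generator is still parked inside `yield from`.
still_delegating = g.gi_yieldfrom is not None

# One more resumption exhausts the tuple iterator and leaves the delegation.
after = next(g)
left_delegation = g.gi_yieldfrom is None
```

So the "extra" `next` is the call that lets the tuple iterator signal exhaustion.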

The behavior:

yield from DUMP_DATA  # is equivalent to:
yield from tuple_iterator(DUMP_DATA)

def tuple_iterator(t):
    for item in t:
        yield item
    return None

You can implement tuple_iterator_generator, with usage:

try:
    in_value = yield out_value
except DumpData:
    yield len(DUMP_DATA)
    in_value = yield from tuple_iterator_generator(DUMP_DATA)
out_value = in_value

def tuple_iterator_generator(t):
    in_value = None
    for item in t:
        in_value = yield item
    return in_value

Or just not use yield from if you don't want that behavior:

try:
    in_value = yield out_value
except DumpData:
    yield len(DUMP_DATA)
    for out_value in DUMP_DATA:
        in_value = yield out_value
out_value = in_value

See https://docs.python.org/3/whatsnew/3.3.html#pep-380-syntax-for-delegating-to-a-subgenerator for a use case of that behavior.
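For reference, here is the plain-loop variant assembled into a complete script. This is only a sketch that combines the fragment above with the question's `main`; type annotations are omitted for brevity.

```python
DUMP_DATA = 5, 6, 7


class DumpData(Exception):
    """Exception used to indicate to yield from DUMP_DATA."""


def sample_gen():
    out_value = None
    while True:
        try:
            in_value = yield out_value
        except DumpData:
            yield len(DUMP_DATA)
            # Plain loop instead of `yield from`: the value sent after the
            # last item is captured in in_value rather than being lost.
            for out_value in DUMP_DATA:
                in_value = yield out_value
        out_value = in_value


def main():
    g = sample_gen()
    next(g)  # Initialize
    assert g.send(1) == 1
    assert g.send(2) == 2

    num_data = g.throw(DumpData)
    data = tuple(next(g) for _ in range(num_data))
    assert data == DUMP_DATA

    # No extra next(g) needed here.
    assert g.send(3) == 3
    return data


result = main()
```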

aaron
  • IMO this is wrong. The `None` value is not coming from some tuple iterator. It is the value assigned in the line `out_value = None` after the `yield from`. – VPfB Jul 01 '23 at 05:55
  • I made a test with a different value *before* leaving the comment. The answer by @GoodCoderBBoy states that too - independently from my test. – VPfB Jul 01 '23 at 19:05
  • @VPfB Thanks for the correction, the explanation was wrong and I have fixed that. The code correctly demonstrates the behavior though. – aaron Jul 02 '23 at 07:49
2

This is a lot of effort to go to, to remove a single line of code.

Note that this is a bad solution if DUMP_DATA is a large object or doesn't support slicing: the slice copies almost all of DUMP_DATA into memory before anything is yielded, which defeats the point of using a generator.

As stated in https://stackoverflow.com/a/26109157/15081390, yield from "establishes a transparent bidirectional connection between the caller and the sub-generator". Calling next(g) terminates this and allows the loop to continue. In doing so it receives (and discards) out_value, which is set to None after yield from DUMP_DATA. In the OP's code, we can substitute this call to next for the last one when defining the variable data:

data = tuple(next(g) for _ in range(num_data))
             ^^^^^^^
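To check that the value discarded by the extra `next(g)` really is `out_value`, one can assign a recognizable marker after the `yield from` instead of None. `SENTINEL` below is a hypothetical name introduced only for this test; the rest follows the OP's code.

```python
DUMP_DATA = 5, 6, 7
SENTINEL = -1  # hypothetical marker, replaces the OP's `out_value = None`


class DumpData(Exception):
    """Exception used to indicate to yield from DUMP_DATA."""


def sample_gen():
    out_value = None
    while True:
        try:
            in_value = yield out_value
        except DumpData:
            yield len(DUMP_DATA)
            yield from DUMP_DATA
            out_value = SENTINEL
            continue
        out_value = in_value


g = sample_gen()
next(g)  # Initialize
num_data = g.throw(DumpData)
data = tuple(next(g) for _ in range(num_data))

# The "extra" call: it finishes the yield from and returns out_value.
extra = next(g)
echoed = g.send(3)
```

Here `extra` is the marker, not something produced by the tuple iterator.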

All that is needed is to be able to detect the end of DUMP_DATA. If DUMP_DATA is a Sequence (supports subscripting), then we can use yield from DUMP_DATA[:i-1] to yield from all but the last element, which is then yielded normally (by assigning DUMP_DATA[-1] to out_value and re-entering sample_gen's normal loop). Thus, when the final line in main is called, the generator will respond normally.

from collections.abc import Generator

DUMP_DATA = 5, 6, 7

class DumpData(Exception):
    """Exception used to indicate to yield from DUMP_DATA."""

def sample_gen() -> Generator[int | None, int, None]:
    out_value: int | None = None
    while True:
        try:
            in_value = yield out_value
        except DumpData:
            # yield length, but not before storing in var i
            yield (i := len(DUMP_DATA))

            # if length is more than one item, then yield the first n - 1 elements
            if i > 1:
                yield from DUMP_DATA[:i-1]

            # in case DUMP_DATA is of length 0, don't try to yield it
            if i:
                out_value = DUMP_DATA[-1]

            continue

        out_value = in_value

def main() -> None:
    g = sample_gen()
    next(g)  # Initialize

    assert g.send(1) == 1
    assert g.send(2) == 2

    # Okay let's dump the data
    num_data = g.throw(DumpData)

    # the last call of next(g) exits the yield from state
    data = tuple(next(g) for _ in range(num_data))

    assert data == DUMP_DATA

    # no need to call next(g)
    assert g.send(3) == 3

if __name__ == "__main__":
    main() # executes fine
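If the copy made by slicing is a concern, a possible variation is to use `itertools.islice`, which iterates over the first n - 1 items lazily instead of building a new tuple. This is a sketch only: it still assumes DUMP_DATA supports `len()` and `[-1]`, and it assumes DUMP_DATA is non-empty.

```python
from itertools import islice

DUMP_DATA = 5, 6, 7


class DumpData(Exception):
    """Exception used to indicate to yield from DUMP_DATA."""


def sample_gen():
    out_value = None
    while True:
        try:
            in_value = yield out_value
        except DumpData:
            n = len(DUMP_DATA)
            yield n
            if n:
                # Lazily delegate to all but the last element (no copy).
                yield from islice(DUMP_DATA, n - 1)
                # The last element goes out through the main yield.
                out_value = DUMP_DATA[-1]
            continue
        out_value = in_value


g = sample_gen()
next(g)  # Initialize
first_echo = g.send(1)
num_data = g.throw(DumpData)
data = tuple(next(g) for _ in range(num_data))
after_dump = g.send(3)  # no extra next(g) in between
```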
GoodCoderBBoy
1

You have to wrap your inner iterable in a "plain" generator, which has a send method. At this level, that gives up the small optimizations of using yield from, since you are back to Python code iterating and yielding each value, but it is the only way to accept the value sent on the next iteration after the inner generator is exhausted.
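To see why a wrapper with `send` is needed: during a `yield from`, a non-None sent value is forwarded to the delegate's `send` method, and a tuple iterator has none. The following sketch (with `delegating_gen` as a hypothetical, minimal generator) shows the resulting failure as I understand the PEP 380 semantics:

```python
def delegating_gen():
    # Hypothetical minimal generator delegating straight to a tuple.
    yield from (5, 6, 7)


g = delegating_gen()
first = next(g)

try:
    # A non-None value is forwarded to the tuple iterator's `send`,
    # which does not exist.
    g.send("value")
except AttributeError as exc:
    error = exc
else:
    error = None

has_send = hasattr(iter((5, 6, 7)), "send")
```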

That said, it is straightforward:

...

def inner_gen(gen):
    incoming = None  # stays None if `gen` yields nothing
    for item in gen:
        incoming = yield item
    return incoming

def sample_gen() -> Generator[int | None, int, None]:
    out_value: int | None = None
    while True:
        try:
            in_value = yield out_value
        except DumpData:
            yield len(DUMP_DATA)
            out_value = yield from inner_gen(DUMP_DATA)
            continue
        out_value = in_value
...
def main() -> None:
    g = sample_gen()
    next(g)  # Initialize
    assert g.send(1) == 1
    assert g.send(2) == 2

    # Okay let's dump the data
    num_data = g.throw(DumpData)
    data = tuple(next(g) for _ in range(num_data))
    assert data == DUMP_DATA

    # This `send` value will be taken into the "inner_gen",
    # and used as the return value of the `yield from` expression.
    assert g.send(3) == 3

Of course, this only works because you know beforehand the number of elements the yield from will produce, and you call send after receiving the last value and before it stops with StopIteration. After a generator is exhausted, its return value (usually None) is already produced, and there is no way to pre-emptively "ask for a value" from the code driving a generator without yielding to it.
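The routing described above can be seen in isolation with a small sketch. The names `collecting_gen` and `outer` are hypothetical: values sent while the delegation is active reach the inner generator, and its return value becomes the value of the `yield from` expression.

```python
def collecting_gen(items):
    # Hypothetical inner generator: records every value sent to it.
    received = []
    for item in items:
        received.append((yield item))
    return received


def outer():
    # The inner generator's return value is the result of `yield from`.
    result = yield from collecting_gen("ab")
    yield result


g = outer()
first = next(g)      # starts both generators, yields the first item
second = g.send(10)  # 10 is received by the inner generator
final = g.send(20)   # inner generator finishes; outer yields its return value
```

The last `send` has to happen while the inner generator is still suspended on its final item; once it has returned, there is nothing left to receive the value.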

jsbueno