Update: I've started a thread on python-ideas to propose additional syntax or a stdlib function for this purpose (i.e. specifying the first value sent by yield from). So far 0 replies... :/


How do I intercept the first yielded value of a subgenerator but delegate the rest of the iteration to the latter using yield from?

For example, suppose we have an arbitrary bidirectional generator subgen, and we want to wrap this in another generator gen. The purpose of gen is to intercept the first yielded value of subgen and delegate the rest of the generation—including sent values, thrown exceptions, .close(), etc.—to the sub-generator.

The first thing that might come to mind could be this:

def gen():
    g = subgen()

    first = next(g)
    # do something with first...
    yield "intercepted"

    # delegate the rest
    yield from g

But this is wrong, because when the caller .sends something back to the generator after getting the first value, it will end up as the value of the yield "intercepted" expression, which is ignored, and instead g will receive None as the first .send value, as part of the semantics of yield from.
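A minimal sketch of this failure, assuming a hypothetical subgen that records every value it receives in a list (the log parameter and the yielded strings are made up for illustration):

```python
def subgen(log):
    # Hypothetical sub-generator: records everything it receives.
    received = yield "a"
    log.append(received)
    received = yield "b"
    log.append(received)


def gen(log):
    g = subgen(log)
    first = next(g)      # prime the sub-generator; first == "a"
    yield "intercepted"  # the caller's first send lands here and is dropped
    yield from g         # yield from begins with next(g), i.e. g.send(None)


log = []
outer = gen(log)
out1 = next(outer)      # "intercepted"
out2 = outer.send("x")  # "x" never reaches subgen
print(out1, out2, log)  # intercepted b [None]
```

The value "x" sent by the caller is lost, and subgen sees None in its place.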

So we might think to do this:

def gen():
    g = subgen()

    first = next(g)
    # do something with first...
    received = yield "intercepted"
    g.send(received)

    # delegate the rest
    yield from g

But this only moves the problem back by one step: as soon as we call g.send(received), the generator resumes and runs until it reaches the next yield expression, whose value becomes the return value of the .send call. That value is discarded here, so we would also have to intercept it and yield it ourselves, then forward whatever the caller sends in response, and so on for every subsequent step. So this won't do.
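To make the lost value visible, here is a small sketch with a hypothetical subgen that echoes back whatever it was last sent (the echo format is an assumption for illustration only):

```python
def subgen():
    # Hypothetical sub-generator: echoes back the value it was last sent.
    received = yield "a"
    received = yield f"got {received}"
    yield f"got {received}"


def gen():
    g = subgen()
    first = next(g)
    received = yield "intercepted"
    g.send(received)  # returns "got x", but nothing yields it to the caller
    yield from g      # starts with next(g), so subgen now receives None


outer = gen()
results = [next(outer), outer.send("x")]
print(results)  # ['intercepted', 'got None']
```

The caller never sees "got x": it is the return value of g.send(received) and is thrown away before yield from takes over.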

Basically, what I'm asking for is a yield from with a way to customize what the first value sent to the generator is:

def gen():
    g = subgen()

    first = next(g)
    # do something with first...
    received = yield "intercepted"

    # delegate the rest
    yield from g start with received  # pseudocode; not valid Python

...but without having to re-implement all of the semantics of yield from myself. That is, the laborious and poorly maintainable solution would be:

def adaptor(generator, init_send_value=None):
    send = init_send_value
    try:
        while True:
            send = yield generator.send(send)
    except StopIteration as e:
        return e.value

which is basically a bad re-implementation of yield from (it's missing handling of throw, close, etc.). Ideally I would like something more elegant and less redundant.
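For completeness, this is roughly how such an adaptor would be used in place of the pseudocode above. The running-total subgen is a hypothetical stand-in, and the sketch shares the limitations already mentioned (exceptions thrown into gen are not forwarded to the sub-generator):

```python
def adaptor(generator, init_send_value=None):
    # Same incomplete re-implementation as above: only send() is forwarded.
    send = init_send_value
    try:
        while True:
            send = yield generator.send(send)
    except StopIteration as e:
        return e.value


def subgen():
    # Hypothetical sub-generator: keeps a running total until it is sent None.
    total = 0
    received = yield "a"
    while received is not None:
        total += received
        received = yield total
    return total


def gen():
    g = subgen()
    first = next(g)
    received = yield "intercepted"
    # Delegate the rest, starting with the value the caller just sent.
    return (yield from adaptor(g, received))


outer = gen()
values = [next(outer), outer.send(5), outer.send(7)]
try:
    next(outer)  # sends None, which ends subgen
except StopIteration as stop:
    final = stop.value
print(values, final)  # ['intercepted', 5, 12] 12
```

Here the first sent value (5) does reach subgen, which is the behaviour the pseudocode asks for, at the cost of maintaining the forwarding loop by hand.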

Anakhand
  • Is `x` None after you do: `x = yield 42`? – Dani Mesejo Dec 19 '20 at 11:47
  • Not necessarily, `x` can be anything the caller sends in. Using Python 3.9 – Anakhand Dec 19 '20 at 11:47
  • What Python are you using? Also, how can x be anything the caller sends? – Dani Mesejo Dec 19 '20 at 11:48
  • I'm using Python 3.9. For example, if using `subgen` directly: `g = subgen(); v = next(g); v = g.send(123)`. In the last statement, we sent 123 to `subgen`, and so `x` was 123. Then the generator reached the next yield statement and yielded `x + 2`, i.e. `125`; so `v` is now `125`. Keep in mind that the first `send` is just to initialise the generator (i.e. its value doesn't appear anywhere in the generator) and must always be `.send(None)`, or the equivalent `next()`. – Anakhand Dec 19 '20 at 11:51
  • See [here](https://docs.python.org/3/reference/expressions.html#generator.send). "When send() is called to start the generator, it must be called with None as the argument, because there is no yield expression that could receive the value." But, after that, "the value argument becomes the result of the current yield expression." – Anakhand Dec 19 '20 at 11:59
  • I'm probably missing it, but why do you expect x to be 42 and not None after x = yield 42? Why not redo your subgen using the following pattern (separating lines with semicolons): x = 42; yield x; x += 2; yield x. – MjH Dec 23 '20 at 08:01
  • Adding to myself above. If you want to somehow alter sub-generator based on first value it produces, then use x = yield 42 only once and then yield x ongoing? – MjH Dec 23 '20 at 08:11
  • @MjH The `subgen` above is just a dummy example; in fact, since it's causing more confusion than anything, I'll remove it from the question. And I didn't say I expect x to be 42, quite the contrary: I don't have any expectations for its value, as it can be anything the caller sends with [`.send()`](https://docs.python.org/3/reference/expressions.html#generator.send). – Anakhand Dec 23 '20 at 11:22

2 Answers

If you're trying to implement this generator wrapper as a generator function using yield from, then your question basically boils down to whether it is possible to specify the first value sent to the "yielded from" generator. Which it is not.

If you look at the formal specification of the yield from expression in PEP 380, you can see why. The specification contains a (surprisingly complex) piece of sample code that behaves the same as a yield from expression. The first few lines are:

_i = iter(EXPR)
try:
    _y = next(_i)
except StopIteration as _e:
    _r = _e.value
else:
    ...

You can see that the first thing that is done to the iterator is to call next() on it, which is basically equivalent to .send(None). There is no way to skip that step and your generator will always receive another None whenever yield from is used.

The solution I've come up with is to implement the generator protocol using a class instead of a generator function:

class Intercept:
    def __init__(self, generator):
        self._generator = generator
        self._intercepted = False

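    def __iter__(self):
        # Iterators return themselves, so the wrapper also works in for loops.
        return self

    def __next__(self):
        return self.send(None)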

    def send(self, value):
        yielded_value = self._generator.send(value)

        # Intercept the first value yielded by the wrapped generator and 
        # replace it with a different value.
        if not self._intercepted:
            self._intercepted = True

            print(f'Intercepted value: {yielded_value}')

            yielded_value = 'intercepted'

        return yielded_value

    def throw(self, type, *args):
        return self._generator.throw(type, *args)

    def close(self):
        self._generator.close()

__next__(), send(), throw(), close() are described in the Python Reference Manual.

The class wraps the generator passed to it on creation and mimics its behavior. The only thing it changes is that the first value yielded by the generator is replaced by a different value before it is returned to the caller.

We can test the behavior with an example generator f() which yields two values and a function main() which sends values into the generator until the generator terminates:

def f():
    y = yield 'first'
    print(f'f(): {y}')

    y = yield 'second'
    print(f'f(): {y}')

def main():
    value_to_send = 0
    gen = f()

    try:
        x = gen.send(None)

        while True:
            print(f'main(): {x}')

            # Send incrementing integers to the generator.
            value_to_send += 1
            x = gen.send(value_to_send)
    except StopIteration:
        print('main(): StopIteration')

main()

When run, this example will produce the following output, showing which values arrive in the generator and which are returned by the generator:

main(): first
f(): 1
main(): second
f(): 2
main(): StopIteration

Wrapping the generator f() by changing the statement gen = f() to gen = Intercept(f()) produces the following output, showing that the first yielded value has been replaced:

Intercepted value: first
main(): intercepted
f(): 1
main(): second
f(): 2
main(): StopIteration

As all other calls to any of the generator API are forwarded directly to the wrapped generator, it should behave equivalently to the wrapped generator itself.
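A quick way to check that claim for throw() and close() is sketched below. The Intercept class is repeated (without the print) so the snippet runs on its own, and worker is a hypothetical generator that records what happens to it:

```python
class Intercept:
    # Same wrapper as above, minus the print, repeated for self-containment.
    def __init__(self, generator):
        self._generator = generator
        self._intercepted = False

    def __next__(self):
        return self.send(None)

    def send(self, value):
        yielded_value = self._generator.send(value)
        if not self._intercepted:
            self._intercepted = True
            yielded_value = 'intercepted'
        return yielded_value

    def throw(self, type, *args):
        return self._generator.throw(type, *args)

    def close(self):
        self._generator.close()


def worker(events):
    # Hypothetical generator: logs thrown exceptions and its own shutdown.
    try:
        while True:
            try:
                yield 'working'
            except ValueError:
                events.append('ValueError')
                yield 'recovered'
    finally:
        events.append('closed')


events = []
wrapped = Intercept(worker(events))
first = next(wrapped)                    # first value is replaced
after_throw = wrapped.throw(ValueError)  # exception reaches worker
wrapped.close()                          # GeneratorExit reaches worker
print(first, after_throw, events)  # intercepted recovered ['ValueError', 'closed']
```

Both the thrown ValueError and the close() call arrive inside worker, which is what a yield from delegation would do as well.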

Feuermurmel
  • "The specification contains a (surprisingly complex) piece of sample code"—yes, that's precisely why I didn't want to re-implement `yield from` :) I've realised that, as you say, there is currently no solution using `yield from`. And I think this approach is the best we can achieve without (re-)writing too much code. Thanks! – Anakhand Dec 24 '20 at 13:46

If I understand the question, I think this works? I ran this script and it did what I expected, which was to print all but the first line of the input file. As long as the object passed to skip_first can be iterated over, it should work.

def skip_first(thing):
    _first = True
    for _result in thing:
        if _first:
            _first = False
            continue
        yield _result

# Open the file in a with block so it is closed once iteration finishes.
with open("/var/tmp/test.txt") as inp:
    for line in skip_first(inp):
        print(line, end="")

cnamejj
  • This completely ignores `.send`, `.throw`, `.close`... [Python generators are more than a simple iterator](https://docs.python.org/3/reference/expressions.html#generator-iterator-methods). (And the intent is to intercept and transform the first yielded value, not to skip it entirely, but this is secondary.) – Anakhand Dec 24 '20 at 09:27
  • I use class based generators, so trying to squeeze it into a function was a challenge. :) – cnamejj Dec 24 '20 at 09:31