
After reading answer1 and answer2, the purpose of yield still looks unclear to me.


In this first case, with the below function,

def createGenerator():
    mylist = range(3)
    for i in mylist:
        yield i*i

On invoking createGenerator, as below,

myGenerator = createGenerator()

should return an object (like (x*x for x in range(3))) of type collections.abc.Generator, which is-a collections.abc.Iterator and a collections.abc.Iterable
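
For instance, that type relationship can be checked directly (a quick sketch; only the function above is assumed):

import collections.abc

myGenerator = createGenerator()
isinstance(myGenerator, collections.abc.Generator)  # True
isinstance(myGenerator, collections.abc.Iterator)   # True
isinstance(myGenerator, collections.abc.Iterable)   # True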

To iterate over the myGenerator object and get the first value (0),

next(myGenerator)

would actually make the for loop of the createGenerator function internally invoke __iter__(myGenerator), retrieve a collections.abc.Iterator type object (say obj), and then invoke __next__(obj) to get the first value (0), followed by the pause of the for loop at the yield keyword.


If this understanding (above) is correct, then wouldn't the below syntax (second case),

def createGenerator():
    return (x*x for x in range(3))
myGen = createGenerator()  # returns a collections.abc.Generator type object
next(myGen)  # next() must internally invoke __next__(__iter__(myGen)) to provide the first value (0), with no need to pause

suffice to serve the same purpose (above) and look more readable? Aren't both syntaxes memory efficient? If yes, then when should I use the yield keyword? Is there a case where yield is a must-use?

overexchange
  • Generator expressions are *comprehension constructs*. They are much more restrictive on the types of things you can do inside of them. For example, you can't have compound statements. There are ways around this, but I consider this analogous to "When should I use a for-loop" vs "when should I use a comprehension". Use the one that is more readable at the time, or the one that makes your life easier. – juanpa.arrivillaga Jun 15 '17 at 23:09
  • What happens when you don't have uniform data to return? Yes, when your generator is just a glorified wrapper around an iterator you don't need yield, but more often than not that's not the case. – zwer Jun 15 '17 at 23:11
  • No, not at all. I find the `yield` syntax *very* clear. – juanpa.arrivillaga Jun 15 '17 at 23:11
  • Try to make the thing returned by `createGenerator` accept new information each time `next` is called and you will then understand why yield exists. In the example you gave, you knew the things you wanted to have the generator spit out when you wrote the code, but sometimes (often) you need to be able to pass stuff into the generator, let it compute something, and yield that new computed thing. – DanielSank Jun 15 '17 at 23:12
  • Same reason we have `def` instead of trying to write all our functions with `lambda`. Same reason we don't create every list with a list comprehension. Genexps are syntactically very limited; they can't express much. – user2357112 Jun 15 '17 at 23:15
  • For example, perhaps your algorithm is better expressed using recursion. While there are hacky ways to accomplish this in a generator expression, they definitely are not what I would consider *readable*. – juanpa.arrivillaga Jun 15 '17 at 23:18
  • Recursion usually isn't a good option in Python. Use it only if the recursion depth is limited (as in the number of dimensions of an array). I think the built-in max depth is about 50. – hpaulj Jun 16 '17 at 00:07
  • @hpaulj certainly one should not write Python like Haskell or Scala, but that doesn't mean recursion doesn't have its critical use-cases where it is the most straightforward way to implement something. Also, if your algorithm is logarithmic, then you probably aren't going to reach the recursion limit (which defaults to 1000). Check out the example I just posted. This is a great case to use recursion. The algorithm is very clear, and if you have data structures nested anywhere near 1000 levels, you've got other problems... – juanpa.arrivillaga Jun 16 '17 at 00:12
  • @hpaulj So, check out the answers to [this question](https://stackoverflow.com/questions/10823877/what-is-the-fastest-way-to-flatten-arbitrarily-nested-lists-in-python). The accepted answer uses recursion, but you can always write a recursive implementation as an iterative one - just use your *own stack*! The next question demonstrates this. However, look how much more complicated it becomes. I'll take 5 lines over 16 any day when the trade-off is "you can go more than 1000 deep". – juanpa.arrivillaga Jun 16 '17 at 00:16

4 Answers


Try doing this without yield

def func():
    x = 1
    while 1:
        y = yield x
        x += y


f = func()
f.next()   # returns 1 (Python 2 syntax; in Python 3, use next(f))
f.send(3)  # returns 4
f.send(10) # returns 14

The generator has two important features:

  1. The generator has some state (the value of x). Because of this state, this generator could eventually return any number of results without using huge amounts of memory.

  2. Because of the state and the yield, we can provide the generator with information that it uses to compute its next output. That value is assigned to y when we call send.

I don't think this is possible without yield. That said, I'm pretty sure that anything you can do with a generator function can also be done with a class.

Here's an example of a class that does exactly the same thing (Python 2 syntax):

class MyGenerator(object):
    def __init__(self):
        self.x = 1

    def next(self):
        return self.x

    def send(self, y):
        self.x += y
        return self.next()

I didn't implement __iter__ but it's pretty obvious how that should work.
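
For completeness, a minimal sketch of that __iter__ (assuming, as is conventional, that the object serves as its own iterator):

    def __iter__(self):
        # an iterator conventionally returns itself,
        # so instances work directly in a for loop
        return self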

DanielSank
  • So, when you invoke `func()`, a `Generator` type object (say `obj`) gets returned. What does that object (`obj`) look like? On that object, if you run `next()`, then `__next__(__iter__(obj))` should be invoked internally by `while`. How do I visualise that `Generator` type object in your answer? In my query, I know my `Generator` type object is `(x*x for x in range(3))`, in both cases. Isn't it? – overexchange Jun 15 '17 at 23:28
  • @overexchange See the most recent edit for an example of a class that behaves exactly like the generator function. Perhaps this will give you a mental model of how to visualize the generator, as you requested. – DanielSank Jun 15 '17 at 23:46
  • This [answer](https://stackoverflow.com/a/237028/3317808) says each `yield` is replaced with a list `append()`. So, how would I write a function replacing the `yield` keyword? – overexchange Jun 17 '17 at 22:35
  • @overexchange The answer you linked is incomplete at best, and I'd say it's actually flat out wrong. That answer does not show how to get the effect of `send`ing into a generator. Please read my answer again. – DanielSank Jun 17 '17 at 22:39
  • For such [examples](https://pastebin.com/raw/YAx1qiLV), replacing `yield` with `append()` would probably make sense, where there is no chance to `send()`. – overexchange Jun 17 '17 at 23:06
  • @overexchange Even appending is very different from a generator. Suppose I want a generator that gives the sequence `1, 2, 3, ...` forever to infinity. Obviously, we cannot append all of the positive integers into a list because our computer does not have infinite memory. This is the simplest benefit of a generator: *we compute values as needed instead of trying to compute them all at once*. – DanielSank Jun 17 '17 at 23:18
  • I can handle all such cases using class syntax, as you also mentioned in the answer. So why would I need a generator function with the `yield` keyword? Is it a coding-style aspect? – overexchange Jun 18 '17 at 00:07
  • @overexchange look at the class and the function in my answer. The class is already twice as many lines as the function, and that's without implementing `__iter__`. So yes, I *think* the generator function is, in the end, just nice syntax. Note, however that 1) this nice syntax is *really* convenient, and 2) there *may* be cases where the `yield` keyword actually does something that can't be done with a class that I simply don't know about. – DanielSank Jun 18 '17 at 08:46
  • I read PEP 342, which says, *Coroutines are a natural way of expressing...*. I think it is not about whether we need the `yield` keyword, but about whether the code maps onto a mental model. – overexchange Jun 19 '17 at 23:04

Think of yield as a "lazy return". Note that in your second example the function still returns a "generator of values", not a fully evaluated list, because the generator expression is itself lazy; you would only get a fully evaluated list of values by using a list comprehension ([x*x for x in range(3)]) instead. Either may be perfectly acceptable depending on the use case. Yield is useful when processing large batches of streamed data, or when dealing with data that is not immediately available (think asynchronous operations).
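
A quick illustration of the difference (the function names here are just for this example):

def lazy_squares():
    return (x*x for x in range(3))  # generator expression: nothing computed yet

def eager_squares():
    return [x*x for x in range(3)]  # list comprehension: all values built up front

next(lazy_squares())  # 0, computed on demand
eager_squares()       # [0, 1, 4], computed immediately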

drkstr101

The generator function and the generator comprehension are basically the same - both produce generator objects:

In [540]: def createGenerator(n):
     ...:     mylist = range(n)
     ...:     for i in mylist:
     ...:         yield i*i
     ...:         
In [541]: g = createGenerator(3)
In [542]: g
Out[542]: <generator object createGenerator at 0xa6b2180c>

In [545]: gl = (i*i for i in range(3))
In [546]: gl
Out[546]: <generator object <genexpr> at 0xa6bbbd7c>

In [547]: list(g)
Out[547]: [0, 1, 4]
In [548]: list(gl)
Out[548]: [0, 1, 4]

Both g and gl have the same attributes; produce the same values; run out in the same way.

Just as with a list comprehension, there are things you can do in the explicit loop that you can't with the comprehension. But if the comprehension does the job, use it. Generators were added to Python sometime around version 2.2. Generator comprehensions are newer (and probably use the same underlying mechanism).
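
For example (a hypothetical sketch), error handling inside the loop is natural in a generator function but can't be written in a comprehension:

def safe_inverses(values):
    for v in values:
        try:
            yield 1/v
        except ZeroDivisionError:
            yield float('inf')

list(safe_inverses([2, 0, 4]))  # [0.5, inf, 0.25]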

In Py3, range (or Py2 xrange) produces values one at a time, as opposed to a whole list. It's a range object, not a generator, but works in much the same way. Py3 has extended this in other ways, such as dictionary keys and map. Sometimes that's a convenience; other times I forget to wrap them in list().
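
For instance:

r = range(3)             # a lazy range object, not a list
k = {'a': 1}.keys()      # a dict view, not a list (Py3)
m = map(str, range(3))   # a lazy map object (Py3)
list(m)                  # ['0', '1', '2'] - wrap in list() to see the values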


The yield can be more elaborate, allowing 'feedback' from the caller, e.g.

In [564]: def foo(n):
     ...:     i = 0
     ...:     while i<n:
     ...:         x = yield i*i
     ...:         if x is None:
     ...:             i += 1
     ...:         else:
     ...:             i = x
     ...:             

In [576]: f = foo(3)
In [577]: next(f)
Out[577]: 0
In [578]: f.send(-3)    # reset the counter
Out[578]: 9
In [579]: list(f)
Out[579]: [4, 1, 0, 1, 4]

The way I think of a generator operating is that creation initializes an object with code and an initial state. next() runs it up to the yield, and returns that value. The next next() lets it spin again until it hits a yield, and so on until it hits a stop-iteration condition. So it's a function that maintains internal state and can be called repeatedly, with next or for iteration. With send and yield from and so on, generators can be much more sophisticated.
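
The standard library can show those states directly (a small illustration; inspect.getgeneratorstate is available from Python 3.2):

import inspect

def squares(n):
    for i in range(n):
        yield i*i

g = squares(3)
inspect.getgeneratorstate(g)  # 'GEN_CREATED' - initialized, not yet run
next(g)                       # 0
inspect.getgeneratorstate(g)  # 'GEN_SUSPENDED' - paused at a yield
list(g)                       # [1, 4]
inspect.getgeneratorstate(g)  # 'GEN_CLOSED' - exhausted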

Normally a function runs until done, and returns. The next call to the function is independent of the first - unless you use globals or error-prone defaults.
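
A minimal contrast (the names are hypothetical):

def counter_fn():
    i = 0       # reset on every call
    i += 1
    return i

counter_fn(), counter_fn()  # (1, 1) - each call starts over

def counter_gen():
    i = 0
    while True:
        i += 1
        yield i

g = counter_gen()
next(g), next(g)            # (1, 2) - state persists between calls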


https://www.python.org/dev/peps/pep-0289/ is the PEP for generator expressions, from v 2.4.

This PEP introduces generator expressions as a high performance, memory efficient generalization of list comprehensions [1] and generators [2] .

https://www.python.org/dev/peps/pep-0255/ is the PEP for generators, from v 2.2.

hpaulj
  • I think it's important to note that a major feature associated with `yield` is that it brings data into the generator. This is in line with the distinction between an explicit loop and a comprehension, because an explicit loop can have a line that waits for I/O or whatever. – DanielSank Jun 15 '17 at 23:51
  • Even I have to look up the syntax for more elaborate uses of yield - such as my new example using `send` to reset the counter. – hpaulj Jun 16 '17 at 00:03

There already is a good answer about the capability to send data into a generator with yield. Regarding readability considerations: certainly, simple and straightforward transformations can be more readable as generator expressions:

(x + 1 for x in iterable if x%2 == 1)

Certain operations, however, are easier to read and understand using a full generator definition. Certain cases are a headache to fit into a generator expression; try the following:

>>> x = ['arbitrarily', ['nested', ['data'], 'can', [['be'], 'hard'], 'to'], 'reach']
>>> def flatten_list_of_list(lol):
...     for l in lol:
...         if isinstance(l, list):
...             yield from flatten_list_of_list(l)
...         else:
...             yield l
...
>>> list(flatten_list_of_list(x))
['arbitrarily', 'nested', 'data', 'can', 'be', 'hard', 'to', 'reach']

Sure, you might be able to hack up a solution that fits on a single line using lambdas to achieve recursion, but it will be an unreadable mess. Now imagine I had some arbitrarily nested data structure that involved lists and dicts, and I have logic to handle both cases... you get the point, I think.
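
For instance, a hypothetical extension of the generator above that also descends into dict values:

def flatten(obj):
    if isinstance(obj, dict):
        for v in obj.values():
            yield from flatten(v)
    elif isinstance(obj, list):
        for item in obj:
            yield from flatten(item)
    else:
        yield obj

Cramming that branching into a one-line generator expression would be far less readable than these few lines.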

juanpa.arrivillaga
  • `yield from` doesn't run on Py2. It bugs me that it was needlessly added to the Py3 version of `argparse` (replacing a `for i in x: yield i` with a `yield from x`). – hpaulj Jun 16 '17 at 00:49
  • @hpaulj I like the `yield from` syntax. I think it is very Pythonic. Although I learned to love Python on Py2, I've fully embraced Py3 by now. – juanpa.arrivillaga Jun 16 '17 at 08:48