I have read some questions and answers about the differences between iterators and generators, but I don't understand when you should choose one over the other. Do you know any examples (simple, real-life ones) where one is better than the other? Thank you.
-
What specifically do you not understand after reading the post? – jkd Mar 29 '15 at 01:25
-
I am new to Python. Most things are new to me. Simple examples for a noob are better than writing that 'Python’s generators provide a convenient way to implement the iterator protocol' or other technical jargon. – clappersturdy Mar 29 '15 at 01:27
-
Some knowledge of technical jargon is required; the more you know, the better you'll be able to find answers to questions and present questions to get more and better answers. To answer your question, check out [Vincent Driessen's](http://nvie.com/posts/iterators-vs-generators/) post on the iterator vs generator question. – LinkBerest Mar 29 '15 at 01:39
-
As I explained in that (top-voted but still unaccepted:-) answer nearly five years ago, all generators are iterators (though not vice versa) so your question is absolutely bereft of any sense -- like asking "when should I eat spaghetti rather than pasta" when spaghetti **are** an example (instance, special case of) pasta, or asking "should I buy a Labrador or a dog" (since Labradors **are** a breed of dogs), and so forth. – Alex Martelli Mar 29 '15 at 01:48
-
Brevity, genexp -> generator. Extensibility, flexibility -> iterator. – Shashank Mar 29 '15 at 01:48
-
@AlexMartelli I hope mine isn't too far off point; I didn't read the linked question/answer before starting my own here -- and should have. (Also, if I may, [this](http://stackoverflow.com/questions/29322237/are-unbound-descriptors-possible) seems like something you may know something about, if you're bored :) – jedwards Mar 29 '15 at 02:25
1 Answer
Iterators provide efficient ways of iterating over an existing data structure.
Generators provide efficient ways of generating elements of a sequence on the fly.
Iterator Example
Python's file readers can be used as iterators. So where you might write the following to process each line of a file:
with open('file.txt', 'rb') as fh:
    lines = fh.readlines()  # this reads the entire file into lines, now
    for line in lines:
        process(line)       # something to be done on each line
You can implement this more efficiently using an iterator:
with open('file.txt', 'rb') as fh:
    for line in fh:  # this will only read as much as needed, each time
        process(line)
The advantage of the second example is that you're not reading the entire file into memory and then iterating over a list of lines. Instead, the reader (a BufferedReader in Python 3) reads a line at a time, every time you ask for one.
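As a side note, here's a minimal sketch of the iterator protocol itself (the Countdown class is made up for illustration): any object whose class defines __iter__ and __next__ can be looped over with for, just like the file handle above.

class Countdown:
    """A made-up iterator that counts down from start to 1."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self               # an iterator returns itself from __iter__

    def __next__(self):
        if self.current <= 0:
            raise StopIteration   # tells the for loop to stop
        value = self.current
        self.current -= 1
        return value

for n in Countdown(3):
    print(n)   # prints 3, then 2, then 1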
Generator Example
Generators generate elements of a sequence on the fly. Consider the following:
def fib():
    idx = 0
    vals = [0, 1]
    while True:
        # If we need to compute a new value, do so on the fly
        if len(vals) <= idx: vals.append(vals[-1] + vals[-2])
        yield vals[idx]
        idx += 1
This is an example of a generator. In this case, every time it's "called" it produces the next number in the Fibonacci sequence.
I put "called" in scare quotes because the method of getting successive values from generators is different than a traditional function.
We have two main ways to get values from generators:
Iterating over it
# Print the Fibonacci sequence until some event occurs
for f in fib():
    print(f)
    if f > 100: break
Here we use the `in` syntax to iterate over the generator, and print the values that are returned, until we get a value that's greater than 100.
Output (one value per line):
0 1 1 2 3 5 8 13 21 34 55 89 144
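As an aside, `itertools.takewhile` can express the same idea without the manual break. Note that, unlike the loop above, it stops before yielding the first value over 100, so 144 would not be printed in this sketch:

from itertools import takewhile

# Lazily consume fib() while values are <= 100; stops before yielding 144
for f in takewhile(lambda x: x <= 100, fib()):
    print(f)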
Calling next()
We could also call `next` on the generator (since generators are iterators) to generate and access the values that way:
f = fib()
print(next(f)) # 0
print(next(f)) # 1
print(next(f)) # 1
print(next(f)) # 2
print(next(f)) # 3
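If the generator were finite, `next` would eventually raise StopIteration, which is the same signal a for loop uses to know when to stop. A rough sketch with a made-up two-value generator:

def first_two():
    # a made-up, finite generator for illustration
    yield 0
    yield 1

g = first_two()
print(next(g))          # 0
print(next(g))          # 1
print(next(g, 'done'))  # exhausted: the default 'done' is returned instead of a value
                        # without the default, next(g) would raise StopIteration --
                        # the same signal a for loop uses to stop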
There are more persuasive examples of generators, however, and these often come in the form of "generator expressions", a related concept (PEP 289).
Consider something like the following:
first = any((expensive_thing(i) for i in range(100)))
Here, we're creating a generator expression:
(expensive_thing(i) for i in range(100))
And passing it to the `any` built-in function. `any` will return `True` as soon as an element of the iterable is determined to be true. So when you pass a generator expression to `any`, it will only call `expensive_thing(i)` as many times as necessary to find a True-ish value.
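Relatedly, if you wanted the first `i` for which `expensive_thing(i)` is truthy, rather than just a yes/no answer, you could pass a generator expression to `next`, which is just as lazy. A small sketch, assuming the `expensive_thing` defined in the timing example below:

# First i for which expensive_thing(i) is truthy, or None if there isn't one.
# Like any(), this stops calling expensive_thing as soon as it finds a match.
first_index = next((i for i in range(100) if expensive_thing(i)), None)
print(first_index)   # 11, with the expensive_thing defined below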
Compare this with using a list comprehension passed to `any`:
first = any([expensive_thing(i) for i in range(100)])
In this case, `expensive_thing(i)` will be called for all values of `i` first; then the 100-element list of `True`/`False` values will be given to `any`, which will return `True` if it finds a True-ish value.
But if `expensive_thing(0)` returned `True`, clearly the better approach would be to evaluate only that, test it, and stop there. Generators allow you to do this, whereas something like a list comprehension does not.
Consider the following example, illustrating the advantage of using a generator expression over a list comprehension:
import time
def expensive_thing(n):
    time.sleep(0.1)
    return 10 < n < 20
# Find first True value, by using a generator expression
t0 = time.time()
print( any((expensive_thing(i) for i in range(100))) )
t1 = time.time()
td1 = t1-t0
# Find first True value, by using a list comprehension
t0 = time.time()
print( any([expensive_thing(i) for i in range(100)]) )
t1 = time.time()
td2 = t1-t0
print("TD 1:", td1) # TD 1: 1.213068962097168
print("TD 2:", td2) # TD 2: 10.000572204589844
The function `expensive_thing` introduces an artificial delay to illustrate the difference between the two approaches. The second (list comprehension) approach takes significantly longer, because `expensive_thing` is evaluated at all 100 indices, whereas the first only calls `expensive_thing` until it finds a True-ish value (at `i=11`).
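The same laziness helps with memory as well as time: a generator expression never builds the full list. A rough sketch (the exact byte counts vary by Python version and platform):

import sys

squares_list = [i * i for i in range(100000)]   # builds all 100000 values up front
squares_gen  = (i * i for i in range(100000))   # builds nothing until iterated

print(sys.getsizeof(squares_list))   # hundreds of kilobytes
print(sys.getsizeof(squares_gen))    # a couple hundred bytes, regardless of the range size

print(sum(squares_gen))              # still usable anywhere an iterable is expected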
-
The Python 3.x `range` and the Python 2.x `xrange` are not generators. They are lazy sequences. The definition of a generator is a function that produces an iterator. See https://docs.python.org/3/glossary.html#term-generator – Mar 29 '15 at 01:43
-
For the purposes of understanding the concept, they can 100% be thought of as generators, and since these built-ins are likely ones the OP is familiar with, I used them -- but I'll update my post with a proper generator. – jedwards Mar 29 '15 at 01:45