I expected that, with multiple loops, iterating over a list would be much faster than iterating over a generator, but my code suggests this is false.
My understanding is (by operation I mean any expression defining an element):
- a list requires n operations to be initialized
- but then every loop over the list is just grabbing an element from memory
- thus, m loops over a list require only n operations
- a generator does not require any operations to be initialized
- however, looping over a generator runs the operations on the fly
- thus, one loop over a generator requires n operations
- but m loops over a generator require n x m operations
And I checked my expectations using the following code:
from timeit import timeit
def pow2_list(n):
    """Return a list with powers of 2"""
    results = []
    for i in range(n):
        results.append(2**i)
    return results

def pow2_gen(n):
    """Generator of powers of 2"""
    for i in range(n):
        yield 2**i

def loop(iterator, n=1000):
    """Loop n times over iterable object"""
    for _ in range(n):
        for _ in iterator:
            pass
l = pow2_list(1000) # point to a list
g = pow2_gen(1000) # point to a generator
time_list = \
timeit("loop(l)", setup="from __main__ import loop, l", number=10)
time_gen = \
timeit("loop(g)", setup="from __main__ import loop, g", number=10)
print("Loops over list took: ", time_list)
print("Loops over generator took: ", time_gen)
And the results surprised me...
Loops over list took: 0.20484769299946493
Loops over generator took: 0.0019217690005461918
Somehow using a generator appears much faster than a list, even when looping over it 1000 times. And in this case we are talking about two orders of magnitude! Why?
EDIT:
Thanks for the answers. Now I see my mistake. I wrongly assumed that a generator starts from the beginning on each new loop, like a range does:
>>> x = range(10)
>>> sum(x)
45
>>> sum(x)
45
But this was naive (range is not a generator...).
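A quick demonstration of the actual behavior (using a generator expression for brevity):

```python
# A generator can be consumed only once; a second loop over it yields nothing.
g = (2**i for i in range(10))
print(sum(g))  # 1023
print(sum(g))  # 0 -- the generator is already exhausted
```

So in my benchmark, only the first of the 1000 passes over `g` did any work; the remaining 999 iterated over an empty iterator.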
Regarding the possible-duplicate comment: my problem concerned multiple loops over a generator, which is not explained in the other thread.
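For completeness, here is a sketch of how the timing could be done fairly: pass a factory so every pass gets a fresh iterable (the `make_iter` name and the smaller `m=100` are my own choices; absolute timings will vary by machine):

```python
from timeit import timeit

def pow2_gen(n):
    """Generator of powers of 2"""
    for i in range(n):
        yield 2**i

def loop(make_iter, m=100):
    """Loop m times, building a fresh iterable for every pass
    so a generator is never iterated after it is exhausted."""
    for _ in range(m):
        for _ in make_iter():
            pass

l = [2**i for i in range(1000)]

# Reuse the same list; build a new generator for each pass.
time_list = timeit(lambda: loop(lambda: l), number=10)
time_gen = timeit(lambda: loop(lambda: pow2_gen(1000)), number=10)
print("Loops over list took: ", time_list)
print("Loops over generator took:", time_gen)
```

With this setup I would expect the generator to come out slower, as the original reasoning predicted, since it recomputes `2**i` on every pass while the list computes it only once.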