I'm learning to use generators in Python 3, which I've also only just started learning, so I ran an experiment to see how much time generators really save. Below is my code.

import time


def fib(n):
    a = b = 1
    result = []
    for i in range(n):
        result.append(a)
        a, b = b, a + b
    return result


def fib_gen(n):
    a = b = 1
    for i in range(n):
        yield a
        a, b = b, a + b


def calc(n):
    start_time = time.perf_counter()
    fib(n)
    end_time = time.perf_counter() - start_time
    print(f'fib({n})',end_time, 'secs')

    start_time2 = time.perf_counter()
    fib_gen(n)
    end_time2 = time.perf_counter() - start_time2
    print(f'fib_gen({n})',end_time2, 'secs')
    print('Improvement %ge ',((end_time-end_time2)/end_time)*100)


for x in range(1, 10000, 1000):
    calc(x)

Output:

fib(1) 4.976999999999759e-06 secs
fib_gen(1) 3.7599999999984868e-06 secs
Improvement %ge  24.452481414533484
fib(1001) 0.000278241999999998 secs
fib_gen(1001) 1.8640000000007262e-06 secs
Improvement %ge  99.33007957102065
fib(2001) 0.0006537540000000029 secs
fib_gen(2001) 1.4999999999980307e-06 secs
Improvement %ge  99.77055589717263
fib(3001) 0.0009883400000000007 secs
fib_gen(3001) 1.7100000000019877e-06 secs
Improvement %ge  99.82698261731773
fib(4001) 0.001559401999999998 secs
fib_gen(4001) 2.5110000000001798e-06 secs
Improvement %ge  99.8389767359539
fib(5001) 0.001501413 secs
fib_gen(5001) 1.582999999999446e-06 secs
Improvement %ge  99.89456598550835
fib(6001) 0.0019986470000000027 secs
fib_gen(6001) 1.186999999999716e-06 secs
Improvement %ge  99.94060982254497
fib(7001) 0.002645030999999999 secs
fib_gen(7001) 1.3070000000024729e-06 secs
Improvement %ge  99.95058659047842
fib(8001) 0.0035235960000000004 secs
fib_gen(8001) 3.18499999999583e-06 secs
Improvement %ge  99.90960938768248
fib(9001) 0.004458613 secs
fib_gen(9001) 5.944000000000782e-06 secs
Improvement %ge  99.866684998227

What I find hard to believe is that the generator version comes out almost 99% better every time. How so? What do I need to do to grasp this better?

  • Calling `fib_gen(1001)` isn't actually doing anything. It's just producing a generator instance that won't do any work until you iterate over it. – Brad Solomon Oct 03 '19 at 22:39
  • Possible duplicate of [What does the "yield" keyword do?](https://stackoverflow.com/questions/231767/what-does-the-yield-keyword-do) – Brad Solomon Oct 03 '19 at 22:41
  • In short: `list(fib_gen(3001))` would be the fairer comparison, serving to isolate `list.append()` versus using the list constructor on a generator object (see the sketch after these comments). – Brad Solomon Oct 03 '19 at 22:42
  • In cases like this, I think the %-improvement metric is flawed. If you were to tell me process B has a 99.9% speed improvement over process A, I would imagine that B finishes in just under half the time of A. That isn't the case here, not even close. Instead, you should take (`no_gen_time / gen_time`) and use that as your improvement factor. For your last case (9001), I get ~750. Meaning the generator call is 750 times faster than the non-generator call. That is a **750x** improvement. (Brad and jirassimok already answered **why** the difference is so high :-) ) – SyntaxVoid Oct 03 '19 at 23:12
  • Possible duplicate of [Why are generators faster?](https://stackoverflow.com/questions/31766728/why-are-generators-faster) – juanpa.arrivillaga Oct 03 '19 at 23:24
  • Please **always use** the generic python tag for all python related questions, use version-specific tags at your discretion. – juanpa.arrivillaga Oct 03 '19 at 23:24
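
For reference, here is a minimal sketch of the fairer comparison suggested above (my illustration, not part of the original question): wrapping the generator in list() forces it to produce all of its values, so both timings measure the same amount of work. The fib and fib_gen definitions are copied from the question; the timing variable names are mine.

import time

def fib(n):
    a = b = 1
    result = []
    for i in range(n):
        result.append(a)
        a, b = b, a + b
    return result

def fib_gen(n):
    a = b = 1
    for i in range(n):
        yield a
        a, b = b, a + b

n = 3001

start = time.perf_counter()
fib(n)                    # builds the whole list via repeated .append()
list_time = time.perf_counter() - start

start = time.perf_counter()
list(fib_gen(n))          # list() actually consumes the generator
gen_time = time.perf_counter() - start

print(f'fib({n})', list_time, 'secs')
print(f'list(fib_gen({n}))', gen_time, 'secs')

With the generator actually consumed, the two timings should land in the same ballpark rather than differing by orders of magnitude.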

1 Answer


When you call a generator function, you aren't actually running the computations inside.

def generate_range(n):
    for i in range(n):
        yield i

print(generate_range(3)) # prints "<generator object generate_range at 0x123456789>"

To get the values from a generator, you need to either call next() on it or iterate over it:

gen = generate_range(3)

print(next(gen)) # prints 0
print(next(gen)) # prints 1
print(next(gen)) # prints 2
print(next(gen)) # raises StopIteration

for x in generate_range(3):
    print(x) # prints 0, then 1, then 2

From that last example, you can also see that a generator yields its values in sequence, rather than all at once. This allows generators to be used for lazy computation—they don't perform any computations until they are asked for their next result.

Here's an example that demonstrates how generators are lazy:

import time

# In the real world, this could be any slow operation,
# like getting something from the internet.
def get_number_slowly(n):
    time.sleep(n ** 2)
    return n

def get_with_function(numbers):
    results = []
    for number in numbers:
        results.append(get_number_slowly(number))
    return results

def get_with_generator(numbers):
    for number in numbers:
        yield get_number_slowly(number)


numbers = [3, 2, 0, 3, 5, 1018]

# Find the first number greater than four using each of the two methods

for number in get_with_generator(numbers):
    print("Checking", number, "> 4")
    if number > 4:
        print(number, "is greater than 4")
        break

for number in get_with_function(numbers):
    print("Checking", number, "> 4")
    if number > 4:
        print(number, "is greater than 4")
        break

The two functions produce the same output, but the generator prints one line at a time and waits a total of 47 seconds before printing "5 is greater than 4" and stopping. The get_with_function version waits almost 12 days before printing the same output, because it calls get_number_slowly for all of the numbers it is given before the loop even starts, so it has to wait 1018² seconds (over a million seconds) for the last number.

So if you want to do something in between different computations, or if you only want to do the computations for the data you need, generators can be very useful.
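
As a quick illustration of that last point (a sketch I'm adding, not part of the original answer), itertools.islice pulls just the values you need from a generator without ever computing the rest:

from itertools import islice

def fib_gen(n):  # the same generator as in the question
    a = b = 1
    for i in range(n):
        yield a
        a, b = b, a + b

# Only the first 5 Fibonacci numbers are ever computed, even though the
# generator could produce up to 1,000,000 of them if fully consumed.
first_five = list(islice(fib_gen(1_000_000), 5))
print(first_five)  # [1, 1, 2, 3, 5]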

jirassimok