Why is using a generator function twice as fast in this case?

Question

The code which is common for both the implementations:

from math import sqrt

def factors(x):
    num = 2
    sq = int(sqrt(x))
    for i in range(2, sq):
        if (x % i) == 0:
            num += 2
    return num + ((1 if sq == sqrt(x) else 2) if x % sq == 0 else 0)

1. Implementation which doesn't make use of a generator function:

i = 1
while True:
    if factors(i * (i+1) * 0.5) > 500:
        print(int(i * (i+1) * 0.5))
        break
    i += 1

2. Implementation which makes use of a generator function:

def triangle():
    i = 1
    while True:
        yield int(0.5 * i * (i + 1))
        i += 1

t = triangle()

while True:
    num = t.__next__()
    if factors(num) > 500:
        print(num)
        break

The Question:

The first implementation takes about 4 seconds while the second one takes approximately 8.2 seconds. Why is there such a big difference between the run times of the two implementations?

Does it matter that it's a generator function, or is this just another case of [Python code is faster when you put it in a function](http://stackoverflow.com/questions/11241523/why-does-python-code-run-faster-in-a-function)? — BrenBarn, Jun 22 '15 at 07:05
*The first implementation takes about 4 seconds while the second one takes approximately 8.2 seconds* - so is your post title correct, or are the order of your code blocks correct? — Jon Clements, Jun 22 '15 at 07:06
The implementation with generator is faster and takes around 4 seconds — Pruthvi Raj, Jun 22 '15 at 07:21
@SaratAdusumilli: My question isn't about which implementation makes conceptual sense. If you take your first implementation and put the whole `while True` thing into a function, is it faster? If so, the difference doesn't have to do with using a generator function, but just with using a function, as described in the post I linked to. — BrenBarn, Jun 22 '15 at 07:24
@BrenBarn I believe the only (good) way to write the above program using a function instead of a generator function would be by making use of a variable defined in the module. This would run slower than the one with a generator function. A generator function is perfect for this program. I don't think there would be much of a difference when you put the code in a function. (The difference is in milliseconds in the question you linked to.) — Sarat Adusumilli, Jun 22 '15 at 07:29
@SaratAdusumilli: No, the difference between function and non-function versions in that question is a few seconds. — BrenBarn, Jun 22 '15 at 07:34

Pruthvi Raj · Answer 1 · 2015-06-22T09:15:56.357

temp1():

def temp1():
        i = 1
        while True:
            if factors(i * (i+1) * 0.5) > 500:
                print(int(i * (i+1) * 0.5))
                break
            i += 1

temp2():

def temp2():
    def triangle():
        i = 1
        while True:
            yield int(0.5 * i * (i + 1))
            i += 1

    t = triangle()

    while True:
        num = t.next()
        if factors(num) > 500:
            print(num)
            break

cProfile for both:

enter image description here After changing the factors call in temp1() to factors(int(...)), It turns out that temp1() takes the similar time

Modified temp1 to pass int rather than float:

def temp1():
    i = 1
    while True:
        if factors(int(i * (i+1) * 0.5)) > 500:
            print(int(i * (i+1) * 0.5))
            break
        i += 1

enter image description here

So it turns out that in your first implementation you are passing float to the factors() and floating point arithmetic is complex than integer arithmetic

Why Floating point operations are complex??

Because the way floats are represented internally is different from ints, they are represented in 3 parts as sign , mantissa and exponent (IEEE 754) whereas representation of integer is much simple and so are operations like addition and subtraction on integers, even multiplication and division are performed using a combination of addition,subtraction and shift operations internally . since integer addition and subtraction are simple, so are their division/multiplications and hence floating point operations are some what expensive

Why Floating point modulo is expensive than Integer?

The answer is same as above , A modulo operation is nothing but combination of primitive operations mentioned above as follows:

a mod n = a - (n*int(a/n))

Since primitive operations for floats are more expensive, so is modulo for floats

Why is the modulo operation more expensive on float than on int? — Sarat Adusumilli, Jun 22 '15 at 07:55
@SaratAdusumilli , I've updated the answer stating why modulo operation on float is more expensive , check it out :) — Pruthvi Raj, Jun 22 '15 at 09:16

score 8 · Answer 2 · answered Jun 22 '15 at 07:30

8

In the explicit case you're not taking the int of the expression before calling factors and therefore the value passed will be a floating-point number.

In the generator case you're instead yielding int(...), calling factors passing an integer number.

answered Jun 22 '15 at 07:30

6502

112,025
15
165
265

1

Timed it, removing the cast from the generator makes the two implementations run in almost identical time. – samgak Jun 22 '15 at 07:32
@6502 - it would be great if you could explain why operations on float are more expensive from operations on int. – matino Jun 22 '15 at 07:40
@SaratAdusumilli it's most likely the modulo operation that occurs inside the loop and not the sqrt that is taking most of the extra time in the case of floats – samgak Jun 22 '15 at 07:42
@samgak You are right it is actually the modulo operation on float which is more expensive. But when it comes the sqrt float is actually faster than the int. – Sarat Adusumilli Jun 22 '15 at 07:54
@6502 Why is the modulo operation more expensive on float than on int? – Sarat Adusumilli Jun 22 '15 at 07:55
1

@SaratAdusumilli: In Python the modulo operator on floats provides a functionality that is not present in C. See http://stackoverflow.com/a/14763891/320726 – 6502 Jun 22 '15 at 07:59
@6502 Got it. Thanks for your answer. – Sarat Adusumilli Jun 22 '15 at 08:11

LittleQ · Answer 3 · 2015-06-22T07:23:53.310

3

You can remove factors() of the code, make 500 much larger.

# implementation 1
i = 1
while True:
    if (i * (i + 1) * 0.5) > n: # n=100000
        # print int(i * (i + 1) * 0.5),
        break
    i += 1

and %timeit, to compare with implementation 2:

def triangle():
    i = 1
    while True:
        yield i * (i + 1) * 0.5
        i += 1

t = triangle()

while True:
    num = t.next()
    if num > n:
        # print num,
        break

edited Jun 22 '15 at 07:23

answered Jun 22 '15 at 07:16

LittleQ

1,860
1
12
14

I don't think the generator function is what really matter – LittleQ Jun 22 '15 at 07:32

score 2 · Answer 4 · answered Jun 22 '15 at 07:10

2

The only difference as far as I know is that you do the calculation i * (i+1) * 0.5 twice in the first example. It is not a expensive computation, but it might make a big difference since it is such a big portion of the program..

answered Jun 22 '15 at 07:10

Binnut

90
7

The calculation occurs twice only in one case and it doesn't make much of a difference to the performance. – Sarat Adusumilli Jun 22 '15 at 07:15
True, I did not realize that there was a break there. And I did not take a close enough look at factors. My bad. – Binnut Jun 23 '15 at 09:29

Why is using a generator function twice as fast in this case?

The code which is common for both the implementations:

1. Implementation which doesn't make use of a generator function:

2. Implementation which makes use of a generator function:

The Question:

4 Answers4