25

Consider the following functions:

def fact1(n):
    if n < 2:
        return 1
    else:
        return n * fact1(n-1)

def fact2(n):
    if n < 2:
        return 1
    return n * fact2(n-1)

They should be equivalent. But there's a performance difference:

>>> T(lambda : fact1(1)).repeat(number=10000000)
[2.5754408836364746, 2.5710129737854004, 2.5678811073303223]
>>> T(lambda : fact2(1)).repeat(number=10000000)
[2.8432059288024902, 2.834425926208496, 2.8364310264587402]

The version without the else is 10% slower. This is pretty significant. Why?

Aillyn
  • 23,354
  • 24
  • 59
  • 84
  • Over how many tests did you do? – Martin Risell Lilja Nov 20 '11 at 18:38
  • It's not significant, unless it's a part of bigger whole and is actually identified as a bottleneck (yeah, right). 0.3 seconds difference is nothing by itself. – Cat Plus Plus Nov 20 '11 at 18:40
  • 4
    @M4tt4n erm... `.repeat(number=10000000)`. @Cat Plus Plus yeah, but finding out why things work the way they do is fun, right? – Gabi Purcaru Nov 20 '11 at 18:40
  • 4
    @GabiPurcaru: Not if it leads to focusing on micro-optimisations. It's unhealthy, and a complete waste of time. I really have no idea why people upvote questions like this. – Cat Plus Plus Nov 20 '11 at 18:43
  • @GabiPurcaru Haha, I'm not good with Python :P – Martin Risell Lilja Nov 20 '11 at 18:43
  • 13
    @GabiPurcaru I upvote these kind of questions because they reward people who truly want to understand what their code actually does. People who investigate timing differences and code generation often end-up with a deep understanding of the language. – Raymond Hettinger Nov 20 '11 at 19:33
  • +1 to @RaymondHettinger. Even after 2 decades of coding & scripting, I still enjoy mucking around, esp with new languages. Call me silly but sometimes we learn more from a child (ie: new programer in this case). – Alvin K. Nov 21 '11 at 00:22
  • 2
    @CatPlusPlus I really have no idea why people downvote questions like this. – Aillyn Nov 22 '11 at 17:31
  • @pessimopoppotamus: Because they have dubious value at best (the "not useful" part of the vote description). – Cat Plus Plus Nov 22 '11 at 17:33
  • possible duplicate of [Why is early return slower than else?](http://stackoverflow.com/questions/8271139/why-is-early-return-slower-than-else) – Duncan Nov 28 '11 at 08:59

3 Answers3

12

What is happening here is that fact2 has a hash conflict with __name__ in your module globals. That makes the lookup of the global fact2 ever so slightly slower.

>>> [(k, hash(k) % 32) for k in globals().keys() ]
[('__builtins__', 8), ('__package__', 15), ('fact2', 25), ('__name__', 25), ('fact1', 26), ('__doc__', 29)]

i.e. The same answer as for Why is early return slower than else? except that there the hash conflict was with __builtins__

Community
  • 1
  • 1
Duncan
  • 92,073
  • 11
  • 122
  • 156
11

For me, they are virtually the same speed: (Python 2.6.6 on Debian)

In [4]: %timeit fact1(1)
10000000 loops, best of 3: 151 ns per loop

In [5]: %timeit fact2(1)
10000000 loops, best of 3: 154 ns per loop

The byte code is also very similar:

In [6]: dis.dis(fact1)
  2           0 LOAD_FAST                0 (n)
              3 LOAD_CONST               1 (2)
              6 COMPARE_OP               0 (<)
              9 JUMP_IF_FALSE            5 (to 17)
             12 POP_TOP             

  3          13 LOAD_CONST               2 (1)
             16 RETURN_VALUE        
        >>   17 POP_TOP             

  5          18 LOAD_FAST                0 (n)
             21 LOAD_GLOBAL              0 (fact)
             24 LOAD_FAST                0 (n)
             27 LOAD_CONST               2 (1)
             30 BINARY_SUBTRACT     
             31 CALL_FUNCTION            1
             34 BINARY_MULTIPLY     
             35 RETURN_VALUE        
             36 LOAD_CONST               0 (None)
             39 RETURN_VALUE        

In [7]: dis.dis(fact2)
  2           0 LOAD_FAST                0 (n)
              3 LOAD_CONST               1 (2)
              6 COMPARE_OP               0 (<)
              9 JUMP_IF_FALSE            5 (to 17)
             12 POP_TOP             

  3          13 LOAD_CONST               2 (1)
             16 RETURN_VALUE        
        >>   17 POP_TOP             

  4          18 LOAD_FAST                0 (n)
             21 LOAD_GLOBAL              0 (fact)
             24 LOAD_FAST                0 (n)
             27 LOAD_CONST               2 (1)
             30 BINARY_SUBTRACT     
             31 CALL_FUNCTION            1
             34 BINARY_MULTIPLY     
             35 RETURN_VALUE        

The only difference is that the version with the else includes code to return None in case control reaches the end of the function body.

Sven Marnach
  • 574,206
  • 118
  • 941
  • 841
9

I question the timings. The two functions aren't recursing to themselves. fact1 and fact2 both call fact which isn't shown.

Once that is fixed, the disassembly (in both Py2.6 and Py2.7) shows that both are running the same op codes except for the name of the recursed into function. The choice of name trigger a small difference in timings because fact1 may insert in the module dictionary with no name collisions while *fact2) may have a hash value that collides with another name in the module.

In other words, any differences you see in timings are not due to the choice of whether the else-clause is present :-)

Raymond Hettinger
  • 216,523
  • 63
  • 388
  • 485
  • 3
    Not only that, isn't the argument 1 in every case? so it's just returning 1? – Mike Dunlavey Nov 20 '11 at 21:31
  • 1
    @pessimopoppotamus - Yes, a copy&paste error from the code I posted for you! :) It would have been gracious to inform me by adding a comment to [my answer](http://stackoverflow.com/q/8202827/146792), so that I didn't ask the same thing today! :) – mac Nov 25 '11 at 21:30