
This is mostly to make sure my methodology is correct, but my basic question is: is it worth checking, outside of a function, whether I need to call the function at all? I know, I know, premature optimization, but in many cases it's the difference between putting an if statement inside the function to decide whether to run the rest of the code, or putting it before the function call. In other words, it takes no more effort to do it one way than the other. Right now the checks are mixed between both, and I'd like to get it all nice and standardized.
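
For illustration, the two patterns I'm weighing look something like this (hypothetical names, just a sketch):

def process_checked(item, enabled):
    if not enabled:      # check inside the function: the call still happens
        return None
    return item * 2      # stand-in for the real work

def process(item):
    return item * 2      # same work, no check

def caller(item, enabled):
    if enabled:          # check outside: the call is skipped entirely
        return process(item)
    return None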

The main reason I'm asking is that the other answers I saw mostly referenced timeit, but that gave me negative numbers, so I switched to this:

import timeit
import cProfile

def aaaa(idd):
    return idd

def main():
    #start = timeit.timeit()
    for i in range(9999999):
        a = 5
    #end = timeit.timeit()
    #print("1", end - start)

def main2():
    #start = timeit.timeit()
    for i in range(9999999):
        aaaa(5)
    #end = timeit.timeit()
    #print("2", end - start)

cProfile.run('main()', sort='cumulative')
cProfile.run('main2()', sort='cumulative')

and got this output:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.310    0.310 {built-in method exec}
        1    0.000    0.000    0.310    0.310 <string>:1(<module>)
        1    0.310    0.310    0.310    0.310 test.py:7(main)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.044    2.044 {built-in method exec}
        1    0.000    0.000    2.044    2.044 <string>:1(<module>)
        1    1.522    1.522    2.044    2.044 test.py:14(main2)
  9999999    0.521    0.000    0.521    0.000 test.py:4(aaaa)

To me that shows that not calling the function takes 0.31 seconds, and calling it takes 1.52 seconds, which is almost 5 times slower. But like I said, I got negative numbers with timeit, so I want to make sure it's actually that slow.

Also, from what I gather, the reason function calls are so slow is that Python needs to look up whether the function still exists before it can run it, or something? Isn't there any way to just tell it to, like... assume that everything is still there, so that it doesn't have to do the unnecessary work that (apparently) slows it down 5x?

  • You are comparing apples and pears here. You are comparing assignment to a local variable to invoking a function, two very different things. – Martijn Pieters Feb 01 '13 at 14:29
  • And how did you call `timeit` to get negative numbers? – Martijn Pieters Feb 01 '13 at 14:30
  • I'd say 5x slower is actually *fast*. I mean, calling a function is **much** more work than a simple assignment. You have to 1) look up whether the function exists (that's a global lookup, which is slow), 2) pack the arguments, 3) call the function, which itself has to unpack the arguments, 4) execute the code, 5) return the object. (The fact that I used *5* steps is just chance; I don't think it has anything to do with the code being 5 times slower.) – Bakuriu Feb 01 '13 at 14:32
  • Calling `timeit.timeit()` is measuring the time it takes to execute nothing a million times. Subtracting `timeit.timeit()` from `timeit.timeit()` may be negative. – unutbu Feb 01 '13 at 14:34
  • Martijn: That was the point (unless I did it wrong): to compare doing something in a function vs doing it outside a function. I essentially want to know what the added overhead of the function call itself is. And I commented out my timeit method to show what I was doing. – DanielCardin Feb 01 '13 at 14:34
  • By the way: slow with respect to what? C, Java? Obviously in Python you cannot expect to bind function calls to the functions at compile time, since at runtime you can reassign them. It's simply impossible to obtain timings comparable with compiled languages, because compiled languages do this at compile time, losing the ability to redefine things at runtime. Also, if the function does any minimal work, the overhead of the call will be absolutely nothing with respect to the total time spent in the computation. – Bakuriu Feb 01 '13 at 14:37

1 Answer


You are comparing apples and pears here. One method does a simple assignment, the other calls a function. Yes, function calls add overhead.

You should strip this down to the bare minimum for timeit:

>>> import timeit
>>> timeit.timeit('a = 5')
0.03456282615661621
>>> timeit.timeit('foo()', 'def foo(): a = 5')
0.14389896392822266

Now all we did was add a function call (foo does the same thing), so we can measure the extra time a function call takes. You cannot state from this that function calls are nearly 4 times slower; rather, the function call adds about 0.11 seconds of overhead across 1,000,000 iterations.

If instead of a = 5 we do something that takes 0.5 seconds to execute one million times, moving it into a function won't make things take 2 seconds. It'll take about 0.61 seconds, because the function-call overhead doesn't grow.
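
If you want to verify that yourself, a minimal sketch along these lines should show the gap staying roughly constant (the sum(range(100)) workload is just an arbitrary stand-in; absolute numbers depend on your machine):

import timeit

# Time a heavier body inline vs. wrapped in a function; the difference
# between the two stays roughly the fixed per-call overhead.
inline = timeit.timeit('total = sum(range(100))')
wrapped = timeit.timeit('foo()', 'def foo(): total = sum(range(100))')
print(inline, wrapped, wrapped - inline)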

A function call needs to manipulate the stack, pushing the current frame onto it, creating a new frame, and then clearing it all up again when the function returns.

In other words, moving statements to a function adds a small overhead, and the more statements you move to that function, the smaller the overhead becomes as a percentage of the total work done. A function never makes those statements themselves slower.

A Python function is just an object stored in a variable; you can assign a function to a different variable, replace it with something completely different, or delete it at any time. When you invoke a function, you first reference the name under which it is stored (foo) and then invoke the function object ((arguments)); that lookup has to happen every single time in a dynamic language.
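
A quick demonstration of why that lookup can't be skipped (hypothetical names; any rebinding of foo is picked up by the very next call):

>>> def foo():
...     return 'original'
... 
>>> def bar():
...     return foo()    # the global name foo is looked up on every call
... 
>>> bar()
'original'
>>> foo = lambda: 'replaced'    # rebind the name at runtime
>>> bar()
'replaced'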

You can see this in the bytecode generated for a function:

>>> def foo():
...     pass
... 
>>> def bar():
...     return foo()
... 
>>> import dis
>>> dis.dis(bar)
  2           0 LOAD_GLOBAL              0 (foo)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        

The LOAD_GLOBAL opcode looks up the name (foo) in the global namespace (basically a hash table lookup), and pushes the result onto the stack. CALL_FUNCTION then invokes whatever is on the stack, replacing it with the return value. RETURN_VALUE returns from a function call, again taking whatever is topmost on the stack as the return value.
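
For comparison, plain assignment inside a function needs no namespace lookup at all; it compiles to a constant load and a store (output from a CPython of the same era; exact offsets and line numbers vary by version):

>>> def baz():
...     a = 5
... 
>>> dis.dis(baz)
  2           0 LOAD_CONST               1 (5)
              3 STORE_FAST               0 (a)
              6 LOAD_CONST               0 (None)
              9 RETURN_VALUE

That difference is exactly the gap the profiler numbers above are measuring.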

  • Though it can *appear* to make them slower, such as in a case where you are comparing assigning a new value to an existing variable over and over again vs creating a new variable in function scope and assigning to it over and over again. Yes, under the hood those aren't doing the same thing, but from a naive analysis of just the code, they look the same. – Silas Ray Feb 01 '13 at 14:46
  • Doesn't the manipulation of the stack etc. happen in all languages? That cost will happen regardless of whether or not you're in Python, correct? I was talking more about how I was under the impression that Python has to do extra work because Python is dynamic and functions can be created and disappear. Which led to my later question of whether or not you can force Python to assume that it is still there and not have to look it up every time (I really don't actually know much about this, so please inform me if I'm making things up!) – DanielCardin Feb 01 '13 at 14:49
  • @DanielCardin: Functions are just objects stored in variables too; they are looked up every time you access their name. You cannot work around that; that's the nature of a dynamic language. – Martijn Pieters Feb 01 '13 at 15:36