3

So I was reading through the itertools documentation when I came across this note at the bottom of the page:

Note, many of the above recipes can be optimized by replacing global lookups with local variables defined as default values. For example, the dotproduct recipe can be written as:

def dotproduct(vec1, vec2, sum=sum, imap=imap, mul=operator.mul):  
    return sum(imap(mul, vec1, vec2))

What do they mean by this? Am I to understand that adding functions as input variables makes the program run faster or more efficiently?

Could someone explain this to me like a 5 year old? :-) Thanks in advance.

arshajii
  • 127,459
  • 24
  • 238
  • 287
Noob Saibot
  • 4,573
  • 10
  • 36
  • 60
  • 1
    Like a 5 year old eh.. ok, if I say I am your friend Jason Sperske, then every time you need a friend for something you can just ask me. If instead I say you need you ask "Jason, from the Sperske family who can help you", you would have to go first to my family and ask for me then to me to ask for help. Assignments make all of the extra searching unnecessary :) By the way I love the Mortal Kombat reference in your screen name – Jason Sperske Jun 28 '13 at 20:15
  • Ah...it looks like this is a duplicate (even though my title is better...hehe). Should i close this? – Noob Saibot Jun 29 '13 at 02:04
  • @JasonSperske: That was very simple to understand. Thanks. – Noob Saibot Jun 29 '13 at 02:04

3 Answers3

3

One problem for Python efficiency is that the language is completely dynamic. For example consider the simple loop

def myfunc():
    for i in range(10):
        foo(bar(i))

seems that the function foo will be called with the result of calling bar ten times. However the function bar can for example change what foo is and the code of foo can in turn change what bar is. Python is thus forced to check at each iteration what foo is pointing to and what bar is pointing to. This requires looking in the module globals and, if nothing is found there in the builtin (predefined) names. At each of the 10 iterations.

The very same happens with all global lookups (and for example it's not forbidden for you to even define a function named len thus "hiding" the standard function with that name).

When using a local variable instead things are simpler

 def myfunc():
     f = foo
     b = bar
     for i in range(10):
         f(b(i))

the reason is that f and b are local variables so getting the value to be able to make the calls is much simpler. Code outside of myfunc cannot change what f and b are pointing to.

So one trick to get some speed is to write things like

def myfunc(x, sin=math.sin):
    ...

so that when using sin you don't need to look up math first and then sin inside math.

It's a kind of micro-optimization that is however considered bad style unless you really found (measured) the speed to be a problem, the fix has been measured to give a reasonable gain and that the slowness is however not serious enough to require some more radical approach.

6502
  • 112,025
  • 15
  • 165
  • 265
  • Some things you describe are orthogonal to locals lookup vs. globals lookup. `bar` can also change what a local variable `foo` of the caller refers to - or even if it can't, CPython doesn't optimize on that assumption. The only speed advantage of locals over globals is that they are stored in an array rather than in a hash table. –  Jun 28 '13 at 20:25
  • @delnan: Local variables are not searched at all, not even in an array. The bytecode generated directly gets the index of the slot where the data is, no lookup is necessary. With globals instead things need to be searched by name, potentially in two dictionaries. – 6502 Jun 28 '13 at 20:40
  • 1
    "the bytecode [...] get the index of the slot" is precisely what I was talking about. It's measurably cheaper than one or two dict lookups, but it's not free and it doesn't solve the problem that `foo` and `bar` are repeatedly loaded from the locals array onto the stack (via *several* bytecode instructions) and checked for being callable. –  Jun 28 '13 at 20:42
1

There are few benefits:

  1. The default arguments are calculated when the function definition is parsed. So, when the function is actually called no such calculation is required.

  2. During a function call when python sees that a variable was not found in local scope it looks for it in the globals first, if still not found then it goes to the built-ins. Ultimately it'll raise the error that variable not found if it's not found anywhere.

So operator.mul will require three calls: first search operator in local scope, then in global scope. Now as it is found in global scope now search operator module for mul. Declaring it like mul=operator.mul in function header will reduce these number of searches to 1.

Look at Byte code:

The local variables will re going to be loaded using LOAD_FAST.

While other variables like operator.itemgetter, list and list require more number of calls.

>>> def dotproduct(vec1, vec2, sum=sum, imap=imap, mul=operator.mul):  
        sum(imap(mul, vec1, vec2))
        list([operator.itemgetter(1) for x in lis])
...     
>>> dis.dis(dotproduct)
  2           0 LOAD_FAST                2 (sum)
              3 LOAD_FAST                3 (imap)
              6 LOAD_FAST                4 (mul)
              9 LOAD_FAST                0 (vec1)
             12 LOAD_FAST                1 (vec2)
             15 CALL_FUNCTION            3
             18 CALL_FUNCTION            1
             21 POP_TOP             

  3          22 LOAD_GLOBAL              0 (list)
             25 BUILD_LIST               0
             28 LOAD_GLOBAL              1 (lis)
             31 GET_ITER            
        >>   32 FOR_ITER                21 (to 56)
             35 STORE_FAST               5 (x)
             38 LOAD_GLOBAL              2 (operator)
             41 LOAD_ATTR                3 (itemgetter)
             44 LOAD_CONST               1 (1)
             47 CALL_FUNCTION            1
             50 LIST_APPEND              2
             53 JUMP_ABSOLUTE           32
        >>   56 CALL_FUNCTION            1
             59 POP_TOP             
             60 LOAD_CONST               0 (None)
             63 RETURN_VALUE  
Ashwini Chaudhary
  • 244,495
  • 58
  • 464
  • 504
  • Local variable access is direct, there is no search at all. The disassembled code shows a name for locals but the name is not used when running the code. Local variables are distinguished from global variables at compile time, it's not true that Python need first to check the local scope, then the global scope... – 6502 Jun 28 '13 at 20:38
0

Let me try.

I think This mean that when python will try to find variables (functions or other objects) it will use LEGB rule

please check this question for complete description of LEGB

But this trick allows python to find variables in closest scope local one (where arguments are situated)

Community
  • 1
  • 1
oleg
  • 4,082
  • 16
  • 16