2

Trying to understand this phenomenon in Python. When I wrap my code into a function, it speeds up by more than 30%!!! So far I failed to google out any reasonable explanation.

For example:

import sys
sys.stdin = open("dd.txt")
import cProfile
import time
t0 = time.time()


def main():
    from collections import defaultdict
    n, T = map(int, raw_input().split())
    tree = defaultdict(lambda: set())
    root = None
    for _ in xrange(n - 1):
        a, b = map(int, raw_input().split())
        tree[a].add(b)
        tree[b].add(a)
        if not root:
            root = a

    count = 0
    stack = [root]
    links = dict()
    links[root] = 0
    mem = dict()
    while stack:
        node = stack.pop()
        path = list()
        path.append(node)
        lnk = links[node]
        while lnk:
            if lnk not in mem:
                if abs(lnk - node) <= T:
                    count += 1
                path.append(lnk)
                lnk = links[lnk]
            else:
                path.extend(mem[lnk])
                for el in mem[lnk]:
                    if abs(el - node) <= T:
                        count += 1
                break
        #print node, path
        plen = len(path)
        mem[node] = path
        for next_node in tree[node]:
            if plen <= 1 or next_node != path[1]:
                links[next_node] = node
                stack.append(next_node)

    print count
main()

print time.time() - t0

This prints running time as 2.5 seconds, but this:

import sys
sys.stdin = open("dd.txt")
import cProfile
import time
t0 = time.time()


#def main():
from collections import defaultdict
n, T = map(int, raw_input().split())
tree = defaultdict(lambda: set())
root = None
for _ in xrange(n - 1):
    a, b = map(int, raw_input().split())
    tree[a].add(b)
    tree[b].add(a)
    if not root:
        root = a

count = 0
stack = [root]
links = dict()
links[root] = 0
mem = dict()
while stack:
    node = stack.pop()
    path = list()
    path.append(node)
    lnk = links[node]
    while lnk:
        if lnk not in mem:
            if abs(lnk - node) <= T:
                count += 1
            path.append(lnk)
            lnk = links[lnk]
        else:
            path.extend(mem[lnk])
            for el in mem[lnk]:
                if abs(el - node) <= T:
                    count += 1
            break
    #print node, path
    plen = len(path)
    mem[node] = path
    for next_node in tree[node]:
        if plen <= 1 or next_node != path[1]:
            links[next_node] = node
            stack.append(next_node)

print count
#main()

print time.time() - t0

Simply moving code out of main() function makes it run 3.5 seconds instead of 2.5.

What can be the reason for this???

jww
  • 97,681
  • 90
  • 411
  • 885
ssanin82
  • 23
  • 4
  • 3
    Maybe you typed for a second longer on your raw_input()? – Rob Feb 02 '15 at 08:06
  • Were these figures of 2.5 and 3.5 seconds averages of several trials? If not, then there's no indication of whether the code is the culprit or if, eg, the CPU was pegged in the middle of running the second block of code. If so, how many trials, and what was the standard deviation for each code block? –  Feb 02 '15 at 08:08
  • 1
    This is not a proper way to benchmark (you should run the same code in thousands of loops and calc the average, for once). Use a benchmark framework! – Nir Alfasi Feb 02 '15 at 08:12
  • raw_input() reads from a file, I do not type anything. Also, the reason I put everything in main() is I wanted to run cProfile.run("main()") on it - hence "import cProfile" above. – ssanin82 Feb 02 '15 at 10:46

1 Answers1

6

The difference is that Python uses different bytecode operations for accessing the local variables of a function and the global variables of a module. The LOAD_FAST opcodes used for accessing local variables takes a numeric index and performs a quick array lookup to retrieve the value. The LOAD_NAME and LOAD_GLOBAL opcodes used for accessing global variables take a name and perform a hash table lookup (possibly in multiple hash tables) to retrieve the value.

By wrapping your code in a function, you're effectively converting all of your variables from globals into locals, which enables much faster access to them.

yole
  • 92,896
  • 20
  • 260
  • 197