Explanation of cProfiler Output

Question

Both of the functions below perform the same set of fundamental operations get the binary repr of an integer, strip the first 2 characters, pad with zeroes then perform a slice from the right.

However, loop1 takes twice as long to run as loop2. Any insights into why that might be the case would be greatly appreciated.

def loop1(wires):

    pad = '0'*wires

    def _trim(m, sl):
        return m[sl:]
    def _pad(m):
        return pad+m


    for n in xrange(2**wires - 1):
        m = bin(n)
        m = _trim(m, 2)
        m = _pad(m)
        m = _trim(m, -4)



def loop2(wires):
    pad = '0'*wires
    for n in xrange(2**wires - 1):
        m = bin(n)
        m = (pad+m[2:])[-4:]



cProfile.run('loop1(24)')
cProfile.run('loop2(24)')



         67108863 function calls in 22.005 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   11.669   11.669   22.005   22.005 <module1>:78(loop1)
 33554430    3.834    0.000    3.834    0.000 <module1>:82(_trim)
 16777215    1.992    0.000    1.992    0.000 <module1>:84(_pad)
        1    0.000    0.000   22.005   22.005 <string>:1(<module>)
 16777215    4.510    0.000    4.510    0.000 {bin}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


         16777218 function calls in 9.482 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    5.160    5.160    9.482    9.482 <module1>:96(loop2)
        1    0.000    0.000    9.482    9.482 <string>:1(<module>)
 16777215    4.322    0.000    4.322    0.000 {bin}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

score 1 · Answer 1 · edited May 23 '17 at 12:21

1

What's called tottime is self or exclusive time, while what's called cumtime is inclusive time.

Add up the tottime column, and you get about 46, which is the cumtime for _run_inner_loop.

Notice that _run_inner_loop spends a lot of self time, over 24. I suspect that is spent in the for statement (because that's all there is).

If you try this you can nail that down.

edited May 23 '17 at 12:21

Community

1
1

answered May 21 '15 at 12:40

Mike Dunlavey

40,059
14
91
135

Thanks for the suggestions after playing with the code for a while it seems to be the number of operations inside the loop that are driving the performance differences. – mwoods May 21 '15 at 17:39

Explanation of cProfiler Output

1 Answers1