
I'm trying to use Python's `dis` module to experiment with and understand performance. Below is an experiment I tried, along with the results.

import dis

def myfunc1(dictionary):
    t = tuple(dictionary.items())
    return t

def myfunc2(dictionary, func=tuple):
    t = func(dictionary.items())
    return t

>>> dis.dis(myfunc1)

  4           0 LOAD_GLOBAL              0 (tuple)
              3 LOAD_FAST                0 (dictionary)
              6 LOAD_ATTR                1 (items)
              9 CALL_FUNCTION            0
             12 CALL_FUNCTION            1
             15 STORE_FAST               1 (t)

  5          18 LOAD_FAST                1 (t)
             21 RETURN_VALUE 

>>> dis.dis(myfunc2)

  4           0 LOAD_FAST                1 (func)
              3 LOAD_FAST                0 (dictionary)
              6 LOAD_ATTR                0 (items)
              9 CALL_FUNCTION            0
             12 CALL_FUNCTION            1
             15 STORE_FAST               2 (t)

  5          18 LOAD_FAST                2 (t)
             21 RETURN_VALUE    

Now, I understand that...

  • the 4 and 5 on the far left are the source line numbers
  • the middle column lists the opcodes executed by the Python virtual machine
  • the right-hand column shows the opargs, with the objects they refer to in parentheses(?)

...But what does all of this mean in terms of performance? If I were trying to decide which function to use, how would I use `dis` to compare the two?

Thanks in advance.

  • Don't use `dis.dis` for this. Use `timeit`. `dis` is useful for comparing bytecode differences between different code samples, which may give you an idea if one programming construct compiles to vastly more complex bytecode than another one. For real performance questions, you should profile, measuring execution times. – Tim Pietzcker Oct 11 '13 at 16:19
  • In general you can't. Bytecodes can take wildly different amounts of time to execute. For example, a call to a C function is a single bytecode but could take years to execute, while in other situations you may have tens of bytecodes that each take only a few nanoseconds. To compare performance, use a profiler. – Bakuriu Oct 11 '13 at 16:23
  • Thank you both. I don't even know if an "answer" is necessary now. :-) – Noob Saibot Oct 11 '13 at 16:27
  • In this specific case, since the thing being called is the same in both instances (and thus the `CALL_FUNCTION` time won't matter), we can say the second function executes faster. The only difference between the bytecodes is that `myfunc2` uses `LOAD_FAST` where `myfunc1` uses `LOAD_GLOBAL`, because the thing being loaded is a function argument in `myfunc2`. `LOAD_FAST` _is_ extremely fast, since it just reads a slot in the frame's array of local variables, whereas `LOAD_GLOBAL` performs a comparatively expensive dictionary lookup, so `myfunc2` will be (very slightly) faster. The same effect could also be achieved by making `func` a local variable instead of an argument. – l4mpi Oct 11 '13 at 17:57
  • @l4mpi I think your analysis is the kind of answer the OP was looking for. The only answer currently available tells the OP not to use `dis` for performance analysis, which is good general advice for a beginner, but not a good answer to this question. You might want to repost your comment as an answer. – user4815162342 Oct 11 '13 at 18:01
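
Following the suggestion in the comments above, here is a minimal `timeit` sketch for comparing the two functions head to head. The dictionary size and iteration count are arbitrary choices for illustration, and the absolute numbers will depend on your machine and Python version:

import timeit

setup = """
def myfunc1(dictionary):
    t = tuple(dictionary.items())
    return t

def myfunc2(dictionary, func=tuple):
    t = func(dictionary.items())
    return t

d = {i: str(i) for i in range(100)}  # throwaway test data
"""

# timeit.timeit runs the statement repeatedly and returns the total time in seconds
print(timeit.timeit("myfunc1(d)", setup=setup, number=1000000))
print(timeit.timeit("myfunc2(d)", setup=setup, number=1000000))

Expect the gap to be small: the `LOAD_GLOBAL` versus `LOAD_FAST` difference described above is real, but it will likely be dwarfed by the cost of building the tuple itself.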

1 Answer


You (or at least most regular people) can't just look at two disassembled bytecode listings and tell which one is faster.

Try the `%%timeit` magic function from IPython.

It will automatically run the piece of code several times, and give you an objective answer.
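
For instance, in an IPython session (assuming `myfunc1` and `myfunc2` are already defined, and using a throwaway dictionary just for illustration; for a one-line expression the line-magic form `%timeit` is enough, while `%%timeit` is the cell-level variant):

In [1]: d = {i: str(i) for i in range(100)}

In [2]: %timeit myfunc1(d)

In [3]: %timeit myfunc2(d)

IPython prints a timing summary for each statement, so you can compare the two numbers directly.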

I recently found this blog post that teaches how to measure these kinds of things in Python, covering not only time but memory usage too. The highlight of the post (for me, at least) is the part where it shows you how to set up the `%lprun` magic function.

Using it, you will be able to see your function line by line and know exactly how much each line contributes to the total time spent.
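
A rough sketch of that workflow, assuming the third-party `line_profiler` package is installed (it is what provides `%lprun` as an IPython extension):

In [1]: %load_ext line_profiler

In [2]: d = {i: str(i) for i in range(100)}

In [3]: %lprun -f myfunc1 myfunc1(d)

The report lists every line of `myfunc1` with its hit count, its time, and its share of the total, which makes it easy to see where the function actually spends its time.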

I've been using it for a few weeks now, and it's great.

– Lucas Ribeiro