Why does "from [Module] import [Something]" take more time than "import [Module]"?

Question

I used python -mtimeit to test, and found that from Module import Sth takes more time than import Module.

E.g.

$ python -mtimeit "import math; math.sqrt(4)"
1000000 loops, best of 3: 0.618 usec per loop
$ python -mtimeit "from math import sqrt; sqrt(4)"
1000000 loops, best of 3: 1.11 usec per loop

The same holds for other cases. Could someone please explain the reason behind this? Thank you!

Possibly take a look at this post - http://stackoverflow.com/a/3592137/1679863 — Rohit Jain, Aug 09 '13 at 19:10
I suggest not worrying about why this happens (past intellectual curiosity) and use whatever makes your code most readable. — roippi, Aug 09 '13 at 19:12
@RohitJain: I like the hypothesis. However, `python -mtimeit "from math import pow, sqrt, sin, fabs, ceil, floor, fmod"` only takes a fraction more than `python -mtimeit "from math import pow"`, which doesn't support the theory. — NPE, Aug 09 '13 at 19:13

abarnert · Answer 1 · 2013-08-09T19:36:28.183

There are two parts to this. The first step is to figure out which part accounts for the difference: the import statement, or the call.

So, let's do that:

$ python -mtimeit 'import math'
1000000 loops, best of 3: 0.555 usec per loop
$ python -mtimeit 'from math import sqrt'
1000000 loops, best of 3: 1.22 usec per loop
$ python -mtimeit -s 'from math import sqrt' 'sqrt(10)'
10000000 loops, best of 3: 0.0879 usec per loop
$ python -mtimeit -s 'import math' 'math.sqrt(10)'
10000000 loops, best of 3: 0.122 usec per loop

(That's with Apple CPython 2.7.2 64-bit on OS X 10.6.4 on my laptop. But python.org 3.4 dev on the same laptop and 3.3.1 on a linux box give roughly similar results. With PyPy, the smarter caching makes it impossible to test, since everything finishes in 1ns… Anyway, I think these results are probably about as portable as microbenchmarks ever can be.)

So it turns out that the plain import statement is more than twice as fast as the from … import version; after that, calling the function as math.sqrt is a little slower than calling sqrt directly, but not nearly enough to make up for the cheaper import. (Keep in mind that your test was doing an import for each call. In real-life code, of course, you tend to call things a lot more than once per import. So, we're really looking at an edge case that will rarely affect real code. But as long as you keep that in mind, we can proceed.)
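The same separation can be sketched with the timeit module from inside Python, assuming a Python 3 interpreter. The repetition count N is an arbitrary choice for this sketch, and the absolute numbers will differ from the ones quoted above:

```python
import timeit

# Arbitrary repetition count for this sketch; the CLI picks its own.
N = 100000

# Cost of the import statements alone. After the first iteration the module
# is already cached in sys.modules, so this measures the repeated-import path.
t_import = timeit.timeit("import math", number=N)
t_from = timeit.timeit("from math import sqrt", number=N)

# Cost of the calls alone: the import goes into `setup`, which plays the
# same role as -s on the command line and is not included in the timing.
t_call_attr = timeit.timeit("math.sqrt(10)", setup="import math", number=N)
t_call_name = timeit.timeit("sqrt(10)", setup="from math import sqrt", number=N)

results = [
    ("import math", t_import),
    ("from math import sqrt", t_from),
    ("math.sqrt(10)", t_call_attr),
    ("sqrt(10)", t_call_name),
]
for label, seconds in results:
    # Convert total seconds to microseconds per loop, like the CLI output.
    print("{:24s} {:.4f} usec per loop".format(label, seconds / N * 1e6))
```

Putting the import in setup is what keeps it out of the measured statement, so the last two lines report call time only.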

Conceptually, you can understand why the from … import statement takes longer: it has more work to do. The first version has to find the module, compile it if necessary, and execute it. The second version has to do all of that, and then also extract sqrt and insert it into your current module's globals. So, it has to be at least a little slower.
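As a rough illustration of that extra work, the two statements can be approximated by hand with the built-in __import__ function. This is only a sketch of the observable behavior, not what the interpreter literally executes, and the _module name is a temporary made up for the example:

```python
# Roughly what `import math` does: find and load the module (or fetch it
# from the sys.modules cache) and bind the module object to a single name.
math = __import__("math", globals(), locals(), [], 0)

# Roughly what `from math import sqrt` does: the same import, followed by an
# attribute lookup on the module and a binding of the result in globals.
_module = __import__("math", globals(), locals(), ["sqrt"], 0)
sqrt = _module.sqrt
del _module  # the real statement never binds the module itself to a name

print(sqrt is math.sqrt)
```

Both routes end up with the same function object; the second one simply performs an extra lookup and binding on the way.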

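If you look at the bytecode (e.g., by using the dis module and calling dis.dis('import math') in Python 3, or dis.dis(compile('import math', '<string>', 'exec')) in Python 2, where dis.dis treats a plain string as raw bytecode), this is exactly the difference. Compare: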

  0 LOAD_CONST               0 (0) 
  3 LOAD_CONST               1 (None) 
  6 IMPORT_NAME              0 (math) 
  9 STORE_NAME               0 (math) 
 12 LOAD_CONST               1 (None) 
 15 RETURN_VALUE

… to:

  0 LOAD_CONST               0 (0) 
  3 LOAD_CONST               1 (('sqrt',)) 
  6 IMPORT_NAME              0 (math) 
  9 IMPORT_FROM              1 (sqrt) 
 12 STORE_NAME               1 (sqrt) 
 15 POP_TOP              
 16 LOAD_CONST               2 (None) 
 19 RETURN_VALUE

The extra stack manipulation (the LOAD_CONST and POP_TOP) probably doesn't make much difference, and using a different argument to STORE_NAME is unlikely to matter at all… but the IMPORT_FROM is a significant extra step.
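For anyone who wants to reproduce listings like the two above, here is a minimal sketch using the dis module. It assumes Python 3.4 or later (for dis.get_instructions), and opnames is a helper name invented for this example. Offsets and some opcode names differ between CPython versions, so the output will not match the 2.7 listings line for line:

```python
import dis


def opnames(stmt):
    """Return the opcode names for one statement compiled as module-level code."""
    code = compile(stmt, "<string>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


for stmt in ("import math", "from math import sqrt"):
    print(stmt)
    # Full human-readable listing for this statement.
    dis.dis(compile(stmt, "<string>", "exec"))
    print()

plain = opnames("import math")
selective = opnames("from math import sqrt")

# Opcodes that appear only in the from-import version.
print("only in the from-import:", sorted(set(selective) - set(plain)))
```

Compiling with mode "exec" matters here: it produces module-level code, which is why the name is stored with STORE_NAME rather than a function-local opcode.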

Surprisingly, a quick-and-dirty attempt to profile the IMPORT_FROM code shows that the majority of the cost is actually looking up the appropriate globals to import into. I'm not sure why, but… that implies that importing a whole slew of names should be not much slower than importing just one. And, as NPE pointed out in a comment, that's exactly what you see. (But don't read too much into that. There are many reasons that IMPORT_FROM might have a large constant factor and only a small linear one, and we're not exactly throwing a huge number of names at it.)

One last thing: If this ever really does matter in real code, and you want to get the best of both worlds, import math; sqrt = math.sqrt is faster than from math import sqrt, but gives you the same small speedup to lookup/call time. (But again, I can't imagine any real code where this would matter. The only time you'll ever care how long sqrt takes is when you're calling it a billion times, at which point you won't care how long the import takes. Plus, if you really do need to optimize that, create a local scope and bind sqrt there to avoid the global lookup entirely.)
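The two bindings mentioned above might look like this in practice. It is only a sketch: sum_of_roots is a hypothetical function made up for the example, not something from the question.

```python
import math

# Module-level alias: a plain `import math` plus one assignment.
sqrt = math.sqrt


def sum_of_roots(values):
    """Sum the square roots of `values` using a function-local alias."""
    # Local binding: math.sqrt is looked up once, and the loop body then
    # reads a local variable instead of a global name plus an attribute.
    local_sqrt = math.sqrt
    total = 0.0
    for value in values:
        total += local_sqrt(value)
    return total


print(sum_of_roots([4, 9, 16]))
```

The local alias only pays off when the loop runs many times; for a handful of calls the difference is lost in the noise.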

For people like me, new to python bytecode, how did you get the instruction list? — seanmcl, Aug 09 '13 at 19:23
@SeanMcLaughlin: See the `dis` module. I'll update the answer with more details. — abarnert, Aug 09 '13 at 19:29

score 0 · Answer 2 · answered Aug 09 '13 at 19:16

This is not an answer, but some information. It needed formatting so I didn't include it as a comment. Here is the bytecode for 'from math import sqrt':

>>> from math import sqrt
>>> import dis
>>> def f(n): return sqrt(n)
... 
>>> dis.dis(f)
  1           0 LOAD_GLOBAL              0 (sqrt)
              3 LOAD_FAST                0 (n)
              6 CALL_FUNCTION            1
              9 RETURN_VALUE

And for 'import math':

>>> import math
>>> import dis
>>> def f(n): return math.sqrt(n)
... 
>>> dis.dis(f)
  1           0 LOAD_GLOBAL              0 (math)
              3 LOAD_ATTR                1 (sqrt)
              6 LOAD_FAST                0 (n)
              9 CALL_FUNCTION            1
             12 RETURN_VALUE

Interestingly, the faster method has one more instruction.

seanmcl


Is the method actually faster? I'm pretty sure that the _import statement_ is actually what's taking twice as long, not the function call. – abarnert Aug 09 '13 at 19:19
In fact, from a quick test, the actual function call is _slower_ using `math.sqrt`, 0.122us vs. 0.0879us. But the `import` statement is much faster using `import math`, 0.555us vs. 1.222us. So, this information is irrelevant. – abarnert Aug 09 '13 at 19:23
How did you separate the run time from the load time? – seanmcl Aug 09 '13 at 19:24
I explained that in my answer. To get just the import times, I called `timeit` with just the import statements. To get just the function call times, I passed the import statements as setup code instead of as part of the main code (`-s` from the command-line interface). – abarnert Aug 09 '13 at 19:31
