14

Referring to this Python List Comprehension Vs. Map question, can someone explain why List Comprehensions gives better results over map when list comprehension does not call a function, even when there is no lambda function in the map but gives the worst result when calling a function?

import timeit

print timeit.Timer('''[i**2 for i in xrange(100)]''').timeit(number = 100000)

print timeit.Timer('''map(lambda i: i**2, xrange(100))''').timeit(number = 100000)

print timeit.Timer(setup="""def my_pow(i):
    return i**2
""",stmt="""map(my_pow, xrange(100))""").timeit(number = 100000)

print timeit.Timer(setup="""def my_pow(i):
    return i**2
""",stmt='''[my_pow(i) for i in xrange(100)]''').timeit(number = 100000)

results:

1.03697046805 <-- list comprehension without function call
1.96599485313 <-- map with lambda function
1.92951520483 <-- map with function call
2.23419570042 <-- list comprehension with function call
Community
  • 1
  • 1
zenpoy
  • 19,490
  • 9
  • 60
  • 87
  • It doesn't matter whether the function called in `map` is a lambda or a regular function, the overhead is still there. No idea why a list comprehension *with* a function call would be slower than `map()` though. – millimoose Jul 23 '12 at 16:31
  • @millimoose but the lambda function gets declared for each itetaration, does this make any change? – zenpoy Jul 23 '12 at 16:34
  • 4
    @zenpoy: Function call arguments are evaluated before the function is called, so the function is declared only once. – Sven Marnach Jul 23 '12 at 16:38
  • cannot delete comments on my phine :( – Joel Cornett Jul 23 '12 at 16:43
  • 1
    @SvenMarnach i think he/she's talking about `my_pow` definition being interpreted only once for the whole timeit execution (in setup) and lambda being defined for each iteration. It's a valid question, and lambda probably contributes to it's version being slightly slower. – soulcheck Jul 23 '12 at 16:49
  • @soulcheck Well, it's easy to just do `Timeit('lambda i: i**2')…` and compare. On my machine, it amounts to 0.67% of the time the map takes to run, which indicates that parsing that little code is negligible overhead compared to the rest. – millimoose Jul 23 '12 at 17:41
  • @millimoose well one could say that difference between map-with-lambda and map-with-function-call is negligible in the OP's question too. – soulcheck Jul 23 '12 at 18:20

1 Answers1

15

All your timing results can be explained by theses facts:

  1. CPython has a rather high function call overhead.

  2. map(f, it) is slightly faster than [f(x) for x in it].

The first version of your code does not define a function at all, so there is no function call overhead. The second version needs to define a function, ,so there is function call overhead in every iteration. The third version is completely equivalent to the second one – functions and lambda expressions are the same thing. And the last version is even slower according to fact 2.

Sven Marnach
  • 574,206
  • 118
  • 941
  • 841
  • 3
    I can think of two possible reasons for fact (2): `map`'s loop takes place directly in C and so has slightly less overhead in the looping, and the reference to name `f` only needs to be resolved once for `map` but on each iteration for the list comprehension. The second feels more important to me, but any way to tell which one contributes more? – Danica Jul 23 '12 at 16:46
  • @Dougal: This is not the first time someone mention the loop taking place in C, but I never found a reference or something to substain that it should be faster. – Rik Poggi Jul 23 '12 at 17:11
  • @RikPoggi I tried checking the Python sources, and `map` in 3.x actually returns an iterator, and works on using the arguments' iterators, which makes that theory pretty unlikely. There might still be some savings from all of these being C-based, but there is no mythical "loop in C". It might be interesting (if not conclusive of anything) to compare the performance of 2.x and 3.x in this regard actually. – millimoose Jul 23 '12 at 17:51
  • @millimoose: Things like "the loop is entirely done in C" usually mean that no Python byte-code evaluation is necessary to perform the loop. – Sven Marnach Jul 23 '12 at 19:29
  • @SvenMarnach Fair enough, but would that distinction make a significant difference? The runtime still has to deal with all the overhead of manipulating the interpreter state; it doesn't seem to me that bytecode dispatch in and of itself would be the main contributor to the slowdown. – millimoose Jul 23 '12 at 20:45
  • @millimoose: It's hard to tell if this is a meaningful distinction. I don't think there are any general rules here. When performance really matters, you have to measure anyway in the end. – Sven Marnach Jul 23 '12 at 20:50
  • @SvenMarnach What I'm trying to say is more that it's a misleading phrasing that leads people to imagine what really goes on is that there's some piece of C code that goes more or less `for (int i = 0; i < len(it); i++) { f(it[i]); }` when that's not really the case. It'd be more accurate to say that what's fast is iterating over collections that have iterators implemented in C because you get to cut through a bunch of overhead that's not necessarily related to looping. (Like the function call you mention in your answer and name lookup that Dougal mentioned.) – millimoose Jul 23 '12 at 21:24
  • @millimoose: Well, I certainly don't have such misconceptions, and I think I agree with everything you said. I just wanted to clarify what people usually mean when saying "this loop is entirely in C code". (By "people" I mean people familiar with the CPython source code.) – Sven Marnach Jul 23 '12 at 22:09