6

Testing alternatives to for _ in range(n) (to execute some action n times, even if the action does not depend on the value of n) I noticed that there is another formulation of this pattern that is faster, for _ in [""] * n.

For example:

timeit('for _ in range(10^1000): pass', number=1000000)

returns 16.4 seconds;

whereas,

timeit('for _ in [""]*(10^1000): pass', number=1000000)

takes 10.7 seconds.

Why is [""] * 10^1000 so much faster than range(10^1000) in Python 3?

All testing done using Python 3.3

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
Dunedubby
  • 137
  • 1
  • 6
  • The latter is going to cause you problems. https://stackoverflow.com/questions/1605024/python-using-the-multiply-operator-to-create-copies-of-objects-in-lists – Cory Kramer May 22 '15 at 14:58
  • Are you actually using `x` or do you just need something to happen `n` times? – kylieCatt May 22 '15 at 14:59
  • 5
    Incidentally, it's idiomatic to use an underscore when you don't actually care about the value you're iterating: `for _ in range(n)` – Kevin May 22 '15 at 15:04
  • 5
    For what it's worth the `timeit('for x in range(10): pass', number=10000000)` results are `5.320733592294609` and for `timeit('for x in [0]*10: pass',number=10000000)` are `4.120525842738559` – kylieCatt May 22 '15 at 15:11
  • 1
    Let me make my point here: ``[x] * 5`` **does not** create a list containing 5 items of x but rather 5 copies of x. *By careful*. – James Mills May 22 '15 at 15:15
  • @JamesMills If you're thinking of lists or so, it's the other way around. It does *not* make copies. Also, "x" in his case is 0, so it's fine. – Stefan Pochmann May 22 '15 at 15:17
  • @IanAuld I just want something to happen n times. – Dunedubby May 22 '15 at 15:17
  • @StefanPochmann Yeah not "copies" but n references to teh same object. – James Mills May 22 '15 at 15:18
  • @IanAuld That's very interesting that it's faster, even in Python 3... I suppose my question is why don't people use it? – Dunedubby May 22 '15 at 15:22
  • @MartijnPieters I've edited the question. Is there any way to improve it so it might be reopened? Thanks! – Dunedubby May 22 '15 at 15:28
  • 1
    For me on python2.7 `timeit('for _ in xrange(100): pass', number=10000000)` gives `14.645585775375366`. `timeit('for _ in range(100): pass', number=10000000)` gives `17.79997205734253` and `timeit('for _ in [0]*100: pass',number=10000000)` gives `16.930885076522827`. Remember to use xrange in python2 – camz May 22 '15 at 15:32
  • @camz Hmm that's odd... on Python 3.3 `[0]*n` runs faster than `range`, which I understand to be `xrange` in Python 2 – Dunedubby May 22 '15 at 15:38
  • @Dunedubby - asking *why* `[0] * n` runs faster in Python 3 is (in my opinion) a good, focused question. Asking "what reasons (stylistic, speed, day-of-the-week) might one have to use `range` instead of `[0] * n`" is a bit too open-ended and opinion-based (again, in my opinion). – Sean Vieira May 22 '15 at 15:41
  • I noticed `['']*n ` runs super duper fast. – Dunedubby May 22 '15 at 15:47
  • @camz I'm really curious as to how `['']*n` compares on Python 2. – Dunedubby May 22 '15 at 15:48
  • 1
    `timeit('for _ in [0]*100: pass',number=1000000)` gives `1.6880948543548584` and `timeit('for _ in [None]*100: pass',number=1000000)` gives `1.6721088886260986` and `timeit('for _ in ['']*100: pass',number=1000000)` gives `0.13373517990112305` - wow! – camz May 22 '15 at 16:02
  • 2
    Another note: power operator in Python is `**` (`10**1000`); `^` is binary xor, `10^1000` yields `994`. – hamstergene May 22 '15 at 16:18
  • 2
    I'm just going to assume you didn't mean XOR when you used `10^1000` there (it equals `994`) and actually meant pow: `**` (which equals 1 with 1000 zeroes) – Navith May 22 '15 at 16:18
  • You still can’t use `10**1000`, though. It’s a little too big. – Ry- May 22 '15 at 16:22

2 Answers2

10

Your problem is that you are incorrectly feeding timeit.

You need to give timeit strings containing Python statements. If you do

stmt = 'for _ in ['']*100: pass'

Look at the value of stmt. The quote characters inside the square brackets match the string delimiters, so they are interpreted as string delimiters by Python. Since Python concatenates adjacent string literals, you'll see that what you really have is the same as 'for _ in [' + ']*100: pass', which gives you 'for _ in []*100: pass'.

So your "super-fast" loop is just looping over the empty list, not a list of 100 elements. Try your test with, for example,

stmt = 'for _ in [""]*100: pass'
John Y
  • 14,123
  • 2
  • 48
  • 72
  • It took me a few reads to understand that the actual problem was the quotes. Could you clarify that? Thanks! – Paco May 22 '15 at 16:18
  • At the risk of making this useful/good answer not make sense, maybe the question should be changed back, or amended to include both versions. The original question isnt really being answered any more, Why is `[0]*100` (or anything similar) faster than `range(100)`? – camz May 22 '15 at 16:25
  • @Dunedubby A small speed difference is reasonable as `range` has to create `n` `int` objects, whereas `iter([None] * n)` can loop using C-side integers. – Veedrac May 22 '15 at 16:25
  • @camz: because, as Veedrac states, no new integer objects have to be created (`range()` produces integers on demand). – Martijn Pieters May 22 '15 at 16:27
10

When iterating over range(), objects for all integers between 0 and n are produced; this takes a (small) amount of time, even with small integers having been cached.

The loop over [None] * n on the other hand produces n references to 1 object, and creating that list is a little faster.

However, the range() object uses far less memory, and is more readable to boot, which is why people prefer using that. Most code doesn't have to squeeze every last drop from the performance.

If you need to have that speed, you can use a custom iterable that takes no memory, using itertools.repeat() with a second argument:

from itertools import repeat

for _ in repeat(None, n):

As for your timing tests, there are some problems with those.

First of all, you made an error in your ['']*n timing loop; you did not embed two quotes, you concatenated two strings and produced an empty list:

>>> '['']*n'
'[]*n'
>>> []*100
[]

That's going to be unbeatable in an iteration, as you iterated 0 times.

You also didn't use large numbers; ^ is the binary XOR operator, not the power operator:

>>> 10^1000
994

which means your test missed out on how long it'll take to create a large list of empty values.

Using better numbers and None gives you:

>>> from timeit import timeit
>>> 10 ** 6
1000000
>>> timeit("for _ in range(10 ** 6): pass", number=100)
3.0651066239806823
>>> timeit("for _ in [None] * (10 ** 6): pass", number=100)
1.9346517859958112
>>> timeit("for _ in repeat(None, 10 ** 6): pass", 'from itertools import repeat', number=100)
1.4315521717071533
Community
  • 1
  • 1
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343