
Why is iterating over a list faster than iterating over an xrange object of the same length?

Input:

timeit.timeit('for _ in dummy: continue', setup='dummy=xrange(10000)', number=100000)
timeit.timeit('for _ in dummy: continue', setup='dummy = [0] * 10000', number=100000)

Output:

22.131134033203125
17.916101932525635

I assume the difference comes down to how much of the iteration work is executed natively in pre-compiled C code.

marcorossi
  • What about `for _ in [0]*10000`? – Wayne Werner Feb 25 '15 at 16:02
  • Wouldn't `dummy=xrange(10000)` vs `dummy=range(10000)` be a fairer comparison? Otherwise you are creating an `xrange` object N times, but only creating your list once. – khelwood Feb 25 '15 at 16:03
  • @khelwood, it would be. But in the end, `range(10000)` will be faster than `xrange(10000)`, at least in Python 2.7. – wflynny Feb 25 '15 at 16:13
  • @wflynny `xrange` creates an iterable, `range` creates a list (in memory); `xrange` is faster than `range` in Python 2.7 (`xrange` is removed in Python 3, where `range` now returns a lazy object). – no_name Feb 25 '15 at 16:14
  • @no_name Using Python 2.7, with `setup='dummy = xrange(10000)'` I get 0.797435998916626 s, and with `setup='dummy = range(10000)'` I get 0.6557648181915283 s. – wflynny Feb 25 '15 at 16:16
  • @khelwood I modified my test and the same results (even 1 second slower) are there. – marcorossi Feb 25 '15 at 16:16
  • @khelwood `xrange()` is just an iterable; it hardly takes nanoseconds to create an iterator or generator. OTOH, `range()` has to create a whole list in memory before any iteration actually starts. – Ashwini Chaudhary Feb 25 '15 at 16:39

4 Answers


When we loop over an already created list, we're actually iterating over its iterator and calling its next method each time, which simply returns the next item from the list according to the internal index (`it->it_index`) the iterator maintains, i.e. no new object is created here. On the other hand, with `range()` in Python 3 or `xrange()` in Python 2, Python has to create a new integer object during each iteration, which can be expensive:

>>> timeit.timeit('for _ in dummy: continue', setup='dummy = xrange(10**4)', number=100000)
8.74455213546753
>>> timeit.timeit('for _ in dummy: continue', setup='dummy = [0] * 10000', number=100000)
7.1642138957977295
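One way to see this directly is to check object identity (a quick sketch for CPython 2; note that integers up to 256 are interned, so the check uses a larger value):

>>> r = xrange(1000, 1001)
>>> next(iter(r)) is next(iter(r))  # xrange builds a fresh int object each time
False
>>> lst = [1000]
>>> next(iter(lst)) is next(iter(lst))  # the list hands back the very same object
True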

If instead of `xrange()` we use `itertools.repeat` with `None`, we get a slight improvement, because now we are not creating a new object during each iteration, simply repeating the same object:

>>> timeit.timeit('for _ in repeat(None, 10000): continue', setup='from itertools import repeat', number=100000)
6.986715793609619

From Raymond Hettinger's answer:

It is faster to use itertools.repeat(None, times) to control the number of loops (this avoids creating new, unused integer objects on every iteration).
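As a usage sketch, the "loop n times" idiom then looks like this (the loop body here is just a placeholder):

from itertools import repeat

for _ in repeat(None, 10000):
    pass  # do the actual work; no throwaway int object is built per pass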

Ashwini Chaudhary

I did it in Python 3, but the same results arose. I put the range creation in the setup for a more accurate comparison:

In [1]: timeit.timeit('for _ in a: continue', setup='a=list(range(10000))', number=10000)
Out[1]: 1.195666481000444

In [2]: timeit.timeit('for _ in a: continue', setup='a=range(10000)', number=10000)
Out[2]: 2.4083170039994

I think that the main difference is that `range` lazily generates the `_` value at each iteration, whereas with a list Python just has to read the values from memory. Compare to:

In [3]: timeit.timeit('for _ in range(10000): continue', number=10000)
Out[3]: 4.166428555001403

In [4]: timeit.timeit('for _ in list(range(10000)): continue', number=10000)
Out[4]: 5.800707030000922

where we take the time needed to create the objects into account, which shows the point of lazy evaluation: the lazy `range` is much cheaper to construct than the full list.
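The memory side of the trade-off is easy to check with `sys.getsizeof` (a sketch; exact sizes vary by CPython version and platform, but on a 64-bit Python 3 build the range object is a few dozen bytes while the list is on the order of 80 KB):

import sys

sys.getsizeof(range(10000))        # constant size: just start, stop, step
sys.getsizeof(list(range(10000)))  # stores all 10000 object references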

Evpok

If I had to guess, it's because comparisons are expensive operations. And if `xrange` looks anything like this:

def xrange(limit):
    counter = 0
    while counter < limit:  # one Python-level comparison per iteration
        yield counter
        counter += 1

Then you're talking about 10,000 comparisons, as opposed to iterating over the list, which only has to raise StopIteration at the end of the list.

But I'm not sure about the internals, so I could be wrong.

Wayne Werner
  • Indeed, @no_name, though that shouldn't make much of a difference (as you're just replacing 0 and 1 with `start` & `stop`). At least I would be massively surprised if it really did make that much difference. – Wayne Werner Feb 25 '15 at 16:10
  • @wflynny I tried with the setup code in both, with similar results: `xrange` (or `range` in py3) is marginally slower than iterating over the collection. – Wayne Werner Feb 25 '15 at 16:13
  • Also, the code that raises the StopIteration will have to check the length of the array, though. – marcorossi Feb 25 '15 at 16:13
  • Are you sure that it checks the length? I'm just reasoning; I don't know the source for sure, but if Python lists are implemented in C and are effectively a linked list, iteration over the list would be doing a comparison in C to check for `null` on `next`. That's a much *much* faster comparison than anything that will happen in Python. – Wayne Werner Feb 25 '15 at 16:17
  • Oh yes, I meant that the interpreter is doing it. But then we get to the point where most of the iteration is done by the interpreter in C, more than just the comparison. – marcorossi Feb 25 '15 at 16:18
  • `xrange` is also a native built-in, not Python code. I just checked. – marcorossi Feb 25 '15 at 16:23
  • @WayneWerner Python [lists are not linked lists](https://docs.python.org/2/faq/design.html#how-are-lists-implemented), they are just arrays. That's why indexing is so fast. And coming to comparisons, they are happening in the list's methods as well: https://hg.python.org/cpython/file/325aec842e3e/Objects/listobject.c#l2900 – Ashwini Chaudhary Feb 25 '15 at 17:14

As you are using xrange, I assume Python 2.x. The exact setup of the experiment is important.

Using dummy variable (xrange slower)

timeit.timeit('for _ in dummy: continue', setup='dummy = range(10000)', number=100000)

13.719306168122216

timeit.timeit('for _ in dummy: continue', setup='dummy = xrange(10000)', number=100000)

15.667266362411738

Without dummy variable (xrange faster)

However, if we take the dummy variable out of the picture:

timeit.timeit('for _ in range(10000): continue', number=100000)

20.79111238831547

timeit.timeit('for _ in xrange(10000): continue', number=100000)

15.494247599682467

Why?

Dummy variable difference

This indicates that an xrange object is cheap to set up, but slightly more expensive to iterate through. In the first instance, you set up the object only once, but iterate through it 100000 times. In the second, you set it up 100000 times, iterating through each object once. Interesting, as the documentation for xrange would have you believe the opposite (my emphasis):

Like range(), but instead of returning a list, returns an object that generates the numbers in the range on demand. For looping, this is *slightly faster* than range() and more memory efficient.
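You can isolate the construction cost from the iteration cost by timing the setup alone (a sketch; absolute numbers will differ on your machine):

timeit.timeit('xrange(10000)', number=100000)  # builds only the small xrange object
timeit.timeit('range(10000)', number=100000)   # builds the full 10000-element list each time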

Code difference

Looking at the C code where xrange is implemented, we see:

...

/*********************** Xrange Iterator **************************/

typedef struct {
    PyObject_HEAD
    long        index;
    long        start;
    long        step;
    long        len;
} rangeiterobject;

static PyObject *
rangeiter_next(rangeiterobject *r)
{
    if (r->index < r->len)
        return PyInt_FromLong(r->start + (r->index++) * r->step);
    return NULL;
}

...

So, each call to get the next number from xrange results in a comparison (`r->index < r->len`), as suggested by @WayneWerner as a reason for the slower iterating, followed by the construction of a brand-new int object via `PyInt_FromLong`.
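As a rough Python paraphrase of the C above (an illustration of the logic, not CPython source):

class rangeiter(object):
    def __init__(self, start, step, length):
        self.index, self.start, self.step, self.len = 0, start, step, length

    def next(self):
        # one comparison per call, then a brand-new int object is built
        if self.index < self.len:
            value = self.start + self.index * self.step
            self.index += 1
            return value
        raise StopIteration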

EDIT

Note: I use `range(10000)` to contrast with `xrange(10000)`; however, the results also hold using the OP's `[0] * 10000`.

J Richard Snape