(Update: Might only happen in CPython 3.8 32-bit for Windows, so don't be surprised if you can't reproduce it in other versions. See tables in the Update section.)
Both iter
and reversed
result in specialized iterators for lists:
>>> iter([1, 2, 3])
<list_iterator object at 0x031495C8>
>>> reversed([1, 2, 3])
<list_reverseiterator object at 0x03168310>
But the reversed
one is much slower:
> python -m timeit -s "a = list(range(1000))" "list(iter(a))"
50000 loops, best of 5: 5.76 usec per loop
> python -m timeit -s "a = list(range(1000))" "list(reversed(a))"
20000 loops, best of 5: 14.2 usec per loop
And I can consistently reproduce it. I later tried iter
five more times, took 5.98, 5.84, 5.85, 5.87, 5.86. Then reversed
five more times, took 14.3, 14.4, 14.4, 14.5, 14.3.
I thought maybe iter
benefits from increasing memory locations of the list's elements, so I tried reversing the list beforehand. Same picture:
> python -m timeit -s "a = list(range(1000)); a.reverse()" "list(iter(a))"
50000 loops, best of 5: 5.73 usec per loop
> python -m timeit -s "a = list(range(1000)); a.reverse()" "list(reversed(a))"
20000 loops, best of 5: 14.1 usec per loop
Same picture with sum
as well:
> python -m timeit -s "a = list(range(1000))" "sum(iter(a))"
20000 loops, best of 5: 10.7 usec per loop
> python -m timeit -s "a = list(range(1000))" "sum(reversed(a))"
10000 loops, best of 5: 20.9 usec per loop
And with identical elements, too:
> python -m timeit -s "a = [None] * 1000" "list(iter(a))"
50000 loops, best of 5: 6.35 usec per loop
> python -m timeit -s "a = [None] * 1000" "list(reversed(a))"
20000 loops, best of 5: 14.5 usec per loop
Why is the reverse iterator so much slower?
I'm using CPython 3.8.1 32 bit on Windows 10 pro 64 bit version 1903 with an Intel i5-7200U (it's a HUAWEI MateBook X). No special configuration, just a normal Python install on a normal Windows install.
Update: I ran a larger automated test with eight different Python versions (all freshly installed with default settings) on another machine (Pentium N3700, Windows 10 Pro 64-bit 1903). Times in usec per loop:
32-bit 64-bit
CPython iter reversed iter reversed
3.5.4 16.6 17.0 15.2 16.2
3.6.8 16.8 17.2 14.9 15.8
3.7.6 16.5 16.9 14.8 15.5
3.8.1 16.3 22.1 14.6 15.5
Two things to note:
- Python 3.8.1 32-bit
reversed
is the only one much slower. Might explain why almost nobody else could reproduce it. - In all seven other versions,
reversed
was a bit slower thaniter
. About 0.4 usec in 32-bit and about 0.9 usec in 64-bit.
I ran those 16 tests in Round-robin fashion for ten rounds, and each time shown above is the best of its ten source times. Each of the 160 source times was done like this:
python.exe -m timeit -r 5 -s "a = list(range(1000))" "list(iter(a))"
or
python.exe -m timeit -r 5 -s "a = list(range(1000))" "list(reversed(a))"
The times for each of the 16 tests were pretty consistent. Full table (note that the round-robin means I ran these column by column, not row by row):
3.5.4 32-bit iter [16.7, 16.6, 17.3, 16.6, 16.7, 16.6, 16.6, 16.6, 16.6, 16.7]
3.5.4 32-bit reversed [17.1, 17.1, 17.1, 17.2, 17.1, 17.1, 17.0, 17.1, 17.1, 17.1]
3.5.4 64-bit iter [15.2, 15.4, 15.4, 15.4, 15.4, 15.4, 15.4, 15.3, 15.4, 15.3]
3.5.4 64-bit reversed [16.8, 16.2, 16.3, 16.3, 16.2, 16.2, 16.2, 16.2, 16.2, 16.3]
3.6.8 32-bit iter [17.3, 16.9, 16.8, 16.9, 16.9, 16.8, 16.9, 16.9, 16.8, 16.8]
3.6.8 32-bit reversed [17.2, 17.2, 17.2, 17.3, 17.3, 17.3, 17.3, 17.2, 17.2, 17.2]
3.6.8 64-bit iter [15.0, 14.9, 15.9, 14.9, 14.9, 15.0, 14.9, 14.9, 14.9, 14.9]
3.6.8 64-bit reversed [15.8, 15.9, 16.4, 15.9, 15.9, 16.0, 15.8, 15.9, 15.9, 15.8]
3.7.6 32-bit iter [16.6, 17.2, 16.6, 16.5, 16.7, 16.7, 16.5, 16.5, 16.5, 16.7]
3.7.6 32-bit reversed [17.2, 17.6, 17.0, 17.0, 16.9, 17.2, 17.3, 17.0, 17.5, 17.0]
3.7.6 64-bit iter [14.8, 15.1, 14.9, 14.9, 14.8, 15.1, 14.9, 14.8, 15.0, 14.9]
3.7.6 64-bit reversed [16.0, 20.1, 15.7, 15.6, 15.6, 15.6, 15.7, 15.7, 15.8, 15.5]
3.8.1 32-bit iter [16.4, 16.6, 16.3, 16.4, 16.5, 16.4, 16.5, 16.4, 16.8, 16.4]
3.8.1 32-bit reversed [22.3, 22.4, 22.2, 22.3, 22.3, 22.3, 22.5, 22.4, 22.3, 22.1]
3.8.1 64-bit iter [14.6, 15.1, 14.6, 14.7, 14.7, 14.7, 14.7, 14.6, 14.6, 14.6]
3.8.1 64-bit reversed [15.5, 16.1, 15.5, 15.6, 15.5, 15.5, 15.5, 15.5, 15.5, 15.5]
The same test on a list with a million values (list(range(250)) * 4000
). Times are msec per loop:
32-bit 64-bit
CPython iter reversed iter reversed
3.5.4 19.8 19.9 22.4 22.7
3.6.8 19.8 19.9 22.3 22.6
3.7.6 19.9 19.9 22.3 22.5
3.8.1 19.8 24.9 22.4 22.6
The variation is even smaller, except reversed
on 3.8.1 32-bit is much slower again.
One more, just with CPython 3.8.0 instead of 3.8.1, where it also happens.
32-bit 64-bit
CPython iter reversed iter reversed
3.5.4 19.5 19.6 21.9 22.2
3.6.8 19.5 19.7 21.8 22.1
3.7.6 19.5 19.6 21.7 22.0
3.8.0 19.4 24.5 21.7 22.1