15

Quick question about the built in python list object. Say you have a list with the numbers 0 - 99. You are writing a program that takes the last item in the list and uses it for some other purpose. Is it more efficient to use list[-1] than to use list[99]? In other words, does python iterate through the whole list in either case?

Thanks for your help.

Madison May
  • 2,723
  • 3
  • 22
  • 32

7 Answers7

23

Python does not iterate through lists to find a particular index. Lists are arrays (of pointers to elements) in contiguous memory and so locating the desired element is always a simple multiplication and addition. If anything, list[-1] will be slightly slower because Python needs to add the negative index to the length to get the real index. (I doubt it is noticeably slower, however, because all that's done in C anyway.)

kindall
  • 178,883
  • 35
  • 278
  • 309
  • 5
    "If anything, list[-1] is slightly slower because Python needs to get the length of the list and add the negative index to it." That is true if the list is a known constant length. But normally you would not write `lst[99]` and would have to call `len()` anyway. – David Heffernan Jul 09 '12 at 17:38
  • 4
    @David - a list is always a known constant length - else it's not a list – Jon Clements Jul 09 '12 at 17:39
  • @DavidHeffernan: what would be an example of a list which doesn't have a known constant length? – DSM Jul 09 '12 at 17:40
  • 1
    It needs the length anyway to check bounds – Antoine Jul 09 '12 at 17:40
  • @Antoine has the right answer. So with the negative index you'd just have an extra addition. You also have a check for underflow in that case, but that's symmetrical to the bounds check for positive indices. – kindall Jul 09 '12 at 17:44
  • `isinstance(list(), collections.Sized) is True` - ergo `__len__` is implemented – Jon Clements Jul 09 '12 at 17:44
  • 6
    @JonClements Known by the programmer. Clearly the list object knows. But kindall is suggesting that `lst[-1]` would be slower, but you could only write `lst[99]` if you knew the list had 100 elements. – David Heffernan Jul 09 '12 at 17:46
  • Updated my answer to remove "get the length" since as Antoine points out, Python's going to do that anyway for bounds checking. – kindall Jul 09 '12 at 17:47
  • Thanks for the response, kindall. That cleared up quite a bit of confusion in one small paragraph. – Madison May Jul 09 '12 at 19:49
8

Why not test and see?

import timeit
t=timeit.timeit('mylist[99]',setup='mylist=list(range(100))',number=10000000)
print (t)
t=timeit.timeit('mylist[-1]',setup='mylist=list(range(100))',number=10000000)
print (t)

Of course, as you can see by running this a couple times, there's really no (noticable) difference for the reasons pointed out in the other answers.

mgilson
  • 300,191
  • 65
  • 633
  • 696
6

you can timeit:

>>> import timeit
>>> timeit.Timer('values[99]', 'values = range(100)').timeit(number = 10**7)
0.44513392448425293
>>> timeit.Timer('values[99]', 'values = range(100)').timeit(number = 10**7)
0.45273900032043457
>>> timeit.Timer('values[-1]', 'values = range(100)').timeit(number = 10**7)
0.44431495666503906
>>> timeit.Timer('values[-1]', 'values = range(100)').timeit(number = 10**7)
0.44684290885925293
>>> timeit.Timer('values[-1]', 'values = range(100)').timeit(number = 10**7)
0.44867610931396484
>>> timeit.Timer('values[-1]', 'values = range(100)').timeit(number = 10**8)
4.4455509185791016
>>> timeit.Timer('values[99]', 'values = range(100)').timeit(number = 10**8)
4.4184651374816895
>>> timeit.Timer('values[99]', 'values = range(100)').timeit(number = 10**8)
4.4276700019836426
>>> timeit.Timer('values[-1]', 'values = range(100)').timeit(number = 10**8)
4.4026989936828613
>>> timeit.Timer('values[-1]', 'values = range(100)').timeit(number = 10**8)
4.4386618137359619
>>> timeit.Timer('values[99]', 'values = range(100)').timeit(number = 10**8)
4.3991479873657227
>>> 

theres really no difference, though if you truly want the last item values[-1] seems be the easiest/safest approach being that it will always grab the last item regardless of the length of the list, as long as its not an empty list,

that would throw an exception:

>>> [][-1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> 

In other words, does python iterate through the whole list in either case?

In neither case would python iterate through the list.

I was actually curious to see if python did anything differently, so I disassemble the code:

>>> import dis
>>> def test_0():
...     values = range(100)
...     return values[99]
...
>>> def test_1():
...     values = range(100)
...     return values[-1]
...
>>> dis.dis(test_0)
2           
    0 LOAD_GLOBAL              0 (range)
    3 LOAD_CONST               1 (100)
    6 CALL_FUNCTION            1
    9 STORE_FAST               0 (values)

3          
    12 LOAD_FAST                0 (values)
    15 LOAD_CONST               2 (99)
    18 BINARY_SUBSCR
    19 RETURN_VALUE
>>> dis.dis(test_1)
2           
    0 LOAD_GLOBAL              0 (range)
    3 LOAD_CONST               1 (100)
    6 CALL_FUNCTION            1
    9 STORE_FAST               0 (values)

3          
    12 LOAD_FAST                0 (values)
    15 LOAD_CONST               2 (-1)
    18 BINARY_SUBSCR
    19 RETURN_VALUE
>>>

and well it looks like at the instruction level, its pretty much identical, you would need to go into the CPython implementation, to see exactly whats going on when dealing with negative indices, but I think most of the other answers have already hinted at it.

$ python --version
Python 2.6.1 

out of curiosity I dugged even deeper and found this:

on python 2.7.1, but it should be the same on most python 2.*

./Python/ceval.c:

case BINARY_SUBSCR:
    w = POP();
    v = TOP();
    if (PyList_CheckExact(v) && PyInt_CheckExact(w)) {
        /* INLINE: list[int] */
        Py_ssize_t i = PyInt_AsSsize_t(w);
        if (i < 0)
            i += PyList_GET_SIZE(v);
        if (i >= 0 && i < PyList_GET_SIZE(v)) {
            x = PyList_GET_ITEM(v, i);
            Py_INCREF(x);
        }
        else
            goto slow_get;
    }
    else
      slow_get:
        x = PyObject_GetItem(v, w);
    Py_DECREF(v);
    Py_DECREF(w);
    SET_TOP(x);
    if (x != NULL) continue;
    break;

note the if (i < 0) i += PyList_GET_SIZE(v); so basically yes theres a slight constant overhead when dealing with negative indices.

and in case your curious ./Include/listobject.h: #define PyList_GET_ITEM(op, i) (((PyListObject *)(op))->ob_item[i]) so its basically a look up ;)

though again the difference is tiny and if your goal is to state you want the last value then this values[-1] is much more pythonic/clearer of making this intent, values[99] simply means get the 99th value value if the programmer doesn't know it has 100 values then he doesn't know its the last value.

Samy Vilar
  • 10,800
  • 2
  • 39
  • 34
  • 1
    *“as long as its not an empty list”* — to be fair: `lst[99]` would throw that error too :D – poke Jul 09 '12 at 18:26
  • 1
    @poke lol yes, but Im hoping the OP is aware of it :) I was just being explicit though maybe a bit to much ... – Samy Vilar Jul 09 '12 at 19:35
  • Thanks for the detailed response, samy.vilar :) That some neat insight into what's going on behind the scenes. I'll have to remember the dis module for later use. – Madison May Jul 09 '12 at 19:52
4

It doesn't iterate in either case. list[-1] is essentially identical to list[len(list) - 1]. A list is backed by an array, so lookups are constant time.

Laurence Gonsalves
  • 137,896
  • 35
  • 246
  • 299
3

List indexing in python is always of O(1).

For further details on time complexity follow this link

Somesh
  • 1,235
  • 2
  • 13
  • 29
1

A simple timeit test result in almost equal times, with the negative index being slightly slower

lis=list(xrange(10000000))
def f1():
    a,b,c,d,e=lis[-1],lis[-2],lis[-3],lis[-4],lis[-5]    

def f2():
    a,b,c,d,e=lis[0],lis[1],lis[2],lis[3],lis[4]

if __name__=="__main__":
    from timeit import Timer
    t = Timer("f1()", "from __main__ import f1")
    print t.timeit()
    t = Timer("f2()", "from __main__ import f2")
    print t.timeit()

output:

0.878027235305
0.845932094722
Ashwini Chaudhary
  • 244,495
  • 58
  • 464
  • 504
0

mylist[-1] is about 30% to 45% slower than mylist[99] on my machine.

>>> def test():
...     t99=timeit.timeit('mylist[99]',setup='mylist=list(range(100))',number=10000000)
...     t_1=timeit.timeit('mylist[-1]',setup='mylist=list(range(100))',number=10000000)
...     return (t_1, t99, (t_1-t99)*100/t99)
... 
>>> test()
(0.21327159996144474, 0.13456149981357157, 58.49377441312871)
>>> test()
(0.17166510014794767, 0.13119220011867583, 30.850081020563916)
>>> test()
(0.19142579985782504, 0.13216119981370866, 44.842661936827426)
>>> test()
(0.1880386001430452, 0.1329137000720948, 41.47420472159728)
>>> test()
(0.18617470003664494, 0.1398134999908507, 33.159315837761085)
>>> test()
(0.17610100004822016, 0.1407316999975592, 25.13243288560744)
>>> test()
(0.19496860005892813, 0.14028189983218908, 38.983432853531)
>>> test()
(0.19262430001981556, 0.13199010002426803, 45.938445371584066)
>>> 
ChrisGPT was on strike
  • 127,765
  • 105
  • 273
  • 257