9

I'm learning python recently, and is doing many practice with the language.

One thing I found interesting is that, when I read from an array, it's almost half of the time slower than list. Does somebody know why?

here's my code:

from timeit import Timer
import array

t = 10000
l = range(t)
a = array.array('i', l)
def LIST():
    for i in xrange(t):
        l[i]

def ARRAY():
    for i in xrange(t):
        a[i]

print Timer(LIST).timeit(1000);
print Timer(ARRAY).timeit(1000);

the output is:

0.813191890717
1.16269612312

which indicates that reading array is slower than list. I think array is a fixed size memory, while list is a dynamic structure. So I assumed that array would be faster than list.

Does anyone has any explanation?

Martin Geisler
  • 72,968
  • 25
  • 171
  • 229
FrostNovaZzz
  • 802
  • 3
  • 10
  • 22
  • 1
    possible dupe/answer: http://stackoverflow.com/questions/176011/python-list-vs-array-when-to-use - Basically array.array is a wrapper around a C array so I think there is overhead when accessing it. Don't use it for math. – NG. Jul 17 '10 at 14:16
  • Trying to second-guess Python efficiencies - especially for those coming from a C-like background - is often counter-intuitive. Code clearly first, then optimize if you measure a performance problem; this applies to C also, but because the language elements are so close to the machine people often forget. – msw Jul 17 '10 at 14:31
  • 1
    For math you may want to use numpy (not available for python 3 yet), only god knows why numpy is not standard library. – Robert William Hanks Jul 17 '10 at 14:32
  • 1
    numpy is very close to available on Python 3: http://www.mail-archive.com/numpy-discussion@scipy.org/msg26524.html – Ned Batchelder Jul 17 '10 at 14:40

3 Answers3

11

lists are "dynamically growing vectors" (very much like C++'s std::vector, say) but that in no way slows down random access to them (they're not linked lists!-). Lists' entries are references to Python objects (the items): accessing one just requires (in CPython) an increment of the item's reference count (in other implementations, based on more advanced garbage collection, not even that;-). Array's entries are raw bits and bytes: accessing one requires a fresh new Python object to be synthesized based on that binary value. So e.g.:

$ python -mtimeit -s'import array; c=array.array("B", "bzap")' 'c[2]'
10000000 loops, best of 3: 0.0903 usec per loop
$ python -mtimeit -s'c=list("bzap")' 'c[2]'
10000000 loops, best of 3: 0.0601 usec per loop

30 nanoseconds' extra time for an access doesn't seem too bad;-).

As an aside, note that timeit is MUCH nicer to use from the command line -- automatic choice of repetition, unit of measure shown for the time, etc. That's how I always use it (importing a custom-coded module with functions to be called, if needed -- but here there is no need for that) -- it's so much handier than importing and using it from a module!

Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
7

It takes time to wrap a raw integer into a Python int.

kennytm
  • 510,854
  • 105
  • 1,084
  • 1,005
  • 2
    Quite so. The function LIST just has to increment and then decrement the reference count for each list element. The ARRAY function on the other hand has to allocate memory for each of the integers (except the smaller ones which have an optimisation) and then free it again. – Duncan Jul 17 '10 at 15:18
1

The Python lists really resemble some ways normal arrays, they are not Lisp lists, but they have fast random access.

Tony Veijalainen
  • 5,447
  • 23
  • 31