I am currently experimenting with Python performance, trying to speed up my programs (usually ones that compute heuristics). I have always used lists and tried to avoid numpy arrays.
But I recently learned that Python's standard library has the array module (efficient arrays of numeric values), so I thought I would try that one.
I wrote a piece of code to measure array.count() vs. list.count(), as I use it in many places in my code:

    from timeit import timeit
    import array

    a = array.array('i', range(10000))
    l = [range(10000)]

    def lst():
        return l.count(0)

    def arr():
        return a.count(0)

    print(timeit('lst()', "from __main__ import lst", number=100000))
    print(timeit('arr()', "from __main__ import arr", number=100000))

I was expecting a slight performance improvement when using array. Well, this is what happened:

    > python main.py
    0.03699162653848456
    74.46420751473268

So, according to timeit, list.count() is 2013x faster than array.count(). I definitely didn't expect that. I searched through SO, the Python docs, etc., and the only thing I found was that the values in an array first have to be wrapped into int objects, which could slow things down. But I was expecting that to happen when creating an array.array instance, not when randomly accessing it (which I believe is what .count() does).
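
To illustrate what I mean by wrapping, here is a minimal sketch, assuming CPython. It only shows that the array keeps fixed-size raw values and hands back an ordinary int object when an element is read; it does not show how .count() works internally. The names raw_bytes and value are mine and exist only for this illustration.

```python
import array
import sys

a = array.array('i', range(10000))

# Each element is stored as a raw C value of fixed size
# (a.itemsize bytes, platform dependent for typecode 'i').
raw_bytes = a.itemsize * len(a)

# Reading an element hands back a regular Python int object.
value = a[1234]

print("element type:", type(value).__name__)
print("raw payload bytes:", raw_bytes)
print("size of one Python int object:", sys.getsizeof(value))
```

If that is right, any operation that has to look at elements as Python objects pays this conversion per element, whereas a list already holds ready-made int objects.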
So where's the catch?
Am I doing something wrong?
Or maybe I shouldn't use standard arrays and should go straight to numpy.array instead?
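
One sanity check that may help narrow this down is to confirm that the containers being compared hold the same elements before timing them. Below is a minimal sketch of such a check, assuming Python 3. The names wrapped, expanded, data_array and runs are illustrative and not part of the script above, and the run count is kept small so it finishes quickly. I make no claim about the timings it prints.

```python
import array
from timeit import timeit

n = 10000
data_array = array.array('i', range(n))
wrapped = [range(n)]        # list literal around a range object
expanded = list(range(n))   # list built from the range's items

# Compare the sizes of the containers before comparing their speed.
print("lengths:", len(data_array), len(wrapped), len(expanded))

# Compare what count(0) returns for each container.
print("counts:", data_array.count(0), wrapped.count(0), expanded.count(0))

# Time count() only on the two containers that hold n integers each.
runs = 100
t_list = timeit(lambda: expanded.count(0), number=runs)
t_array = timeit(lambda: data_array.count(0), number=runs)
print("list.count :", t_list)
print("array.count:", t_array)
```

If the lengths or counts differ between containers, the timing comparison is not like for like, whatever the cost of wrapping turns out to be.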