If you need to micro-optimize like this, the only way to know what's fastest is to test.
The short version is: `append` is faster than `extend`, and Joran Beasley's suggestion of `itertools.chain.from_iterable` is slightly faster than either, but only if you replace the `map` with a list comprehension. So:
```python
import itertools
import operator
import timeit
from functools import reduce  # builtin in Python 2, but must be imported in Python 3

def makestuff(count):
    # Yields `count` 2-tuples: (0, 0), (1, 1), ...
    for i in range(count):
        yield (i, i)

def f_extend(mystuff):
    mylist = []
    for x, y in mystuff:
        mylist.extend([int(x), int(y)])
    return mylist

def f_append(mystuff):
    mylist = []
    for x, y in mystuff:
        mylist.append(int(x))
        mylist.append(int(y))
    return mylist

def f_chainmap(mystuff):
    return list(map(int, itertools.chain(*mystuff)))

def f_chaincomp(mystuff):
    return [int(x) for x in itertools.chain(*mystuff)]

def f_chainfrommap(mystuff):
    return list(map(int, itertools.chain.from_iterable(mystuff)))

def f_chainfromcomp(mystuff):
    return [int(x) for x in itertools.chain.from_iterable(mystuff)]

def f_reducecompcomp(mystuff):
    return [int(x) for x in reduce(operator.iadd, (list(y) for y in mystuff), [])]

def f_reducecompmap(mystuff):
    return [int(x) for x in reduce(operator.iadd, map(list, mystuff), [])]

try:
    import numpy
    def f_numpy(mystuff):
        return numpy.array(mystuff).flatten().tolist()
    def f_numpy2(mystuff):
        return numpy.array(list(mystuff)).flatten().tolist()
except ImportError:
    pass

if __name__ == '__main__':
    import sys
    main = sys.modules['__main__']
    count = int(sys.argv[1]) if len(sys.argv) > 1 else 10000
    # Time every function in this module whose name starts with f_.
    for f in dir(main):
        if f.startswith('f_'):
            func = getattr(main, f)
            mystuff = makestuff(count)  # a fresh generator for each function under test
            testfunc = lambda: func(mystuff)
            print('{}: {}'.format(f, timeit.timeit(testfunc, number=count)))
```
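As an aside on why the `from_iterable` versions beat the `chain(*mystuff)` versions above: unpacking with `*` has to exhaust the whole generator up front just to build the call's argument tuple, while `from_iterable` pulls from it lazily. A minimal sketch (the little `pairs` generator is just a stand-in for the real data):

```python
import itertools

def pairs(n):
    for i in range(n):
        yield (i, i)

# chain(*pairs(3)) builds the tuple ((0, 0), (1, 1), (2, 2)) before chaining,
# so the whole generator is materialized as call arguments first.
eager = itertools.chain(*pairs(3))

# chain.from_iterable pulls one pair at a time, only as the result is consumed.
lazy = itertools.chain.from_iterable(pairs(3))

print(list(eager))  # [0, 0, 1, 1, 2, 2]
print(list(lazy))   # [0, 0, 1, 1, 2, 2]
```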
For Python 2, I tried the `map` versions without the extra `list`, and it was slightly faster, but still not nearly competitive. For Python 3, of course, the `list` is necessary.
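To make that concrete, here's a minimal sketch of the difference (with a tiny hypothetical `mystuff` standing in for the real data): in Python 2, `map` already returns a list, while in Python 3 it returns a lazy iterator, so you have to wrap it in `list`:

```python
import itertools
import sys

mystuff = [(0, 0), (1, 1), (2, 2)]  # stand-in for the real data

flat = itertools.chain.from_iterable(mystuff)
if sys.version_info[0] >= 3:
    result = list(map(int, flat))  # map is lazy in Python 3, so list() is needed
else:
    result = map(int, flat)        # map already builds a list in Python 2

print(result)  # [0, 0, 1, 1, 2, 2]
```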
Here are my timings:
```
$ python testlister.py 1000000
f_append: 1.34638285637
f_chaincomp: 2.12710499763
f_chainfromcomp: 1.20806899071
f_chainfrommap: 2.77231812477
f_chainmap: 3.67478609085
f_extend: 1.38338398933
f_numpy: 5.52979397774
f_numpy2: 7.5826470852
f_reducecompcomp: 2.17834687233
f_reducecompmap: 3.16517782211

$ python3 ./testlister.py 1000000
f_append: 0.9949617639649659
f_chaincomp: 2.0521950440015644
f_chainfromcomp: 0.9724521590862423
f_chainfrommap: 2.5558998831082135
f_chainmap: 3.5766013460233808
f_extend: 1.149905970087275
f_reducecompcomp: 2.2112889911513776
f_reducecompmap: 1.9317334480583668
```
My `python` is Apple's stock Python 2.7.2, while `python3` is the python.org 3.3.0 build; both are 64-bit, both on OS X 10.8.2, on a mid-2012 MacBook Pro with a 2.2GHz i7 and 4GB of RAM.
If you're using 32-bit Python on a POSIX platform: I've noticed that at some point in the not-too-distant past, iterators got an optimization that seems to have sped up many things in `itertools` in 64-bit builds, but slowed them down in 32-bit builds. So you may find that `append` wins in that case. (As always, test on the platform(s) you actually care about optimizing.)
Ashwini Chaudhary linked to Flattening a shallow list in Python, which further linked to finding elements in python association lists efficiently. I suspect part of the difference between my results and theirs comes from improvements to iterators between 2.6.0 and 2.7.2/3.3.0, but the fact that we're explicitly using 2-element elements instead of larger ones is probably even more important.
Also, at least one of the answers claimed that `reduce` was the fastest. The `reduce` implementations in the original post are all terribly slow, but I was able to come up with faster versions. They still aren't competitive with `append` or `chain.from_iterable`, but they're in the right ballpark.
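To give a sense of why, here's a minimal sketch of the difference between a quadratic `reduce` and the in-place version used above (this is just an illustrative guess at the slow pattern, not necessarily the exact code from the original post):

```python
import operator
from functools import reduce

mystuff = [(i, i) for i in range(5)]  # small stand-in for the real data

# Quadratic: a + list(b) copies the entire accumulator on every step.
slow = [int(x) for x in reduce(lambda a, b: a + list(b), mystuff, [])]

# Roughly linear: operator.iadd is `a += b`, which extends one list in place.
fast = [int(x) for x in reduce(operator.iadd, (list(y) for y in mystuff), [])]

print(slow == fast)  # True -- same result, very different scaling
```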
The `f_numpy` function is heltonbiker's implementation. Since `mystuff` is a 2D iterator, this actually just generates a 0D array wrapping the iterator, so all `numpy` can do is add overhead. I was able to come up with an implementation that generates a 1D array of iterators, but that was even slower, because now all `numpy` can do is add overhead N times as often. The only way I could get a 2D array of integers was by calling `list` first, as in `f_numpy2`, which made things even slower. (To be fair, throwing an extra `list` into the other functions slowed them down too, but not nearly as badly as with `numpy`.)
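To see what I mean about the 0D wrapping, here's a minimal sketch (assuming `numpy` is installed; `makestuff` is the same generator as in the benchmark, and newer `numpy` versions may warn about building an array from a generator):

```python
import numpy

def makestuff(count):
    for i in range(count):
        yield (i, i)

a = numpy.array(makestuff(3))
print(a.shape, a.dtype)  # () object -- a 0D object array whose single item is the generator

b = numpy.array(list(makestuff(3)))
print(b.shape, b.dtype)  # (3, 2) and a platform-dependent integer dtype
```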
However, it's quite possible that I'm blanking here, and there is a reasonable way to use `numpy` here. Of course, if you can be sure either the top-level `mystuff` or each element in `mystuff` is a `list` or a `tuple`, you can write something better. And if you can redesign your app so you have a 2D `numpy.array` in the first place, instead of a general sequence of sequences, that'll be a whole different story. But if you just have a general 2D iteration of sequences, `numpy` doesn't seem very good for this use case.
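For completeness, here's roughly what that redesigned case looks like (a minimal sketch, assuming the data is already a 2D integer `numpy.array`): flattening becomes a single C-level call instead of a Python-level loop.

```python
import numpy

arr = numpy.arange(10).reshape(5, 2)  # pretend this is the app's real 2D data

flat = arr.ravel()             # flattened view when possible; no Python loop
aslist = arr.ravel().tolist()  # only convert back to a Python list if you really need one

print(aslist)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```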