
Recently I answered this question, which asked for the multiplication of 2 lists. Some user suggested the following way using numpy, alongside my answer, which I think is the proper way:

(a.T*b).T

I also found that `array.resize()` has the same performance as that. Anyway, another answer suggested a solution using a list comprehension:

[[m*n for n in second] for m, second in zip(b,a)]
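For reference, the two approaches compute the same thing; a minimal sketch checking the equivalence on the example arrays from this question:

```python
import numpy as np

a = [[2, 3, 5], [3, 6, 2], [1, 3, 2]]
b = [4, 2, 1]

# Pure Python: scale row i of a by b[i]
lc = [[m * n for n in second] for m, second in zip(b, a)]

# NumPy: transposing lets b broadcast along the rows; transpose back at the end
npy = (np.array(a).T * np.array(b)).T

assert npy.tolist() == lc
```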

But after benchmarking I saw that the list comprehension performs much faster than numpy:

from timeit import timeit

s1="""
a=[[2,3,5],[3,6,2],[1,3,2]]
b=[4,2,1]

[[m*n for n in second] for m, second in zip(b,a)]
"""
s2="""
a=np.array([[2,3,5],[3,6,2],[1,3,2]])
b=np.array([4,2,1])

(a.T*b).T
"""

print ' first: ' ,timeit(stmt=s1, number=1000000)
print 'second : ',timeit(stmt=s2, number=1000000,setup="import numpy as np")

result :

 first:  1.49778485298
second :  7.43547797203

As you can see, the list comprehension is approximately 5 times faster than numpy. But the most surprising thing was that it's still faster without using the transposes, i.e. for the following code:

a=np.array([[2,3,5],[3,6,2],[1,3,2]])
b=np.array([[4],[2],[1]])

a*b 

The list comprehension was still 5 times faster. So, besides the point that list comprehensions are executed in C, here we used two nested loops and a zip function, so what can be the reason? Is it because of the * operation in numpy?
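As a side note on the transpose-free version: reshaping b into a column produces the same result as the transpose approach via broadcasting. A minimal sketch, using the same example arrays:

```python
import numpy as np

a = np.array([[2, 3, 5], [3, 6, 2], [1, 3, 2]])
b = np.array([4, 2, 1])

# b[:, None] has shape (3, 1), so broadcasting scales each row of a
# without any transposes.
assert (a * b[:, None] == (a.T * b).T).all()
```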

Also note that there is no problem with timeit here; I put the import in the setup argument.

I also tried it with larger arrays; the difference gets smaller but still doesn't make sense:

s1="""
a=[[2,3,5],[3,6,2],[1,3,2]]*10000
b=[4,2,1]*10000

[[m*n for n in second] for m, second in zip(b,a)]
"""
s2="""
a=np.array([[2,3,5],[3,6,2],[1,3,2]]*10000)
b=np.array([4,2,1]*10000)

(a.T*b).T
"""

print ' first: ' ,timeit(stmt=s1, number=1000)
print 'second : ',timeit(stmt=s2, number=1000,setup="import numpy as np")

result :

 first:  10.7480301857
second :  13.1278889179
Mazdak

1 Answer


Creation of numpy arrays is much slower than creation of lists:

In [153]: %timeit a = [[2,3,5],[3,6,2],[1,3,2]]
1000000 loops, best of 3: 308 ns per loop

In [154]: %timeit a = np.array([[2,3,5],[3,6,2],[1,3,2]])
100000 loops, best of 3: 2.27 µs per loop
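The same comparison can be reproduced outside IPython with plain `timeit`; a minimal sketch (absolute timings will vary by machine):

```python
from timeit import timeit

# Time building the same 3x3 structure as a nested list vs. a NumPy array.
t_list = timeit("a = [[2,3,5],[3,6,2],[1,3,2]]", number=100000)
t_np = timeit("a = np.array([[2,3,5],[3,6,2],[1,3,2]])",
              setup="import numpy as np", number=100000)

# Array construction has to infer a dtype, allocate a buffer and copy the
# data, so it is several times slower than building the nested list.
assert t_np > t_list
```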

There are also fixed costs incurred by each NumPy function call before the meat of the calculation can be handed off to a fast underlying C/Fortran function. These include checking argument types and converting the inputs to NumPy arrays where necessary.

These setup/fixed costs are something to keep in mind before assuming NumPy solutions are inherently faster than pure-Python solutions. NumPy shines when you set up large arrays once and then perform many fast NumPy operations on the arrays. It may fail to outperform pure Python if the arrays are small because the setup cost can outweigh the benefit of offloading the calculations to compiled C/Fortran functions. For small arrays there simply may not be enough calculations to make it worth it.
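To see that fixed per-call overhead, rather than element-wise work, dominates for tiny arrays, here is a rough sketch (the exact ratio will vary by machine):

```python
import numpy as np
from timeit import timeit

small = np.arange(3)
large = np.arange(300000)

t_small = timeit(lambda: small * small, number=1000)
t_large = timeit(lambda: large * large, number=1000)

# large has 100,000x more elements, yet the timing ratio is far below
# 100,000x: the fixed per-call overhead dominates the small case.
assert t_large / t_small < 100000
```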


If you increase the size of the arrays a bit, and move creation of the arrays into the setup, then NumPy can be much faster than pure Python:

import numpy as np
from timeit import timeit

N, M = 300, 300

a = np.random.randint(100, size=(N,M))
b = np.random.randint(100, size=(N,))

a2 = a.tolist()
b2 = b.tolist()

s1="""
[[m*n for n in second] for m, second in zip(b2,a2)]
"""

s2 = """
(a.T*b).T
"""

s3 = """
a*b[:,None]
"""

assert np.allclose([[m*n for n in second] for m, second in zip(b2,a2)], (a.T*b).T)
assert np.allclose([[m*n for n in second] for m, second in zip(b2,a2)], a*b[:,None])

print 's1: {:.4f}'.format(
    timeit(stmt=s1, number=10**3, setup='from __main__ import a2,b2'))
print 's2: {:.4f}'.format(
    timeit(stmt=s2, number=10**3, setup='from __main__ import a,b'))
print 's3: {:.4f}'.format(
    timeit(stmt=s3, number=10**3, setup='from __main__ import a,b'))

yields

s1: 4.6990
s2: 0.1224
s3: 0.1234
unutbu
  • So, the issue is that the test here was including the creation of the data structures, and, with that removed, `numpy` would indeed be faster? – TigerhawkT3 Jul 23 '15 at 21:58
  • Yeah, I think this is the reason, because when I used a numpy array in the list comprehension it got slower than the numpy approach! – Mazdak Jul 23 '15 at 22:00
  • What shell/interpreter are you using where you can do `%timeit python_expression`? – Steven Rumbalski Jul 23 '15 at 22:06
  • Most of the problem is that when you create the numpy array in this way, you first have to create the python list, then create the numpy array from that list. Compare the speed of `list(range(1000000))` versus `np.arange(1000000)` – Dunes Jul 23 '15 at 22:08
  • @TigerhawkT3: The creation time is one issue, the other is that the arrays are too small to make offloading the paltry number of calculations to NumPy's fast C/Fortran functions worth it. – unutbu Jul 23 '15 at 22:10
  • In `ipython`, `%%timeit` lets you put the setup code in the first line, and the timed code in subsequent lines. – hpaulj Jul 23 '15 at 23:44