I have a situation in which I need to extract a single row from a sparse matrix and take its dot product with a dense vector. Using scipy's csr_matrix, this appears to be significantly slower than numpy's dense array multiplication. This surprises me, because I expected the sparse dot product to involve significantly fewer operations. Here is an example:
>>> import timeit as ti
>>> sparse_setup = 'import numpy as np; import scipy.sparse as si;' + \
... 'u = si.eye(10000).tocsr()[10];' + \
... 'v = np.random.randint(100, size=10000)'
>>> dense_setup = 'import numpy as np; u = np.eye(10000)[10];' + \
... 'v = np.random.randint(100, size=10000)'
>>> ti.timeit('u.dot(v)', setup=sparse_setup, number=100000)
2.788649031019304
>>> ti.timeit('u.dot(v)', setup=dense_setup, number=100000)
2.179030169005273
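For context, when I say sparse matrix-vector multiplication is fast, I mean the case where the whole matrix stays sparse rather than a single extracted row. Roughly this kind of comparison (just a sketch, with a smaller number of repetitions since the dense matvec is heavy; exact timings will depend on the machine):

import timeit as ti

# Whole 10000x10000 identity kept sparse vs. materialized as a dense array.
matvec_sparse = ('import numpy as np; import scipy.sparse as si;'
                 'A = si.eye(10000).tocsr();'
                 'v = np.random.randint(100, size=10000)')
matvec_dense = ('import numpy as np; A = np.eye(10000);'
                'v = np.random.randint(100, size=10000)')

# The sparse matvec only touches the stored nonzeros, so I expect it to be
# far faster than the dense matvec, which does all 10**8 multiply-adds.
print(ti.timeit('A.dot(v)', setup=matvec_sparse, number=100))
print(ti.timeit('A.dot(v)', setup=matvec_dense, number=100))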
In that full matrix-vector case the sparse representation wins hands down, but not when a single row is extracted first. I also tried csc_matrix, but performance is even worse:
>>> sparse_setup = 'import numpy as np; import scipy.sparse as si;' + \
... 'u = si.eye(10000).tocsc()[10];' + \
... 'v = np.random.randint(100, size=10000)'
>>> ti.timeit('u.dot(v)', setup=sparse_setup, number=100000)
7.0045155879925005
Why does numpy beat scipy.sparse in this case? Is there a matrix format that's faster for this kind of computation?