I constructed the following test case to compare the dot product of a one-row scipy sparse matrix with the same dot product on a dense numpy array.
from scipy.sparse import csc_matrix
sp = csc_matrix((1, 36710))
sp[0,4162] = 0.2335
sp[0,21274] = 0.1367
sp[0,27322] = 0.261
sp[0,27451] = 0.9266
%timeit sp.dot(sp.T)
arr = sp.toarray()[0]
%timeit arr.dot(arr)
The timings are as follows (sparse first, then dense):
267 µs ± 6.58 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
9.9 µs ± 230 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Both are also slower than storing the nonzero entries in a plain dict and doing the multiplication with a for-loop (~1 µs).
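For reference, a dict-based dot product of the kind described above might look like the sketch below. The names `vec` and `sparse_dot` are hypothetical, and the exact code behind the ~1 µs figure is not shown here, so treat this as an illustration rather than the benchmarked implementation.

```python
# Hypothetical dict-based sparse vector: column index -> nonzero value.
# The entries mirror the four nonzeros assigned to `sp` above.
vec = {4162: 0.2335, 21274: 0.1367, 27322: 0.261, 27451: 0.9266}


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {index: value} dicts."""
    # Loop over the smaller dict so the work is min(len(a), len(b)) lookups.
    if len(a) > len(b):
        a, b = b, a
    total = 0.0
    for index, value in a.items():
        other = b.get(index)
        if other is not None:  # only indices present in both contribute
            total += value * other
    return total


result = sparse_dot(vec, vec)
```

With only four stored entries the loop runs four iterations regardless of the nominal vector length of 36710.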
The result is the same with other sparse matrix formats, including csr and coo. Why is sparse matrix multiplication ~30 times slower than numpy dense array multiplication? Is it because the matrix is too sparse?