I need to perform the vector-matrix product `x * A[idx]`, where `A` is a `scipy.sparse.csr_matrix` and `idx` is a `np.array` of row indices. I can't switch to `csc_matrix` because I rely on the row indexing. This operation is roughly 50x slower than the right multiplication `A[idx] * x`, and only slightly faster than the left multiplication over the full matrix, `u * A`, even when `len(idx) << A.shape[0]`. How can I speed this up?
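
For concreteness, here is a minimal sketch of the kind of setup I am timing (the shapes, density, and variable names are illustrative placeholders, not my actual data):

    import numpy as np
    import scipy.sparse as sp

    rng = np.random.default_rng(0)

    # Illustrative sizes only; in my case len(idx) << A.shape[0].
    A = sp.random(100_000, 1_000, density=1e-4, format="csr", random_state=0)
    idx = rng.choice(A.shape[0], size=500, replace=False)

    x = rng.standard_normal(len(idx))    # left vector for the sliced product
    v = rng.standard_normal(A.shape[1])  # right vector, for comparison
    u = rng.standard_normal(A.shape[0])  # left vector over the full matrix

    res_slow = x * A[idx]   # the product I need: much slower than the next line
    res_right = A[idx] * v  # right multiplication on the same slice, fast
    res_full = u * A        # left multiplication over the full matrix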
- For what it's worth, `A[idx]` is implemented as a matmul using a sparse extractor matrix constructed from `idx`. – hpaulj May 25 '20 at 15:37
- Based on the answer to [this question](https://stackoverflow.com/questions/61867890/what-is-the-fastest-way-to-compute-a-sparse-gram-matrix-in-python) I would recommend using MKL with the [sparse_dot_mkl](https://pypi.org/project/sparse-dot-mkl/) wrapper. – CJR May 25 '20 at 17:01
- Linking [this question](https://stackoverflow.com/questions/46924092/how-to-parallelize-this-python-for-loop-when-using-numba), which has a nice parallel solution to a similar problem: `M.dot(v)`. – Geoffrey Negiar Jun 18 '20 at 15:48
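
To illustrate hpaulj's comment about the extractor matrix: my rough reconstruction of the idea (not scipy's actual code) is that `A[idx]` builds a sparse selector with one nonzero per requested row and matrix-multiplies it with `A`:

    import numpy as np
    import scipy.sparse as sp

    def rows_via_extractor(A, idx):
        """Mimic A[idx]: build a (len(idx), A.shape[0]) selector and multiply."""
        n_sel = len(idx)
        extractor = sp.csr_matrix(
            (np.ones(n_sel), (np.arange(n_sel), idx)),
            shape=(n_sel, A.shape[0]),
        )
        return extractor @ A  # picks out the requested rows, in idx order

    # Quick check against direct fancy indexing on toy data.
    A = sp.random(1_000, 50, density=0.01, format="csr", random_state=0)
    idx = np.array([3, 17, 999, 17])
    assert (rows_via_extractor(A, idx) != A[idx]).nnz == 0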
1 Answer
Since I didn't find a solution elsewhere, here's what I ended up doing.
Using `numba`, I wrote:
    from numba import njit
    import numpy as np

    @njit(nogil=True)
    def fast_csr_vm(x, data, indptr, indices, d, idx):
        """
        Returns the vector-matrix product x * M[idx], where M is given
        in CSR format by (data, indptr, indices).

        x: 1-d array, with x.shape[0] == len(idx)
        data: data field of a scipy.sparse.csr_matrix
        indptr: indptr field of a scipy.sparse.csr_matrix
        indices: indices field of a scipy.sparse.csr_matrix
        d: output dimension (number of columns of M)
        idx: 1-d array of row indices into M
        """
        res = np.zeros(d)
        assert x.shape[0] == len(idx)
        # Accumulate x[k] * M[idx[k], :] row by row, touching only stored entries.
        for k, i in np.ndenumerate(idx):
            for j in range(indptr[i], indptr[i + 1]):
                j_idx = indices[j]
                res[j_idx] += x[k] * data[j]
        return res
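
With a `csr_matrix` `A`, an index array `idx`, and a matching vector `x` (the data below is illustrative, mirroring the setup in the question), a call looks like this; note that the first call also pays numba's JIT compilation cost:

    import numpy as np
    import scipy.sparse as sp

    A = sp.random(100_000, 1_000, density=1e-4, format="csr", random_state=0)
    idx = np.arange(500)
    x = np.random.default_rng(0).standard_normal(len(idx))

    result = fast_csr_vm(x, A.data, A.indptr, A.indices, A.shape[1], idx)

    # Agrees with the slow path up to floating-point error.
    reference = x * A[idx]
    assert np.allclose(result, np.asarray(reference).ravel())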

- Especially since you aren't returning a sparse matrix, direct action on the matrix attributes makes a lot of sense. – hpaulj May 25 '20 at 16:08