I am working with scipy's csc sparse matrix, and currently a major bottleneck in the code is a loop similar to the following:
for i in range(multiply_cols.shape[0]):
    F = F - factor*values[i]*mat.getcol(multiply_cols[i])
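For context, my understanding is that the loop just subtracts a weighted sum of the selected columns of mat from F, so mathematically it should boil down to a single product like the sketch below (same variable names as above; I have not verified this against my real code):
# Sketch only: slice the needed columns out of the CSC matrix and
# multiply by the weights in one sparse matrix-vector product.
F = F - factor * mat[:, multiply_cols].dot(values).reshape(-1, 1)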
The matrices I am working with are extremely large, typically larger than 10**6 x 10**6, so I don't want to convert them to a dense matrix. In fact I am restricted to always keeping the matrix in csc format. My attempts show that converting to coo_matrix or lil_matrix also does not pay off.
Here are my rudimentary attempts using csc, csr, coo and lil:
import numpy as np
from scipy.sparse import csc_matrix

n = 1000
sA = csc_matrix(np.random.rand(n, n))
F = np.random.rand(n, 1)
multiply_cols = np.unique(np.random.randint(0, int(0.6*n), size=n))
values = np.random.rand(multiply_cols.shape[0])
def foo1(mat, F, values, multiply_cols):
    factor = 0.75
    for i in range(multiply_cols.shape[0]):
        F = F - factor*values[i]*mat.getcol(multiply_cols[i])

def foo2(mat, F, values, multiply_cols):
    factor = 0.75
    mat = mat.tocsr()
    for i in range(multiply_cols.shape[0]):
        F = F - factor*values[i]*mat.getcol(multiply_cols[i])

def foo3(mat, F, values, multiply_cols):
    factor = 0.75
    mat = mat.tocoo()
    for i in range(multiply_cols.shape[0]):
        F = F - factor*values[i]*mat.getcol(multiply_cols[i])

def foo4(mat, F, values, multiply_cols):
    factor = 0.75
    mat = mat.tolil()
    for i in range(multiply_cols.shape[0]):
        F = F - factor*values[i]*mat.getcol(multiply_cols[i])
and timing them I get:
In [41]: %timeit foo1(sA,F,values,multiply_cols)
10 loops, best of 3: 133 ms per loop
In [42]: %timeit foo2(sA,F,values,multiply_cols)
1 loop, best of 3: 999 ms per loop
In [43]: %timeit foo3(sA,F,values,multiply_cols)
1 loop, best of 3: 6.38 s per loop
In [44]: %timeit foo4(sA,F,values,multiply_cols)
1 loop, best of 3: 45.1 s per loop
So coo_matrix and lil_matrix are certainly not a good choice here. Does anyone know a faster way of doing this? Is it a good option to retrieve the underlying indptr, indices and data arrays and write a custom cython solution?
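To be concrete about that last idea, this is the kind of thing I have in mind, as a pure-python sketch of the CSC layout that a cython version would operate on (subtract_cols_raw is just a name I made up for illustration, not tested on the real matrices):
import numpy as np
from scipy.sparse import csc_matrix

def subtract_cols_raw(mat, F, values, multiply_cols, factor=0.75):
    # For a csc_matrix, the nonzeros of column j live in
    # mat.data[mat.indptr[j]:mat.indptr[j+1]], with row indices in
    # mat.indices[mat.indptr[j]:mat.indptr[j+1]].
    out = np.array(F, dtype=float, copy=True)
    for v, j in zip(values, multiply_cols):
        start, stop = mat.indptr[j], mat.indptr[j + 1]
        rows = mat.indices[start:stop]
        out[rows, 0] -= factor * v * mat.data[start:stop]
    return out
My thinking is that this avoids building a new one-column sparse matrix on every getcol call, which I suspect is where the time goes, but I don't know whether pushing it into cython is worth it.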