Efficient way to compute column sums and their reciprocals in a matrix

Question

I am working with large matrices (up to a million x a million). I want to sum each column of a matrix and put the reciprocal of each column sum into that column's nonzero elements. I have made two attempts, but I still want a faster method of computation. Because some column sums are zero, I can't apply np.reciprocal directly. Here are my attempts:

import numpy as np

A = np.array([[0,1,1,1],[0,0,1,0],[0,1,0,0],[0,0,0,0]])
d = A.shape[1]  # number of columns (one sum per column)

V = np.zeros(d)

# Column sums, written into V
np.sum(A, axis=0, out=V, dtype='int')
with np.errstate(divide='ignore', invalid='ignore'):
    Vs = np.true_divide(1, V)
    Vs[~np.isfinite(Vs)] = 0  # -inf inf NaN

print(Vs)

Second attempt:

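import numpy as np

A = np.array([[0,1,1,1],[0,0,1,0],[0,1,0,0],[0,0,0,0]])
d = A.shape[1]  # number of columns (one sum per column)

V = np.zeros(d)

# Column sums, written into V
np.sum(A, axis=0, out=V, dtype='int')

# Invert each nonzero column sum in place
for i in range(d):
    if V[i] != 0:
        V[i] = 1 / V[i]
print(V)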

Is there a faster way than this? My running time is very poor. Thanks.

edit1: Do you think converting everything to the CSR sparse matrix format would make it faster?

What's the slow part? The sum? the divide? the testing? For large `d` I expect the iterative to be quite slow. Unless your matrix is very sparse (10% or less) sparse matrices won't help. And the sparse row sum returns a dense matrix. — hpaulj, Jun 25 '17 at 19:02
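On the CSR idea raised in the edit and the comment above, here is a minimal sketch of what the sparse version might look like. It assumes `scipy.sparse` is available and reuses the small example matrix. The last step (overwriting each stored element with its column's reciprocal) is one reading of what the question asks for, not something taken from the thread, and whether any of this is faster depends on how sparse the real data is.

```python
import numpy as np
from scipy import sparse

# Same small example as in the question, stored in CSR format.
A = sparse.csr_matrix(np.array([[0, 1, 1, 1],
                                [0, 0, 1, 0],
                                [0, 1, 0, 0],
                                [0, 0, 0, 0]]))

# Summing a sparse matrix over axis 0 gives a dense 1 x n result;
# convert it to a flat 1-D array.
V = np.asarray(A.sum(axis=0)).ravel()

# Reciprocal of the nonzero column sums; zero sums stay 0.0.
inv = np.divide(1, V, out=np.zeros(V.shape), where=V > 0)

# In CSR format, `indices` holds the column index of every stored element,
# so this replaces each stored value with the reciprocal of its column sum.
B = A.astype(float)
B.data = inv[B.indices]

print(B.toarray())
```

Only the stored (nonzero) elements are touched, so the work in the last step scales with the number of nonzeros rather than with the full matrix size.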

score 1 · Accepted Answer · answered Jun 25 '17 at 19:39

The question "NumPy: Return 0 with divide by zero" discusses various divide-by-zero options. The accepted answer there looks a lot like your first try, but there's a newer answer that might (?) be faster:

https://stackoverflow.com/a/37977222/901925

In [240]: V=A.sum(axis=0)
In [241]: np.divide(1,V,out=np.zeros(V.shape),where=V>0)
Out[241]: array([ 0. ,  0.5,  0.5,  1. ])
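The session above can be written as a standalone script. This is a sketch using the example matrix from the question. The final `np.where` step, which writes each reciprocal into the nonzero entries of its column, is an assumed reading of what the question wants and is not part of the linked answer.

```python
import numpy as np

A = np.array([[0, 1, 1, 1],
              [0, 0, 1, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 0]])

# Column sums; an all-zero column gives 0.
V = A.sum(axis=0)

# Divide only where the sum is positive; the other slots keep
# the 0.0 that `out` was initialised with.
inv = np.divide(1, V, out=np.zeros(V.shape), where=V > 0)

# Assumed follow-up: broadcast the reciprocals across the rows and
# keep them only at positions where A is nonzero.
B = np.where(A != 0, inv, 0.0)

print(inv)
print(B)
```

For a dense million-by-million matrix the `np.where` step allocates a full float array, so it is only illustrative at that scale.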

Your example is too small to make meaningful time tests on. I don't have any intuition about the relative speeds (beyond my comment).
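To get timings that mean something, one option is to run the three variants on a larger array. The sketch below uses a hypothetical 1000 x 1000 random matrix with roughly 1% nonzeros and some columns forced to zero. The function names, sizes, and repeat counts are arbitrary choices for illustration, and the results will vary by machine and NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Hypothetical test data: about 1% ones, every tenth column all zero.
A = (rng.random((n, n)) < 0.01).astype(np.int64)
A[:, ::10] = 0
V = A.sum(axis=0)

def recip_errstate(V):
    # First attempt: divide everywhere, then zero out inf/NaN.
    with np.errstate(divide='ignore', invalid='ignore'):
        Vs = np.true_divide(1, V)
        Vs[~np.isfinite(Vs)] = 0
    return Vs

def recip_where(V):
    # Masked divide: only positions with V > 0 are computed.
    return np.divide(1, V, out=np.zeros(V.shape), where=V > 0)

def recip_loop(V):
    # Second attempt: explicit Python loop.
    out = np.zeros(V.shape)
    for i in range(V.shape[0]):
        if V[i] != 0:
            out[i] = 1 / V[i]
    return out

repeats = 20
t_sum = timeit.timeit(lambda: A.sum(axis=0), number=repeats)
print(f"column sum: {t_sum / repeats * 1e6:.1f} us per call")
for name, fn in [("errstate", recip_errstate),
                 ("where", recip_where),
                 ("loop", recip_loop)]:
    t = timeit.timeit(lambda: fn(V), number=repeats)
    print(f"{name}: {t / repeats * 1e6:.1f} us per call")
```

Timing the column sum separately shows whether the reciprocal step matters at all compared with the reduction over the whole matrix.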

A recent SO question pointed out that the `out` parameter is required with `where` in the latest release (1.13) but optional in earlier ones.
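Whatever the version requires, passing `out` explicitly matters because positions where the mask is false are not written by the ufunc. A small sketch with an arbitrary sentinel value (-1.0, chosen here only for illustration) makes the skipped slots visible:

```python
import numpy as np

V = np.array([0, 2, 2, 1])

# Pre-fill the output with a sentinel so untouched slots stand out.
out = np.full(V.shape, -1.0)

# Only positions where V > 0 are written; out[0] keeps the sentinel.
np.divide(1, V, out=out, where=V > 0)

print(out)
```

Initialising `out` with zeros, as in the session above, is what turns the skipped positions into the desired 0 result.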
