I am working with large matrices (up to a million by a million). I want to sum each column of a matrix and write the reciprocal of each column sum into that column's nonzero elements. I have made two attempts, but I still want a faster method of computation. Since some columns sum to zero, I can't apply np.reciprocal directly. Here are my attempts. First attempt:
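import numpy as np

A = np.array([[0,1,1,1],[0,0,1,0],[0,1,0,0],[0,0,0,0]])
d = A.shape[1]  # number of columns (one sum per column)
V = np.zeros(d)
np.sum(A, axis=0, out=V, dtype='int')  # column sums written into V
with np.errstate(divide='ignore', invalid='ignore'):
    Vs = np.true_divide(1, V)  # division by zero gives inf here
Vs[~np.isfinite(Vs)] = 0  # replace -inf, inf and NaN with 0
print(Vs)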
Second attempt:
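import numpy as np

A = np.array([[0,1,1,1],[0,0,1,0],[0,1,0,0],[0,0,0,0]])
d = A.shape[1]  # number of columns (one sum per column)
V = np.zeros(d)
np.sum(A, axis=0, out=V, dtype='int')  # column sums written into V
for i in range(d):
    if V[i] != 0:  # leave zero sums as 0
        V[i] = 1 / V[i]
print(V)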
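Both attempts stop at the vector of reciprocals. To make the goal described at the top concrete, here is a minimal sketch of the full result on the same small example: each nonzero element replaced by the reciprocal of its column sum. It uses the `where=` and `out=` arguments of `np.divide` instead of suppressing warnings. The names `col_sums`, `recip` and `result` are only illustrative, and this dense version is not meant for the full-size matrices.

```python
import numpy as np

A = np.array([[0, 1, 1, 1],
              [0, 0, 1, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 0]])

col_sums = A.sum(axis=0).astype(float)  # [0., 2., 2., 1.]

# Divide only where the column sum is nonzero; the other entries
# keep the 0 that `recip` was initialised with.
recip = np.zeros_like(col_sums)
np.divide(1.0, col_sums, out=recip, where=col_sums != 0)

# (A != 0) marks the nonzero positions; broadcasting multiplies
# column j of that mask by recip[j].
result = (A != 0) * recip

print(recip)   # [0.  0.5 0.5 1. ]
print(result)
```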
Is there a faster way than these two attempts? My running time is very poor. Thanks.
Edit 1: Do you think converting everything to the CSR sparse matrix format would make it faster?
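For reference, a rough sketch of what the CSR variant might look like, assuming `scipy.sparse` is available and the entries are non-negative. The function name `reciprocal_of_column_sums` is hypothetical. It relies on the CSR attribute `indices`, which holds the column index of every stored entry, so only the stored nonzeros are touched. Whether this is actually faster would need timing on real data, though a dense array of a million by a million elements would not fit in memory anyway.

```python
import numpy as np
from scipy import sparse


def reciprocal_of_column_sums(A):
    """Return a CSR matrix whose stored entries are 1 / (their column sum)."""
    # copy=True so the caller's matrix is never modified in place.
    A = sparse.csr_matrix(A, dtype=float, copy=True)
    A.eliminate_zeros()  # drop explicitly stored zeros, if any

    # Column sums as a flat 1-D array.
    col_sums = np.asarray(A.sum(axis=0)).ravel()

    # Reciprocal where the sum is nonzero, 0 elsewhere.
    recip = np.zeros_like(col_sums)
    np.divide(1.0, col_sums, out=recip, where=col_sums != 0)

    # A.indices[k] is the column of the k-th stored entry,
    # so this overwrites each entry with its column's reciprocal.
    A.data = recip[A.indices]
    return A


dense = np.array([[0, 1, 1, 1],
                  [0, 0, 1, 0],
                  [0, 1, 0, 0],
                  [0, 0, 0, 0]])
out = reciprocal_of_column_sums(dense)
print(out.toarray())
```

The work here is proportional to the number of stored nonzeros rather than to the full matrix size, which is the main reason a sparse format could help.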