I am trying to run sparse matrix computations with scipy for an algorithm (PageRank) that requires intensive, dependent iterations on very large RDF datasets. I want the scipy computation in the following code to use multiple cores:
import numpy as np
from scipy import sparse

# y, damping, epsilon and printerror are defined earlier in my code
F = sparse.coo_matrix((y['data'], (y['row'], y['col'])), shape=y['shape'])
W = sparse.coo_matrix((y['data'], (y['row'], y['col'])), shape=y['shape'])
P = sparse.bmat([[None, W], [F, None]])

n = P.shape[0]
previous = np.ones(n) / n
ones = np.ones(n) / n
error = np.inf  # ensures the loop runs at least once

while error > epsilon:
    tmp = np.array(previous)
    previous = damping * P.T.dot(previous) + (1 - damping) * ones
    error = np.linalg.norm(tmp - previous)
    if printerror:
        print(error)
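One change I can already make on the scipy side: sparse.bmat returns a COO matrix, so P.T.dot(previous) re-converts it to CSR on every iteration. Converting once outside the loop avoids that repeated cost, although scipy's CSR mat-vec itself still runs on a single core:

PT = P.T.tocsr()  # convert once, outside the loop

while error > epsilon:
    tmp = np.array(previous)
    previous = damping * PT.dot(previous) + (1 - damping) * ones
    error = np.linalg.norm(tmp - previous)
    if printerror:
        print(error)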
I have searched every answer I could find and tried the MKL-linked Anaconda build, but the performance does not scale across cores. As far as I can tell, scipy's sparse mat-vec in csr.h does not go through BLAS at all. Do I need to replace the call to csr_matvec in scipy/sparse/sparsetools with an appropriate Sparse BLAS call (MKL provides those) and then link scipy against MKL? Am I misunderstanding or missing something? I would really appreciate some help with this. One similar question is here. Thanks!!
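What I have in mind, instead of rebuilding scipy, is calling MKL's sparse BLAS directly from Python via ctypes and substituting it for P.T.dot(previous). Below is an untested sketch of that idea. It assumes libmkl_rt.so is loadable (the library name differs on Windows/macOS), that the MKL install exposes the zero-based CSR routine mkl_cspblas_dcsrgemv, and that the matrix has float64 data with int32 indices (scipy's default; very large matrices may get int64 indices, which would need MKL's ILP64 interface instead). mkl_matvec is just a helper name I made up:

import ctypes
import numpy as np
from scipy import sparse

mkl = ctypes.CDLL('libmkl_rt.so')  # adjust the library name for your platform

def mkl_matvec(A, x):
    """y = A @ x for a square CSR matrix A (float64 data, int32 indices),
    using MKL's zero-based sparse mat-vec mkl_cspblas_dcsrgemv."""
    x = np.ascontiguousarray(x, dtype=np.float64)
    y = np.empty(A.shape[0], dtype=np.float64)
    mkl.mkl_cspblas_dcsrgemv(
        ctypes.c_char_p(b'N'),                   # 'N': y = A*x, no transpose
        ctypes.byref(ctypes.c_int(A.shape[0])),  # matrix order m
        A.data.ctypes.data_as(ctypes.POINTER(ctypes.c_double)),
        A.indptr.ctypes.data_as(ctypes.POINTER(ctypes.c_int)),
        A.indices.ctypes.data_as(ctypes.POINTER(ctypes.c_int)),
        x.ctypes.data_as(ctypes.POINTER(ctypes.c_double)),
        y.ctypes.data_as(ctypes.POINTER(ctypes.c_double)))
    return y

# In the loop, with the transpose precomputed once:
# PT = sparse.csr_matrix(P.T, dtype=np.float64)
# previous = damping * mkl_matvec(PT, previous) + (1 - damping) * ones

Whether this actually spreads across cores would then be controlled by MKL's threading settings (e.g. MKL_NUM_THREADS). Is this the right direction, or is there a cleaner way to get a multithreaded sparse mat-vec from Python?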