4

I am trying to speedup some sparse matrix-matrix multiplications in Python using Numba and it's JIT compiler. Unfortunately it doesn't support the SciPy library as I need it.

My solution is to translate the functions csr_matmat_pass1() and csr_matmat_pass2() from here into Python code. My code seems to work for matrices smaller than ~80x80 and delivers correct results. Here's my solution:

import scipy.sparse as sparse
import numpy as np

def csr_matmat_pass1(n_row, n_col, Ap, Aj, Bp, Bj):   
   mask = np.ones(n_col, dtype="int") * -1
   Cp = np.zeros(n_row+1, dtype="int")

   nnz = 0
   for i in range(n_row):
       row_nnz = 0

       for jj in range(Ap[i],Ap[i+1]):
           j = Aj[jj]
           for kk in range(Bp[j],Bp[j+1]):
               k = Bj[kk]

               if(mask[k] != i):
                   mask[k] = i
                   row_nnz += 1


       next_nnz = nnz + row_nnz;

       nnz = next_nnz;
       Cp[i+1] = nnz;
   return Cp            

def csr_matmat_pass2(n_row, n_col, Ap, Aj, Ax, Bp, Bj, Bx, Cp):
   nextV = np.ones(n_col, dtype="int") * -1
   sums = np.zeros(n_col)

   nnz = 0

   Cp[0] = 0
   #preallocate space
   sizeC = max(len(Ax),len(Bx)) 
   Cj = np.zeros(sizeC, dtype="int")
   Cx = np.zeros(sizeC)

   for i in range(n_row):
       head   = -2
       length =  0

       jj_start = Ap[i]
       jj_end   = Ap[i+1]
       for jj in range(jj_start,jj_end):   
           j = Aj[jj]
           v = Ax[jj]

           kk_start = Bp[j]
           kk_end   = Bp[j+1]
           for kk in range(kk_start,kk_end):
               k = Bj[kk]

               sums[k] += v*Bx[kk]

               if(nextV[k] == -1):
                   nextV[k] = head
                   head  = k
                   length += 1

       for jj in range(length):

           if(sums[head] != 0.0):
               Cj[nnz] = head
               Cx[nnz] = sums[head]
               nnz += 1


           temp = head     
           head = nextV[head]

           nextV[temp] = -1
           sums[temp] =  0


       Cp[i+1] = nnz
   return Cp, Cj, Cx

#calculate random sparse matrices A,B
mSize = 50
A = sparse.random(mSize, mSize, 0.01).tocsr()
B = sparse.random(mSize, mSize, 0.01).tocsr() 

#calculate sparse C  
Cp = csr_matmat_pass1(np.shape(A)[0], np.shape(B)[1], A.indptr, A.indices, B.indptr, B.indices)
Cp, Cj, Cx = csr_matmat_pass2(np.shape(A)[0], np.shape(B)[1], A.indptr, A.indices, A.data, B.indptr, B.indices, B.data, Cp)
#generate numpy sparse matrix from Cx, Cj, Cp    
C = sparse.csr_matrix((Cx,Cj,Cp),shape=(nRow,nCol))


diffC = A.dot(B) - C

#validate function -> check if any nonzero element is present. If so -> calc is wrong
if np.any(diffC.todense()): UserWarning('Calculations are wrong')

When increasing the size of the matrices (lets say mSize=100) I get the following error:

line 168, in csr_matmat_pass2 Cj[nnz] = head

IndexError: index 72 is out of bounds for axis 0 with size 72

I assume the error is in my python translation rather than in the C++ code (since it is from the scipy library). Also Cp has greater entries than the size of the matrices A, B. For that reason there must be an error in the translation of csr_matmat_pass1(). Unfortunately I cannot find any syntax errors and don't know why nnz gets bigger than it should.

CDspace
  • 2,639
  • 18
  • 30
  • 36
  • I made this kind of translation in http://stackoverflow.com/a/37540149/901925 (last May). There the question was about replacing the `scipy` code with calls to the `mkl` library. What do you hope to accomplish by replacing `c` code with `numba` compiled code? – hpaulj Jul 22 '16 at 15:56
  • Where does that `sizeC` in `pass2` come from? In my translation this comes from the value of the last element of `Cp` as calculated in `pass1`. Your code doesn't seem to make any use of the results of `pass1`. Are you new to `c` as well as `python`? – hpaulj Jul 22 '16 at 17:26
  • In the `csr.h` file, the `c` does not allocate `Cx`, `Cj`. That's done in the `compressed.py` code. And it uses `nnz`, which is `nnz = indptr[-1]`. – hpaulj Jul 22 '16 at 18:19
  • That resolved my problem. Thank you! 1. I try to get a speed increase using the JIT compiler. 2. That was the error. Indeed my c skills are quite rusty and the problem was the wrong allocation with sizeC. 3. After pass1 I had to replace the allocation of Cj, Cx and Cp as follows `Cp = csr_matmat_pass1(n_row, n_col, Ap, Aj, Bp, Bj, mask, Cp)` `#allocate arrays for pass2 Cj=np.zeros(Cp[-1], int) Cx=np.zeros(Cp[-1], A.dtype) Cp=np.zeros(n_row+1,int)` – Florian Shepherd Jul 25 '16 at 08:28

0 Answers0