5

I need to perform a set of operations on a scipy sparse matrix in a Cython method.

To efficiently apply these I need access to lil_matrix representation. The lil (linked-list sparse matrix) data representation in python uses list of lists with different lengths.

How can I efficiently pass a list of list of different length to cython (without copying)? Is there any other way to access lil-matrices in cython?

Saullo G. P. Castro
  • 56,802
  • 26
  • 179
  • 234
SKV
  • 274
  • 4
  • 13
  • 2
    maybe this answer will give you some insight: http://stackoverflow.com/a/25295789/832621 – Saullo G. P. Castro Aug 21 '14 at 19:02
  • The problem with lil matrices is that their data content are not `Numpy` arrays like `CSR` or `COO`. – SKV Aug 21 '14 at 19:25
  • I see... you should consider [using `lil_matrix` only for slicing and to transform to other sparse matrix types](http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.lil_matrix.html)... then you can use the solution proposed in that answer, where the data type is known... – Saullo G. P. Castro Aug 21 '14 at 19:53
  • The problem is that the sub-matrices are used in a loop and I need to keep the lil-matrix and avoid paying the conversion cost every time. It all comes down to passing a list of list to cython efficiently. – SKV Aug 22 '14 at 16:44

1 Answers1

5

The example below iterates over a lil_matrix and calculates the sum for each row.

Note I am doing no declarations and even though it is extremely fast because Cython is already optimized for built-in types such as lists. The timings are also shown below...

import time

import numpy as np
cimport numpy as np

from scipy.sparse import lil_matrix

cdef iter_over_lil_matrix(m):
    cdef list sums, data_row
    sums = []
    for data_row in m.data:
        s = 0
        for value in data_row:
            s += value
        sums.append(s)
    return sums

def main():
    a = np.random.random((1e4*1e4))
    a[a>0.1] = 0
    a = a.reshape(1e4,1e4)
    m = lil_matrix(a)

    t0 = time.clock()
    sums = iter_over_lil_matrix(m)
    t1 = time.clock()
    print 'Cython lil_matrix Time', t1-t0

    t0 = time.clock()
    array_sums = a.sum(axis=1)
    t1 = time.clock()
    print 'Numpy ndarray Time', t1-t0

    t0 = time.clock()
    lil_sums = m.sum(axis=1)
    t1 = time.clock()
    print 'lil_matrix Time', t1-t0

    mcsr = m.tocsr()
    t0 = time.clock()
    csr_sums = mcsr.sum(axis=1)
    t1 = time.clock()
    print 'csr_matrix Time', t1-t0

    assert np.allclose(array_sums, sums)
    assert np.allclose(array_sums, np.asarray(lil_sums).flatten())
    assert np.allclose(array_sums, np.asarray(csr_sums).flatten())

Timings in seconds - only about 2 times slower than the super-optimized NumPy :D, much faster than the lil_matrix.sum() method because it converts to csr_matrix() before, as clarified by @hpaulj and confirmed by the results below. Note that the csr_matrix.sum() over the columns is almost one order of magnitude faster than the dense sum.

Cython lil_matrix Time 0.183935034665
Numpy ndarray Time 0.106583238273
lil_matrix Time 2.47158218631
csr_matrix Time 0.0140050888745

Things that will slow down the code:

  • use of for i in range(len(m.data)): with data_row = m.data[i]
  • declare buffers like np.ndarray[object, ndim=1] data with data=m.data

Things that did not affect:

  • boundscheck or wraparound
Saullo G. P. Castro
  • 56,802
  • 26
  • 179
  • 234
  • 1
    Thanks very much, but the comparison is not fair here. You should compare lil-matrix summation with this cython function: i.e. `m.sum(axis = 1)` instead of `a.sum(axis=1)`. Could you also post the results for that please (preferable over a square matrix?) Also the matrix `a` here is not very sparse. – SKV Aug 22 '14 at 19:25
  • @Siamak I've updated the answer using a sparser matrix and comparing with `m.sum(axis=1)`... note that the proposed approach gets closer to and eventually it will be faster than NumPy's performance for sparser matrices... – Saullo G. P. Castro Aug 22 '14 at 19:51
  • Thanks Saullo, the numbers are surprising! I guess one reason for bad performance of `scipy` is that the matrix is not very sparse. – SKV Aug 22 '14 at 21:37
  • 3
    The `sparse` `sum` is a dot product with the appropriate vector of ones. To do that math, the `lil` is converted to `csr`. With your sample matrix, the `csr` sum is 5x faster than the dense sum. The `lil` sum is slow because of the time it takes to convert it to `csr`. – hpaulj Aug 22 '14 at 22:36
  • @hpaulj thank your for the valuable comments. I was not aware of that. I've updated the answer adding the timing for `csr_matrix` to confirm your comment – Saullo G. P. Castro Aug 23 '14 at 07:17