I'm trying to speed up some matrix inversion in a model I'm building, specifically by implementing some linear algebra routines in Cython. I have the code working, but am trying to optimize. Currently my Cython looks like this:
import cython
import numpy as np
cimport numpy as cnp
cimport scipy.linalg.cython_lapack as cython_lapack
cdef int ZERO = 0
ctypedef cnp.float64_t REAL_t
def inverse(mat,identity,pivots,k):
cdef int K = k
cdef REAL_t* mat_pointer = <REAL_t *>cnp.PyArray_DATA(mat)
cdef REAL_t* iden_pointer = <REAL_t *>cnp.PyArray_DATA(identity)
cdef int* piv_pointer = <int *>cnp.PyArray_DATA(pivots)
cython_lapack.dgesv(&K,&K,mat_pointer,&K,piv_pointer,iden_pointer,&K,&ZERO)
Once I compile (as linalg.pyx
), I can run this just fine, but because dgesv
modifies some of the input variables, I have to use a wrapper function and do some copying:
import linalg
import numpy as np
identity = np.eye(K) # K is the dimensionality of the square matrix
def lapack_inverse(a):
b = identity.copy()
linalg2.inverse(a,b,np.zeros(K,dtype=np.intc),K)
return b
For small K, we see pretty solid speed improvements (2-3x):
In [29]: K=10
In [30]: x = np.random.random((K,K))
In [31]: identity = np.eye(K)
In [32]: print np.allclose(np.linalg.inv(x),lapack_inverse(x))
True
In [33]: %timeit np.linalg.inv(x)
10000 loops, best of 3: 28 µs per loop
In [34]: %timeit lapack_inverse(x)
100000 loops, best of 3: 10.3 µs per loop
But for larger K, returns clearly diminish (e.g. for K=200):
In [8]: %timeit np.linalg.inv(x)
100 loops, best of 3: 10.1 ms per loop
In [9]: %timeit lapack_inverse(x)
100 loops, best of 3: 8.1 ms per loop
I'm thinking I might be able to squeeze better performance out of this by defining the identity and pivot arrays within C, rather than doing the copying and passing that I'm currently doing. In other words, I'd like to modify the Cython function so that it only requires that I pass the array to be inverted.
My questions are (a): Will this lead to speed improvements (even small ones)? and, assuming it will, (b) how do I go about it defining the pivot and identity arrays in Cython directly?