4

I'm writing some code in Python that requires frequently inverting large square matrices (100-200 rows/columns).

I'm hitting the limits of machine precision, so I've started trying to use mpmath to do arbitrary-precision matrix inversion, but it is very slow, even using gmpy.

Inverting random matrices of size 20, 30, and 60 at precision 30 (decimal) takes ~0.19, 0.60, and 4.61 seconds, whereas the same operations in Mathematica take 0.0084, 0.015, and 0.055 seconds.

This is using Python 3 and mpmath 0.17 (not sure of the gmpy version) on an Arch Linux machine. I'm not sure why mpmath is so much slower, but is there any open-source library that will approach the speeds Mathematica manages for this (even 1/2 as fast would be good)?

I don't need arbitrary precision -- 128-bit would probably be good enough. I also just don't understand how mpmath can be so much slower. It must be using a very different matrix inversion algorithm. To be specific, I'm using M**-1.

Is there a way to get it to use a faster algorithm, or otherwise to speed it up?
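
For concreteness, here is a minimal sketch of the kind of setup being described (not the actual code from the question), timing mpmath's built-in inversion of a random test matrix at 30 decimal digits:

# Rough sketch (not the asker's code): invert a random 60x60 matrix with
# mpmath at 30 significant decimal digits and time it.
import time
from mpmath import mp, randmatrix

mp.dps = 30                # 30 decimal digits of working precision
A = randmatrix(60)         # random 60x60 test matrix with entries in [0, 1)

t0 = time.time()
Ainv = A**-1               # the M**-1 operation referred to in the question
print("60x60 inverse took %.2f s" % (time.time() - t0))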

user2153813
  • Are you just using the matrix inverse to solve a set of equations? If so, then there are more efficient methods that don't explicitly require the inverse. LU decomposition, I believe, is a bit more efficient. – Stuart Mar 10 '13 at 13:54
  • No, I'm using it in a variation of a linear programming problem, so I need the inverse to explicitly determine the cost function. In fact, the problem is that as the cost gets very small, inexact inverses can cause a variety of problems. But I think going to 128-bit precision would be enough (at least for my current purposes). – user2153813 Mar 10 '13 at 14:51
  • Of course, I never need the actual inverse, but rather I need it multiplied by some other matrices. So it's not quite analogous to solving A.x=b, since I need A^-1 * b with b a matrix, not a vector. But maybe there's a way to generalize finding such solutions for matrices? OTOH, I need to do this many times, so it might really be better to find the inverse. – user2153813 Mar 10 '13 at 21:26
  • Evaluating A^-1 * b for many b is exactly what LU decomposition is good for. This is equally true for evaluating A^-1 * B where B is a matrix, which is the same as evaluating A^-1 * b for each column b of the matrix B... – Fredrik Johansson Mar 11 '13 at 06:45
  • I need to look into this possibility a little more. Do you have a good reference in mind? – user2153813 Mar 11 '13 at 14:47
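
To illustrate the LU suggestion from the comments above (this sketch is not from the thread): factor A once and reuse the factorization for every column of B instead of ever forming the inverse. It is shown here in double precision with scipy.linalg, but the same pattern applies to any high-precision LU routine.

# Factor once, solve many right-hand sides: equivalent to A^-1 * B without
# forming the explicit inverse. Double precision, purely illustrative.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

n = 150
A = np.random.rand(n, n)
B = np.random.rand(n, n)          # many right-hand sides, one per column

lu, piv = lu_factor(A)            # O(n^3) factorization, done once
X = lu_solve((lu, piv), B)        # cheap per column; X equals A^-1 @ B

print(np.allclose(A @ X, B))      # True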

3 Answers

3

Linear algebra in mpmath is rather slow, unfortunately. There are many libraries that solve this problem much better (Sage, for example). That said, as a follow-up to Stuart's suggestion, it is fairly easy to do reasonably fast high-precision matrix multiplication in Python without installing any libraries, using fixed-point arithmetic. Here is a version using mpmath matrices for input and output:

from mpmath import mp, mpf

def fixmul(A, B, prec):
    """Multiply two mpmath matrices using prec-bit fixed-point arithmetic."""
    m = A.rows; p = B.rows; n = B.cols
    # Convert entries to fixed-point integers scaled by 2**prec.
    A = [[A[i,j].to_fixed(prec) for j in range(p)] for i in range(m)]
    B = [[B[i,j].to_fixed(prec) for j in range(n)] for i in range(p)]
    C = [([0] * n) for r in range(m)]
    # Plain triple-loop product on Python integers.
    for i in range(m):
        for j in range(n):
            s = 0
            for k in range(p):
                s += A[i][k] * B[k][j]
            C[i][j] = s
    # Each product carries a factor 2**(2*prec); rescale back to floating point.
    return mp.matrix(C) * mpf(2)**(-2*prec)

At 256-bit precision, this multiplies two 200x200 matrices 16 times faster than mpmath for me. It is also not difficult to write a matrix inversion routine directly this way. Of course if the matrix entries are very large or very small, you want to rescale them first. A more solid solution would be to write your own matrix functions using the floating-point types in gmpy, which should be essentially as fast.
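
A possible way to call fixmul (this usage example is not part of the original answer), sanity-checking it against mpmath's own, much slower, product on a small random example:

# Example use of fixmul: multiply two random mpmath matrices at 256-bit
# precision and compare with mpmath's built-in (slow) matrix product.
from mpmath import mp, randmatrix

mp.prec = 256                     # 256 bits of working precision
n = 50                            # small, so the reference product stays quick
A = randmatrix(n)
B = randmatrix(n)

C_fix = fixmul(A, B, mp.prec)     # fixed-point product
C_ref = A * B                     # mpmath's own product, for comparison

err = max(abs(C_fix[i, j] - C_ref[i, j]) for i in range(n) for j in range(n))
print(err)                        # tiny; roughly n * 2**-mp.prec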

Fredrik Johansson
  • Thanks, but the speedup I would need is significantly more. I need the inversion step to remain a fraction of a second even for matrices of size 100-200. The hybrid 128-bit solution I posted in the comments to Stuart does the inversion in ~1/100 of a second for N=150 (while mpmath would take _much_ longer), at a gain of 3-4 decimal places of precision. That may be enough, though I wish float128 really were 128 bits. I guess I could implement the above in GMP or just in C and plug it into Python. – user2153813 Mar 11 '13 at 15:05
  • Well, I time a 200x200 matrix multiplication at 128-bit precision at 4.4 seconds in Mathematica and 3.3 seconds in Sage (full matrix inverses costing roughly the same), while one matrix multiplication with the above code only takes 2.7 seconds. I would estimate that using GMP or MPFR in C will push that down to perhaps 0.5 to 1 seconds. If you need it to happen in a fraction of a second, your best bet is probably to look into double-double or quad-double arithmetic. – Fredrik Johansson Mar 11 '13 at 17:47
2

I'm assuming that double precision is not a problem for the precision of the final result, but that for certain matrices it causes issues in the intermediate results of the inverse. In that case, let's treat the result of a normal numpy (double-precision) inverse as merely a good approximation, and then use it as the starting point for a few iterations of Newton's method to solve for the inverse.

Let A be the matrix we're trying to invert, and X be our estimate of the inverse. An iteration of Newton's method simply consists of:

X = X*(2I - AX)

For large matrices the effort to compute a few iterations of the above is almost insignificant compared with that of finding the inverse, and it could greatly improve the accuracy of your final result. Give this a try.

BTW, I is the identity matrix in the above equation.
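
A minimal sketch of that refinement (not code from the answer): take numpy's double-precision inverse as the starting guess and do one Newton iteration in mpmath. The size and the mp.dps setting are illustrative, and for large matrices the mpmath products themselves become slow, as the comments below discuss.

# Refine a double-precision inverse with one Newton step X <- X(2I - AX),
# carried out in mpmath at higher precision.
import numpy as np
from mpmath import mp, matrix, eye, mnorm

mp.dps = 30                                 # target working precision

n = 50
A_np = np.random.rand(n, n)
X_np = np.linalg.inv(A_np)                  # double-precision approximation

A = matrix(A_np.tolist())                   # promote both to mpmath matrices
X = matrix(X_np.tolist())

X = X * (2*eye(n) - A*X)                    # one Newton iteration

print(mnorm(A*X - eye(n), 1))               # residual drops well below ~1e-13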

EDIT: added code to test the precision of floating-point types.

Use this code to test the precision of a float type.

from numpy import float128   # numpy's extended-precision type (often 80-bit on x86)

# Repeatedly halve x until 1 + x is no longer distinguishable from 1;
# the counter then gives the effective number of mantissa bits.
x = float128('1.0')
wun = x
two = wun + wun
cnt = 1
while True:
    x = x / two
    y = wun + x
    if y <= wun:
        break
    cnt += 1

print('The effective number of mantissa bits is', cnt)
Stuart
  • Well, I'm not so sure. The inverses numpy produces, when multiplied by the original matrix, have zeros that are of order 1e-15 or 1e-16 (as expected for double). I think this actually isn't good enough. The linear program involves swapping out a row of the matrix and re-evaluating the cost, and what I'm finding is that at some point the change becomes so small that the cost no longer changes (even though it should). I think this is a consequence of low precision. But maybe I can use your suggestion above to speed up mpmath's computation of an inverse. – user2153813 Mar 10 '13 at 21:29
  • It might be a bit faster than mpmath's A**(-1). I just tested it now on a random 200x200 matrix. The normal floating-point numpy inv(A) was fast, but as you indicated the mpmath equivalent was very slow. I then tested just one iteration of the above algorithm using mpmath (starting from the normal inv(A) result) and it improved mp.norm(A*X - mp.eye(200)) from about 1E-12 down to 1E-18. So it seems an improvement is possible. The only problem is that, with matrices that large, even just the mpmath matrix multiplications for the above iteration are quite slow (but still faster than the mpmath inverse). – Stuart Mar 10 '13 at 22:15
  • Thanks, this is actually looking like a real possibility. The precision seems to improve very fast... even one iteration seems to get me to quite high precision. Unfortunately, as you say, mpmath multiplication is still quite slow, but maybe I can find some other library that does fast high-precision matrix multiplication. Thanks for the suggestion! – user2153813 Mar 10 '13 at 23:08
  • So I combined this idea with the built-in float128 support in numpy to get an intermediate solution. On my machine float128 seems to support a precision of about 1e-19 or 1e-20 (about 1e3 or 1e4 better than double). Unfortunately numpy.linalg does not support float128, so I can't invert in 128 bit directly. But inverting in 64 bit, then converting to 128 bit and using Newton's method as suggested above, seems to give a good compromise. For large matrices (~100 rows), when computing norm(A*x - I) I get 1e-16 (128-bit) instead of 1e-13 (64-bit answer). This is also _much_ faster than mpmath. – user2153813 Mar 11 '13 at 14:58
  • I should add that all the above is with one iteration of Newton's method; more iterations don't seem to increase the precision substantially. – user2153813 Mar 11 '13 at 15:11
  • OK, that sounds like a good compromise. :) Yes, 128-bit FP is still emulated on most hardware (no direct hardware support), but it's faster than the more general arbitrary-precision routines. – Stuart Mar 11 '13 at 17:41
  • BTW, I just checked, and it appears that under Windows numpy doesn't support float128. Under Linux it's supported, but it isn't true quad precision; it's actually only 80 bits. The good part, though, is that this is an FPU-supported format, so it's not emulated. That seems pretty consistent with what you've observed too: going from 64-bit floats to 80-bit theoretically increases the resolution from about 1e-16 to about 1e-19. – Stuart Mar 11 '13 at 18:53
  • Oh, good to know. Can you give me the ref for this? float128 seems very undocumented in numpy. Any idea if there is anything closer to real 128-bit support? – user2153813 Mar 11 '13 at 21:24
  • See my edited post. I added some very simple code to test the precision of your floating types. Give it a try. It should return n=24 for float32, n=53 for float64, n=64 for the 80-bit extended type, and n=112 if you've got a true float128. – Stuart Mar 12 '13 at 04:17
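
For completeness, a sketch (not from the thread) of the hybrid scheme described in these comments: invert in float64, promote to numpy's longdouble/float128 type (typically 80-bit extended precision rather than true quad, as noted above), and apply one Newton iteration there.

# Hybrid refinement: fast float64 inverse, then one Newton step in
# numpy's extended-precision longdouble (exposed as float128 on Linux builds).
import numpy as np

n = 150
A64 = np.random.rand(n, n)
X64 = np.linalg.inv(A64)              # 64-bit inverse (fast)

A = A64.astype(np.longdouble)         # numpy.linalg has no extended-precision inv()
X = X64.astype(np.longdouble)
I = np.eye(n, dtype=np.longdouble)

X = X @ (2*I - A @ X)                 # one Newton iteration in extended precision

print(np.abs(A @ X - I).max())        # noticeably smaller residual than float64 alone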
1

The Multiprecision Computing Toolbox for MATLAB provides the following timings using 128-bit precision (Core i7 930):

20x20 - 0.007 sec
30x30 - 0.019 sec
60x60 - 0.117 sec
200x200 - 3.2 sec

Note that these timings would be much lower on modern CPUs.