7

In a numerical solver I am working on in C, I need to invert a 2x2 matrix; the inverse then right-multiplies another matrix:

C = B . inv(A)

I have been using the following standard formula for the inverse of a 2x2 matrix:

a = A[0][0];
b = A[0][1];
c = A[1][0];
d = A[1][1];
invA[0][0] = d/(a*d-b*c);
invA[0][1] = -b/(a*d-b*c);
invA[1][0] = -c/(a*d-b*c);
invA[1][1] = a/(a*d-b*c);

In the first few iterations of my solver this seems to give correct answers; however, after a few steps the values start to grow and eventually explode.

Now, comparing to an implementation using SciPy, I found that the same math does not explode. The only difference I can find is that the SciPy code uses scipy.linalg.inv(), which uses LAPACK internally to perform the inversion.

When I replace the call to inv() with the above calculations, the Python version does explode too, so I'm pretty sure this is the problem. Small differences in the calculations are creeping in, which leads me to believe it is a numerical problem--not entirely surprising for an inversion operation.

I am using double-precision floats (64-bit), hoping that numerical issues would not be a problem, but apparently that is not the case.

But: I would like to solve this in my C code without needing to call out to a library like LAPACK, because the whole reason for porting it to pure C is to get it running on a target system. Moreover, I'd like to understand the problem, not just call out to a black box. Eventually I'd like it to run with single precision too, if possible.

So, my question is, for such a small matrix, is there a numerically more stable way to calculate the inverse of A?

Thanks.

Edit: Currently trying to figure out if I can just avoid the inversion by solving for C.

Amro
Steve
  • In what sense does it explode? Numeric overflow? What types are the matrix elements, and what range of values are they in? – fge Dec 31 '11 at 20:46
  • Answering fge's questions would help. There is also potential for division by zero here, which may cause your "explosion." – greg Dec 31 '11 at 20:49
  • Sorry if I wasn't clear, the explosion doesn't result directly from this operation, but from errors introduced by this operation into a feedback function that I haven't described here. – Steve Dec 31 '11 at 21:56

5 Answers

7

Computing the inverse via the determinant and cofactor formula is not stable. A better way is to use Gauss-Jordan elimination with partial pivoting, which you can easily work out explicitly here.

Solving a 2x2 system

Let us solve the system (use (c, f) = (1, 0) and then (c, f) = (0, 1) to get the two columns of the inverse):

a * x + b * y = c
d * x + e * y = f

In pseudocode, this reads:

if a == 0 and d == 0 then "singular"

if abs(a) >= abs(d):
    alpha <- d / a
    beta <- e - b * alpha
    if beta == 0 then "singular"
    gamma <- f - c * alpha
    y <- gamma / beta
    x <- (c - b * y) / a
else
    swap((a, b, c), (d, e, f))
    restart

This is more stable than determinant + cofactor matrix (beta is the determinant times some constant, computed in a stable way). You can work out the full-pivoting equivalent (i.e. potentially swapping x and y, so that the first division is by the largest number in magnitude amongst a, b, d, e), and this may be more stable in some circumstances, but the above method has been working well for me.
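
For reference, here is a minimal C sketch of the above pseudocode; the function name solve2x2 and the return-code convention are my own, not part of the original answer.

#include <math.h>

/* Solve the 2x2 system
 *   a * x + b * y = c
 *   d * x + e * y = f
 * with partial pivoting: divide by the larger of a and d.
 * Returns 0 on success, -1 if the matrix is singular. */
int solve2x2(double a, double b, double c,
             double d, double e, double f,
             double *x, double *y)
{
    if (fabs(d) > fabs(a)) {          /* pivot: swap the two equations */
        double t;
        t = a; a = d; d = t;
        t = b; b = e; e = t;
        t = c; c = f; f = t;
    }
    if (a == 0.0)                     /* then d was zero too: singular */
        return -1;
    double alpha = d / a;
    double beta  = e - b * alpha;     /* the determinant, scaled by 1/a */
    if (beta == 0.0)
        return -1;
    double gamma = f - c * alpha;
    *y = gamma / beta;
    *x = (c - b * (*y)) / a;
    return 0;
}

Calling it with (c, f) = (1, 0) and then with (c, f) = (0, 1) gives the two columns of inv(A), as described above.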

This is equivalent to performing an LU decomposition (store alpha, beta, a and b if you want to keep this LU decomposition).

Computing the QR decomposition can also be done explicitly (and is also very stable provided you do it correctly), but it is slower (and involves taking square roots). The choice is yours.
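
As an illustration of the QR route, a 2x2 system can be solved with a single Givens rotation; the sketch below uses my own naming and is not code from the answer.

#include <math.h>

/* Solve the same 2x2 system via QR: one Givens rotation G makes
 * G*A upper triangular; apply G to the RHS and back-substitute. */
int solve2x2_qr(double a, double b, double c,
                double d, double e, double f,
                double *x, double *y)
{
    double r = hypot(a, d);          /* the square root QR requires */
    if (r == 0.0)
        return -1;
    double cs = a / r, sn = d / r;   /* rotation that zeroes out d */
    double r12 =  cs * b + sn * e;   /* second column of G*A */
    double r22 = -sn * b + cs * e;   /* equals det(A) / r */
    double g1  =  cs * c + sn * f;   /* rotated right-hand side */
    double g2  = -sn * c + cs * f;
    if (r22 == 0.0)
        return -1;
    *y = g2 / r22;
    *x = (g1 - r12 * (*y)) / r;      /* the (1,1) entry of G*A is r */
    return 0;
}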

Improving accuracy

If you need better accuracy (the above method is stable, but there is some roundoff error, proportional to the condition number of A, i.e. roughly the ratio of its extreme eigenvalues), you can "solve for the correction".

Indeed, suppose you solved A * x = b for x with the above method. You now compute A * x, and you find that it does not quite equal b; there is a slight error:

A * x - b = db

Now, if you solve for dx in A * dx = db, you have

A * (x - dx) = b + db - db - ddb = b - ddb

where ddb is the error induced by the numerical solving of A * dx = db, which is typically much smaller than db (since db is much smaller than b).

You can iterate the above procedure, but one step is typically required to restore full machine precision.
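
In C, one refinement step might look like the following sketch, which reuses the hypothetical solve2x2 from above; accumulating the residual in higher precision, where available, makes the correction more effective.

/* One step of iterative refinement: compute the residual
 * db = A*x - b, solve A*dx = db, and subtract the correction,
 * so that A*(x - dx) = b - ddb as explained above. */
void refine2x2(double a, double b, double c,
               double d, double e, double f,
               double *x, double *y)
{
    double rx = a * (*x) + b * (*y) - c;  /* residual db, row 1 */
    double ry = d * (*x) + e * (*y) - f;  /* residual db, row 2 */
    double dx, dy;
    if (solve2x2(a, b, rx, d, e, ry, &dx, &dy) == 0) {
        *x -= dx;
        *y -= dy;
    }
}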

Alexandre C.
  • Thank you for this; it is excellent! One quick comment: For my application, the "singular" case is relevant, and checking `beta == 0` can give the wrong answer once the values exceed ~30 bits of precision. Checking `a * e == b * d` is giving better results for me. – Nemo Apr 03 '23 at 21:09
6

Your code is fine; however, it risks loss of precision from any of the four subtractions.

Consider using more advanced techniques such as that used in matfunc.py. That code performs the inversion using a QR decomposition implemented with Householder reflections. The result is further improved with iterative refinement.
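
For a 2x2 system, the Householder approach can be written out explicitly. The sketch below is my own illustration of the technique, not a translation of matfunc.py; the sign of s is chosen to avoid cancellation when forming the Householder vector.

#include <math.h>

/* Solve a*x + b*y = c, d*x + e*y = f via QR with one Householder
 * reflection H = I - 2*v*v^T / (v^T*v), which maps [a; d] to [-s; 0]. */
int solve2x2_householder(double a, double b, double c,
                         double d, double e, double f,
                         double *x, double *y)
{
    double norm = hypot(a, d);
    if (norm == 0.0)
        return -1;
    double s  = (a >= 0.0) ? norm : -norm;   /* avoids cancellation */
    double v0 = a + s, v1 = d;               /* Householder vector */
    double vv = v0 * v0 + v1 * v1;
    /* Apply H to the second column [b; e] and to the RHS [c; f]. */
    double t   = 2.0 * (v0 * b + v1 * e) / vv;
    double r12 = b - t * v0;
    double r22 = e - t * v1;
    t = 2.0 * (v0 * c + v1 * f) / vv;
    double g1 = c - t * v0;
    double g2 = f - t * v1;
    if (r22 == 0.0)
        return -1;
    *y = g2 / r22;
    *x = (g1 - r12 * (*y)) / (-s);           /* H*[a; d] = [-s; 0] */
    return 0;
}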

Raymond Hettinger
  • Yes, I figured it was due to division by a small number, but the subtractions are probably the worst element here! I'll check out that code, thank you very much. It will probably take a little while to translate to C. – Steve Dec 31 '11 at 21:22
6

Don't invert the matrix. Almost always, the thing you're using the inverse to accomplish can be done faster and more accurately without inverting the matrix. Matrix inversion is inherently unstable, and mixing that with floating point numbers is asking for trouble.

Saying C = B . inv(A) is the same as saying you want to solve C A = B for C, or equivalently (taking transposes) A^T C^T = B^T. You can accomplish this by splitting C^T and B^T into their columns, which are the rows of C and B, and solving the small system A^T c_i = b_i for each one, as in the sketch below; the solutions assemble into C.
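
Here is a minimal C sketch of that idea for one row of C; the function name solve_row and the calling convention are mine, for illustration only.

#include <math.h>

/* Compute one row of C in C = B * inv(A), where A is 2x2, without
 * forming inv(A): solve A^T * [c0; c1] = [b0; b1] with partial
 * pivoting, where (b0, b1) is the corresponding row of B. */
int solve_row(const double A[2][2], double b0, double b1,
              double *c0, double *c1)
{
    double p = A[0][0], q = A[1][0];  /* first row of A^T */
    double r = A[0][1], s = A[1][1];  /* second row of A^T */
    if (fabs(r) > fabs(p)) {          /* pivot: swap the equations */
        double t;
        t = p;  p  = r;  r  = t;
        t = q;  q  = s;  s  = t;
        t = b0; b0 = b1; b1 = t;
    }
    if (p == 0.0)
        return -1;
    double alpha = r / p;
    double beta  = s - q * alpha;
    if (beta == 0.0)
        return -1;
    *c1 = (b1 - b0 * alpha) / beta;
    *c0 = (b0 - q * (*c1)) / p;
    return 0;
}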

redxaxder
  • Seems to have done the trick, thank you! The system does seem quite a bit more stable, although I'll note that I was worried since I found that the solution still contains a division by a subtraction as in the formula for the inverse, which can be bad as mentioned by Raymond H. However, it seems to not cause the same divergence, so I suppose it is less prone to introducing errors. This makes some sense, since I am not needing to multiply by a very tiny reciprocal, but instead go directly to the final solution `C`. – Steve Jan 01 '12 at 01:02
  • I'll just post the solution here for posterity: `c00 = -(a11*b00-a10*b01) / (a01*a10-a00*a11); c01 = (a01*b00-a00*b01) / (a01*a10-a00*a11); c10 = (a10*b11-a11*b10) / (a01*a10-a00*a11); c11 = -(a00*b11-a01*b10) / (a01*a10-a00*a11); c20 = (a10*b21-a11*b20) / (a01*a10-a00*a11); c21 = -(a00*b21-a01*b20) / (a01*a10-a00*a11);` – Steve Jan 01 '12 at 01:03
  • Yikes, is there a better way to post the solution? Comments don't allow code blocks? Perhaps I should add it to redxaxder's answer? – Steve Jan 01 '12 at 01:06
1

I agree with Jean-Victor that you should probably use the Jacobi method. Here's my example (which, strictly speaking, builds the inverse by Gauss-Jordan elimination with row swapping):

import numpy as np

# Helper functions:
def check_zeros(A, I, row, col=0):
    """
    Returns (recursively) the index of the next row with a
    non-zero entry in column col, starting from row.
    """
    if A[row, col] != 0:
        return row
    if row + 1 == len(A):
        raise ValueError("The determinant is zero")
    return check_zeros(A, I, row + 1, col)

def swap_rows(M, I, row, index):
    """
    Swaps two rows in the matrix and in the identity beside it.
    """
    swap = M[row].copy()
    M[row], M[index] = M[index], swap
    swap = I[row].copy()
    I[row], I[index] = I[index], swap

# Your matrix M
M = np.array([[0, 1, 5, 2], [0, 4, 9, 23], [5, 4, 3, 5], [2, 3, 1, 5]],
             dtype=float)
I = np.identity(len(M))

M_copy = M.copy()
rows = len(M)

for i in range(rows):
    index = check_zeros(M, I, i, i)
    while index > i:
        swap_rows(M, I, i, index)
        print("swapped")
        index = check_zeros(M, I, i, i)

    # Normalize the pivot row ...
    I[i] = I[i] / M[i, i]
    M[i] = M[i] / M[i, i]

    # ... then eliminate column i from every other row.
    for j in range(rows):
        if j != i:
            I[j] = I[j] - I[i] * M[j, i]
            M[j] = M[j] - M[i] * M[j, i]

print(M)
print(I)  # The inverse matrix
moldovean
1

Use the Jacobi method, an iterative method that involves "inverting" only the main diagonal of A. This is very straightforward and less prone to numerical instability than inverting the whole matrix.
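
A minimal C sketch of the idea, with my own names; note that the plain Jacobi iteration converges only when A is strictly diagonally dominant, so it is not a drop-in replacement for a direct solve.

#include <math.h>

/* Jacobi iteration for the 2x2 system a*x + b*y = c, d*x + e*y = f.
 * Only the diagonal entries a and e are "inverted". Returns 0 on
 * convergence, -1 if max_iter steps were not enough. */
int jacobi2x2(double a, double b, double c,
              double d, double e, double f,
              double *x, double *y,
              int max_iter, double tol)
{
    double xk = 0.0, yk = 0.0;               /* initial guess */
    for (int k = 0; k < max_iter; ++k) {
        double xn = (c - b * yk) / a;        /* update from row 1 */
        double yn = (f - d * xk) / e;        /* update from row 2 */
        if (fabs(xn - xk) < tol && fabs(yn - yk) < tol) {
            *x = xn;
            *y = yn;
            return 0;
        }
        xk = xn;
        yk = yn;
    }
    return -1;
}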