
I prepare a matrix of random numbers, calculate its inverse and matrix-multiply it with the original matrix. In theory, this gives the unit matrix. How can I let numpy do that for me?

import numpy

size = 100

A = numpy.zeros((size, size))   # to be filled with random integers
E = numpy.zeros((size, size))   # identity matrix, built by hand

for i in range(size):
    for j in range(size):
        A[i, j] += numpy.random.randint(10)
        if i == j:
            E[i, j] += 1

A_inv = numpy.linalg.linalg.inv(A)
print numpy.dot(A, A_inv)

Running the code produces

[me]machine @ numeric $ python rand_diag.py 
[[  1.00000000e+00  -7.99360578e-15  -1.14491749e-16 ...,   3.81639165e-17
   -4.42701431e-15   1.17961196e-15]
[ -5.55111512e-16   1.00000000e+00  -2.22044605e-16 ...,  -3.88578059e-16
    1.33226763e-15  -8.32667268e-16]

It's evident the result is a unit matrix, but not precisely one, so an element-wise check like `print numpy.dot(A, A_inv) == E` evidently gives `False`. I'm doing this for practicing linear algebra and trying to find the matrix size at which my machine arrives at its limits. Getting a `True` would be didactically appealing.

Edit:

Setting `size = 10000`, I run out of memory:

[me]machine @ numeric $ Python(794) malloc: *** mmap(size=800002048) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Traceback (most recent call last):
  File "rand_diag.py", line 14, in <module>
    A_inv = numpy.linalg.linalg.inv(A)
  File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 445, in inv
    return wrap(solve(a, identity(a.shape[0], dtype=a.dtype)))
  File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 323, in solve
    a, b = _fastCopyAndTranspose(t, a, b)
  File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 143, in _fastCopyAndTranspose
    cast_arrays = cast_arrays + (_fastCT(a),)
MemoryError

[1]+  Exit 1                  python rand_diag.py

How can I allocate more memory and how can I run this in parallel (I have 4 cores)?

  • That's not what [others](http://stackoverflow.com/questions/3890621/matrix-multiplication-in-numpy) recommend. – TMOTTM Mar 29 '13 at 16:29
  • It is completely unclear how multiplying a matrix by its inverse helps “to find the size of the matrix for which my machine arrives at its limits”. What are you really trying to do? Do you mean memory limit, limit at which floating-point rounding errors become intolerable, execution time limit, or something else? If you want an identity matrix, use `numpy.identity(n)`. – Eric Postpischil Mar 29 '13 at 16:46
  • Getting `True` would be very didactically unappealing, because it would cause the student to confuse finite precision floating point arithmetic with infinite precision real number arithmetic. A student of computer approximations to linear algebra needs to learn to be constantly aware of that difference, and to expect its consequences. – Patricia Shanahan Mar 29 '13 at 19:49
  • Your points are both very valid, I admit. @EricPostpischil: Currently, I'm observing 100% CPU load while running the inversion of a `size=10000` matrix. Memory is still quite available; if I knew how, I would assign all four CPUs of my machine to the script and give it all the memory it needs. There's a lot I'd like to figure out more clearly here... – TMOTTM Mar 29 '13 at 19:59
  • @PatriciaShanahan: I'm currently most interested in, let's call it canonical linear algebra, so all I need are some illustrative calculations to get a picture of what's happening. Computational details are of concern later but are not to be left unattended. – TMOTTM Mar 29 '13 at 20:02
  • @TMOTTM: If this question is answered to your satisfaction, you should accept the best answer by [clicking the green checkmark next to it](http://stackoverflow.com/faq#howtoask). – askewchan Mar 30 '13 at 19:15

4 Answers


While getting `True` would be didactically appealing, it would also be divorced from the realities of floating-point computation.

When dealing with floating point, one necessarily has to be prepared not only for inexact results, but for all manner of other numerical issues that can arise.

I highly recommend reading *What Every Computer Scientist Should Know About Floating-Point Arithmetic*.

In your particular case, to ensure that `A * inv(A)` is close enough to the identity matrix, you could compute a matrix norm of `numpy.dot(A, A_inv) - E` and ensure that it is small enough.
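
For instance, a minimal sketch of that check; the `1e-8` tolerance here is an arbitrary choice, not a recommendation:

import numpy

size = 100
A = numpy.random.randint(0, 10, (size, size)).astype(float)
E = numpy.eye(size)
A_inv = numpy.linalg.inv(A)

# Frobenius norm of the residual; it is exactly zero only for a perfect inverse.
residual = numpy.linalg.norm(numpy.dot(A, A_inv) - E)
print residual < 1e-8  # True when the inverse is "close enough"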

As a side note, you don't have to use a loop to populate `A` and `E`. Instead, you could just use

A = numpy.random.randint(0, 10, (size,size))
E = numpy.eye(size)
NPE
  • Note that the matrix norm you propose is similar to the RMS value I compute in my answer - the difference being that I then divide by the size of the matrix to get a sense of relative scale. +1 for the very helpful references! – Floris Mar 29 '13 at 17:21
  • +1 for the short cut for the matrix population, I was waiting for someone to point that out. – TMOTTM Mar 29 '13 at 20:05

Agreeing with most of the points already made. However, I would suggest that rather than looking at the individual off-diagonal elements, you take their RMS sum; this reflects, in some sense, the "energy" that leaked into the off-diagonal terms as a result of imperfect calculations. If you then divide this RMS number by the sum of the diagonal terms, you get a metric of just how well the inverse worked. For example, the following code:

import numpy
import matplotlib.pyplot as plt
from numpy import mean, sqrt

N = 1000
R = numpy.zeros(N)

for size in range(50, N, 50):

    A = numpy.zeros((size, size))
    E = numpy.zeros((size, size))

    for i in range(size):
        for j in range(size):
            A[i, j] += numpy.random.randint(10)
            if i == j:
                E[i, j] = 1

    A_inv = numpy.linalg.inv(A)
    D = numpy.dot(A, A_inv) - E   # residual matrix; zero for a perfect inverse
    S = sqrt(mean(D**2))          # RMS of the residual
    R[size] = S / size
    print "size: ", size, "; rms is ", S / size

plt.plot(range(50, N, 50), R[range(50, N, 50)])
plt.ylabel('RMS fraction')
plt.show()

This shows that the RMS error is pretty stable with the size of the array, all the way up to a size of 950x950 (it does slow down a bit...). However, it's never "exact", and there are some outliers, presumably when the matrix is more nearly singular (this can happen with random matrices).

Example plot (every time you run it, it will look a bit different):

(plot: RMS fraction vs. matrix size)
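
As an aside, a quick way to flag those near-singular outliers is the condition number of `A`. This is only a sketch of the idea; the `1e12` threshold is an arbitrary choice:

import numpy

size = 500
A = numpy.random.randint(0, 10, (size, size)).astype(float)

# Ratio of the largest to smallest singular value; the larger it is,
# the more precision inv(A) loses.
c = numpy.linalg.cond(A)
print "condition number:", c
print "nearly singular:", c > 1e12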

Floris

In the end, if you want to kill off the tiny numbers, you can round your answer with

m = numpy.round(m, decimals=10)

or check whether the product is very close to the identity:

numpy.abs(A * A.I - E).mean() < 1e-10


I would implement this with the `numpy.matrix` class.

import numpy

size = 100
A = numpy.matrix(numpy.random.randint(0, 10, (size,)*2))
E = numpy.eye(size)

print A * A.I                                # close to the identity, but not exact
print numpy.abs(A * A.I - E).mean() < 1e-10  # True when the residual is tiny
askewchan
  • What's the reason for using the matrix class? – TMOTTM Mar 29 '13 at 16:33
  • It's limited, but much easier to use and read for simple 2d linear algebra, see my edit for example. – askewchan Mar 29 '13 at 16:35
  • Or, @TMOTTM, at least use arrays instead of looping through it all. You can use everything I did in my example with arrays instead of matrices if you prefer that (`np.identity` gives an array form of `np.eye`) except of course the matrix algebra at the end, but you can use `np.linalg` there, as you did. – askewchan Mar 29 '13 at 17:20

Your problem can be reduced to a common float-comparison problem. The correct way to compare such arrays would be:

EPS = 1e-8  # for example
(numpy.abs(numpy.dot(A, A_inv) - E) < EPS).all()
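
For what it's worth, `numpy.allclose` packages up this kind of tolerance check, using both a relative and an absolute tolerance (defaults `rtol=1e-05`, `atol=1e-08`):

numpy.allclose(numpy.dot(A, A_inv), E)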
shx2