2

Why inv(A)*A is not exact identity matrix? All the diagonal elements are correct but rest are not. I learnt that this is residual error, then how to deal with it?

CODE:

A = [1,2,0;0,5,6;7,0,9]
A_inv = inv(A)
A_invA = inv(A)*A

OUTPUT:

code

  • 3
    You just have to know that there are rounding errors in all computations. You cannot do equality comparisons without taking some tolerance into account. See https://stackoverflow.com/q/686439/7328782 – Cris Luengo Jun 04 '21 at 12:50
  • 3
    Please post code as (formatted) text, not as an image – Luis Mendo Jun 04 '21 at 13:44
  • @LuisMendo I have edited the question and included both. Is it fine or should I stick to code only? – Rishabh Kumar Singh Jun 04 '21 at 16:00
  • It's better to use text for code, because that way we can copy-paste and reproduce your results. The image is usually not needed, unless you want to call attention to a specific aspect of the displayed output. In this case you can leave the image to lillustrate the output, or paste the output as code-formatted text – Luis Mendo Jun 04 '21 at 16:25
  • If you used more precision for the output you might well see that the diagonal elements are not precisely 1.0 – dmuir Jun 05 '21 at 13:40
  • @dmuir I understand now. Thank you. – Rishabh Kumar Singh Jun 06 '21 at 12:25

1 Answers1

3

An exploration of the documentation of inv leads you down the following path, which answers your question nicely (emphases mine):

octave:1> help inv
'inv' is a built-in function from the file libinterp/corefcn/inv.cc

 -- X = inv (A)
 -- [X, RCOND] = inv (A)
 -- [...] = inverse (...)
     Compute the inverse of the square matrix A.

     Return an estimate of the reciprocal condition number if requested,
     otherwise warn of an ill-conditioned matrix if the reciprocal
     condition number is small.

     In general it is best to avoid calculating the inverse of a matrix
     directly.  For example, it is both faster and more accurate to
     solve systems of equations (A*x = b) with 'Y = A \ b', rather than
     'Y = inv (A) * b'.

In your particular case, you will see that:

A = [1,2,0;0,5,6;7,0,9];
[X, RCOND] = inv(A);
RCOND
% RCOND = 0.070492

So, what does this value mean? You can find the answer in the relevant function rcond, which calculates this value directly:

octave:2> help rcond
'rcond' is a built-in function from the file libinterp/corefcn/rcond.cc

 -- C = rcond (A)
     Compute the 1-norm estimate of the reciprocal condition number as
     returned by LAPACK.

     If the matrix is well-conditioned then C will be near 1 and if the
     matrix is poorly conditioned it will be close to 0.

     [...]

     See also: cond, condest.

Your value is 0.07, which is quite close to 0, therefore your A matrix is rather poorly conditioned.

To learn a bit more what "poorly conditioned" means exactly, we can have a look at the cond function:

octave:26> help cond
'cond' is a function from the file /opt/octave-6.2.0/share/octave/6.2.0/m/linear-algebra/cond.m

 -- cond (A)
 -- cond (A, P)
     Compute the P-norm condition number of a matrix with respect to
     inversion.

     'cond (A)' is defined as 'norm (A, P) * norm (inv (A), P)'.

     [...]

     The condition number of a matrix quantifies the sensitivity of the
     matrix inversion operation when small changes are made to matrix
     elements.  Ideally the condition number will be close to 1.  When
     the number is large this indicates small changes (such as underflow
     or round-off error) will produce large changes in the resulting
     output.  In such cases the solution results from numerical
     computing are not likely to be accurate.

In your case:

cond(A,2)
% ans = 7.080943875445246

So there you have it. Your matrix is relatively poorly-conditioned, which means that its inversion is more susceptible to precision errors. You may get better results if you use the mldivide (i.e. \ operator) instead.

Tasos Papastylianou
  • 21,371
  • 2
  • 28
  • 57
  • Thank you. I got most of it. I just want to ask: 1. Why this `mldivide` is not facing same issues? 2. Is there going to be any difference if I use `pinv` instead of `inv` or same problem of poorly-conditioned matrix? 3. What is actually meant by `poorly-conditioned matrix`? – Rishabh Kumar Singh Jun 14 '21 at 08:24
  • 1
    @RishabhKumarSingh Let's start in reverse order. 3) To rephrase what is already present in the answer: A matrix `A` with inverse `B` is poorly conditioned if `A` has the property that small changes (or 'perturbations') to it produce large changes in `B`. If the perturbations are of the order of machine precision, this means that you are reaching the limit at which the machine can represent small changes, and therefore the 'true' inverse, which is susceptible to these small changes, may be quite different from the one calculated at the chosen precision. – Tasos Papastylianou Jun 14 '21 at 15:53
  • 1
    2) I am not 100% sure if precision issues affect pinv in the same manner, but I would assume the answer is yes, since the issue is unlikely to relate to implementation differences, and more to do with underlying precision of your cpu. – Tasos Papastylianou Jun 14 '21 at 15:55
  • 1
    1) It's not that `mldivide` is not susceptible to precision issues in general. What octave is telling you above is that if you calculate inv first (which can result in large errors when dealing with poorly conditioned matrices), and _then_ use this result to solve a system of equations, you are 'propagating' that large error to the second calculation, whereas `mldivide` uses a different technique to solve the system of equations, which does not rely on the calculation of an inverse in the same manner, and therefore avoids introducing such large errors to the result. – Tasos Papastylianou Jun 14 '21 at 15:59
  • 1
    Thank you so much. Regarding point 2, I am actually attending ML course by Andrew Ng. And there was a point where he said: `When implementing the normal equation in octave we want to use the 'pinv' function rather than 'inv.' The 'pinv' function will give you a value of θ even if (X'*X) is not invertible.` where `θ = pinv(X'*X) * X' * y`. – Rishabh Kumar Singh Jun 15 '21 at 05:33
  • 1
    @RishabhKumarSingh indeed! But this is a mathematical property of of the pinv, it has nothing to do with precision. If inv exists, then pinv will necessarily be equal to inv. And therefore, if inv cannot be derived properly at some machine precision, then I assume neither will pinv. However, if inv does not exist, pinv will still give a result: this is because mathematically what pinv achieves is akin to a 'least squares minimization' process: i.e. the resulting matrix minimizes some sort of distance. If an inv exists, that distance is minimized to exactly zero and the result is equal to inv. – Tasos Papastylianou Jun 15 '21 at 09:39
  • I am marking this answer as accepted. Thank you for all the help. I am grateful. – Rishabh Kumar Singh Jun 16 '21 at 05:22