
Suppose we want to calculate (a + b)^2 in two different ways, that is

  1. (a + b) * (a + b)

  2. a^2 + 2*a*b + b^2

Now, suppose a = 1.4 and b = -2.7. If we plug these two numbers into the formulas with format long, we obtain 1.690000000000001 in both cases. That is, if I run the following script:

a = 1.4; 
b = -2.7;

format long

r = (a + b) * (a + b)

r2 = a^2 + 2*a*b + b^2

abs_diff = abs(r - r2)

I obtain

r = 1.690000000000001

r2 = 1.690000000000001

abs_diff = 6.661338147750939e-16

What's going on here? I could have expected different results for r and r2 (because MATLAB would be executing different floating-point operations), but not a nonzero absolute difference between two numbers that print identically.

I also noticed that the relative errors of r and r2 are different, that is, if I do

rel_err1 = abs(1.69 - r) / 1.69
rel_err2 = abs(1.69 - r2) / 1.69

I obtain

rel_err1 = 3.941620205769786e-16

rel_err2 = 7.883240411539573e-16

This only makes me think that r and r2 are not actually the same number. Is there a way to see their full values, then, if they really are different? If not, what's happening?

Also, neither relative error is less than eps / 2; does this mean that an overflow has happened? If so, where?


Note: This is a specific case. I understood that we're dealing with floating-point numbers and rounding errors. But I would like to understand them better by going through this example.
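As a sketch of the same computation outside MATLAB: Python floats are also IEEE 754 binary64, so the two expressions can be reproduced bit-for-bit (with `a^2` written as `a*a`), printed with 17 significant digits, and compared via their raw bit patterns:

```python
import struct

# Python floats are IEEE 754 binary64, the same format as MATLAB doubles,
# so this reproduces the two MATLAB computations exactly.
a, b = 1.4, -2.7

r = (a + b) * (a + b)
r2 = a*a + 2*a*b + b*b   # mirrors a^2 + 2*a*b + b^2

# 17 significant decimal digits distinguish any two distinct doubles;
# format long's 16 digits are one digit short.
print(f"{r:.17g}")    # 1.6900000000000006
print(f"{r2:.17g}")   # 1.6900000000000013

# Raw bit patterns (the analogue of MATLAB's num2hex): they differ.
print(struct.pack('>d', r).hex())    # 3ffb0a3d70a3d70d
print(struct.pack('>d', r2).hex())   # 3ffb0a3d70a3d710
```

At 17 significant digits the two values clearly differ, even though both round to 1.690000000000001 at 16 digits.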

nbro
  • It's likely due to the fact that floating-point errors depend on the *order*, *type*, and *number* of operations. Your two methods involve different numbers and types of operations. The one with [more additions theoretically introduces](http://floating-point-gui.de/errors/propagation/) more floating-point error. Also, more operations cause more propagation of the error. – Suever Mar 18 '16 at 17:58
  • @Suever My intuition was that since both methods are actually implemented differently, i.e. using different floating-point operations, then we would obtain different results, and this would explain that their difference is different from 0 (anyway). But Matlab apparently represents `r` and `r2` in the same way, so their difference should be 0... – nbro Mar 18 '16 at 18:02
  • what in your post demonstrates that MATLAB represents `r` and `r2` in the same way? – Suever Mar 18 '16 at 18:04
  • @Suever The fact that `r` and `r2` are both represented as `1.690000000000001` – nbro Mar 18 '16 at 18:05
  • Simple answer: your two numbers are *not* equal. The long format is only giving you 16 significant digits, which isn't enough to distinguish these two numbers. Their exact values are `1.6900000000000006128431095930864103138446807861328125` and `1.690000000000001278976924368180334568023681640625`, both of which are exactly representable as IEEE 754 binary floats, and both of which display as `1.690000000000001` when rounded to 16 significant (decimal) digits. – Mark Dickinson Mar 18 '16 at 18:05
  • @MarkDickinson Thanks, you went to the point. But is there a way in Matlab to show more digits? Because indeed it's misleading... – nbro Mar 18 '16 at 18:06
  • @nbro: No idea, I'm afraid. I'm not a MATLAB user, just a floating-point enthusiast. – Mark Dickinson Mar 18 '16 at 18:09
  • `format hex` will show you the actual binary representation of the numbers. In this case, they are 3ffb0a3d70a3d70d and 3ffb0a3d70a3d710 for r and r2 respectively. – Kyler Brown Mar 18 '16 at 18:14
  • @MarkDickinson Pretty much the same way you would anywhere else: `fprintf("%.60f\n",r)` – beaker Mar 18 '16 at 18:15
  • @beaker: Good to know; thanks. In that case, a format of `"%.17g"` should be enough to distinguish any two distinct IEEE 754 doubles. 16 significant digits is *almost* enough, but it's one digit short for the purposes of giving a faithful representation. – Mark Dickinson Mar 18 '16 at 18:17
  • @MarkDickinson Also good to know. Thanks :) – beaker Mar 18 '16 at 18:19
  • @beaker Thanks, and feel free to add it as an answer, 'cause it's more visible also for future users ;) – nbro Mar 18 '16 at 18:19
  • nbro, you could write the answer and incorporate @KylerBrown's suggestion as well. – beaker Mar 18 '16 at 18:20
  • @beaker Ok, maybe I will give an answer that combines what you guys have been saying, if nobody else decides to do it ;) – nbro Mar 18 '16 at 18:22
  • To track floating-point issues, `format long` is not sufficient. You must print 17 digits: http://stackoverflow.com/a/35626253/2732801. Maybe that helps to understand the behaviour. – Daniel Mar 18 '16 at 18:24
  • @MarkDickinson How would you explain the fact that the absolute error (for both `r` and `r2`) is greater than eps / 2 (that is the machine precision divided by 2)? – nbro Mar 18 '16 at 20:04
  • @nbro: Why shouldn't it be? Every arithmetic operation you perform might introduce some error, and the computation of `r` (or `r2`) involves several operations: the errors accumulate. And the bound on the error for a single operation is more easily expressed in terms of relative error rather than absolute; even then, it's not as simple as `eps / 2`: you need to take the quantization of the binary format into account. If you want to do some reading, you might try the [Handbook of Floating Point Arithmetic](http://www.springer.com/jp/book/9780817647049). – Mark Dickinson Mar 18 '16 at 20:10
  • @MarkDickinson Ok, thanks! I don't think I will have the time soon, but maybe in the future I will have a look at it ;) – nbro Mar 18 '16 at 20:23

2 Answers


Don't rely on the output of `format long` to conclude that two numbers are equal...

a = 1.4;
b = -2.7;
r1 = (a + b) * (a + b);
r2 = a^2 + 2*a*b + b^2;
r3 = (a+b)^2;

Instead you can check their hex representation using:

>> num2hex([r1 r2 r3])
ans =
3ffb0a3d70a3d70d
3ffb0a3d70a3d710
3ffb0a3d70a3d70d

or the printf family of functions:

>> fprintf('%bx\n', r1, r2, r3)
3ffb0a3d70a3d70d
3ffb0a3d70a3d710
3ffb0a3d70a3d70d

or even:

>> format hex
>> disp([r1; r2; r3])
   3ffb0a3d70a3d70d
   3ffb0a3d70a3d710
   3ffb0a3d70a3d70d
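As a hedged cross-check outside MATLAB: Python floats are the same IEEE 754 doubles, and `decimal.Decimal` can print the exact value behind each bit pattern (the conversion from float is performed without any rounding), which makes the mismatch plain:

```python
from decimal import Decimal

# Decimal(float) is exact: it shows the full decimal expansion of the
# double each expression actually produced.
a, b = 1.4, -2.7
r1 = (a + b) * (a + b)
r2 = a*a + 2*a*b + b*b   # mirrors a^2 + 2*a*b + b^2

print(Decimal(r1))  # 1.6900000000000006128431095930864103138446807861328125
print(Decimal(r2))  # 1.690000000000001278976924368180334568023681640625
```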
Amro
  • What about the suggestion of using `fprintf` to print a floating-point number with as many decimal digits as we want, e.g., `fprintf('%.60f', my_float);`? – nbro Mar 19 '16 at 01:22
  • I would always trust the hex representation more because that's the actual 1's and 0's stored, no rounding involved... Not to mention, how far will you go with `fprintf` if `my_float = realmin` for example? – Amro Mar 19 '16 at 01:23

Floating-point arithmetic is not associative (nor distributive).

While these two expressions are mathematically equal, they are not equal in floating-point arithmetic.

r = (a + b) * (a + b)

r2 = a^2 + 2*a*b + b^2

The order in which operations are executed matters in floating-point arithmetic. That is why, when you do floating-point maths, you need to be very careful about the order of your multiplications/divisions, especially when combining very big numbers with really small numbers.
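As a minimal illustration (sketched in Python, whose floats are the same IEEE 754 doubles), even plain addition depends on grouping:

```python
# Non-associativity of floating-point addition: the grouping changes the result.
x = (0.1 + 0.2) + 0.3   # 0.1 + 0.2 rounds to 0.30000000000000004 first
y = 0.1 + (0.2 + 0.3)   # 0.2 + 0.3 happens to be exactly 0.5

print(x == y)  # False
print(x)       # 0.6000000000000001
print(y)       # 0.6
```

The same three numbers, added in a different order, land on two different doubles, which is exactly what happens with the two expansions of (a + b)^2.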

Ander Biguri
  • Ok, but if `r` and `r2` are apparently represented in memory by the same number, their absolute difference should be equal to zero... – nbro Mar 18 '16 at 18:04