71

I am writing a program in which I need to delete duplicate points stored in a matrix. The problem is that when I check whether those points are in the matrix, MATLAB doesn't recognize them, even though they are there.

In the following code, the intersections function returns the intersection points:

[points(:,1), points(:,2)] = intersections(...
    obj.modifiedVGVertices(1,:), obj.modifiedVGVertices(2,:), ...
    [vertex1(1) vertex2(1)], [vertex1(2) vertex2(2)]);

The result:

>> points
points =
   12.0000   15.0000
   33.0000   24.0000
   33.0000   24.0000

>> vertex1
vertex1 =
    12
    15

>> vertex2    
vertex2 =
    33
    24

The two points (vertex1 and vertex2) should be eliminated from the result. The following commands should do that:

points = points((points(:,1) ~= vertex1(1)) | (points(:,2) ~= vertex1(2)), :);
points = points((points(:,1) ~= vertex2(1)) | (points(:,2) ~= vertex2(2)), :);

After doing that, we have this unexpected outcome:

>> points
points =
   33.0000   24.0000

The outcome should be an empty matrix. As you can see, one of the two [33.0000 24.0000] rows has been eliminated, but not the other.

Then I checked these two expressions:

>> points(1) ~= vertex2(1)
ans =
     0
>> points(2) ~= vertex2(2)
ans =
     1   % <-- It means 24.0000 is not equal to 24.0000?

What is the problem?


More surprisingly, I made a new script that has only these commands:

points = [12.0000   15.0000
          33.0000   24.0000
          33.0000   24.0000];

vertex1 = [12 ;  15];
vertex2 = [33 ;  24];

points = points((points(:,1) ~= vertex1(1)) | (points(:,2) ~= vertex1(2)), :);
points = points((points(:,1) ~= vertex2(1)) | (points(:,2) ~= vertex2(2)), :);

The result is as expected:

>> points
points =  
   Empty matrix: 0-by-2
rayryeng
Kamran Bigdely
  • 2
    @Kamran: Sorry I didn't point out the perils of floating point comparison when you asked about comparing values in your other question. It didn't immediately occur to me you might run into that problem. – gnovice Mar 26 '09 at 16:43
  • 1
    This has also been addressed [here](http://stackoverflow.com/questions/590822/dealing-with-accuracy-problems-in-floating-point-numbers/591120#591120) – ChrisF Mar 26 '09 at 16:28
  • 2
    As a side note, compare `1.2 - 0.2 - 1 == 0` and `1.2 - 1 - 0.2 == 0`. Surprising, isn't it? When you're dealing with floating-point numbers, the order of operations matters. – jub0bs Oct 12 '14 at 12:51
  • @TickTock - Your new title was not helpful at all. I apologize, but I've rolled back your edits... not to mention that the grammar was slightly poor... no offense. – rayryeng Aug 18 '16 at 07:22
  • @rayryeng With respect to your perspective, we should look at this through the eyes of someone searching on Google. They might use keywords like "mathematical" or "floating numbers", but they would practically never search for 24.0000! That was the reasoning behind my edit, and I think it is valid (leaving aside any grammatical errors I may have made). I would like you to edit the title so that it reads well and is easy to find via search engines, given how people phrase their questions when they google them. Thanks for the mention. Best regards – Seyfi Aug 18 '16 at 22:26
  • 1
    @Tick Tock: As the author of the question, I could not even understand the title you chose for my question. Also it did not reflect the fact that MATLAB does not show the entire floating point part of the number when you print out the variable. – Kamran Bigdely Aug 18 '16 at 22:34
  • @kami Suppose someone runs into a similar issue. Ask yourself: `How would he/she be directed to your question?` The keywords in your question are 24.0000 (which practically no one would search for), "not equal", and "Matlab". I only changed your title: 24.0000 and 24.0000 are mathematically equal, and anyone you ask will say so, BUT in the computer world they are often not equal (in computer science, any number with a point in it, like 1.2, 3.65, or 24.0000, is called a `floating point number`). So I combined these terms and also kept your "24.0000". – Seyfi Aug 19 '16 at 11:27
  • @Tick Tock : you are right about the fact that the title is not google - friendly. Maybe we should change it to something like "why isn't a number equal to itself in matlab" or something like that. – Kamran Bigdely Aug 19 '16 at 14:54
  • @kami "Why isn't a FLOATING POINT number equal ..." As you know, integers are equal. – Seyfi Aug 19 '16 at 15:42
  • Please leave the title intact. All of us who regularly answer questions in MATLAB use the current title as a means of searching for this question. No changes are necessary. Thank you for considering it though kami. – rayryeng Aug 19 '16 at 22:58
  • Possible duplicate of [Best Practice for Float Comparison in Matlab](http://stackoverflow.com/questions/23824577/best-practice-for-float-comparison-in-matlab) – m7913d May 04 '17 at 15:20
  • @m7913d: This question was asked more than 8 years ago but that question was asked two years ago. So you should put the duplicate note on that question not this one! – Kamran Bigdely May 04 '17 at 16:06
  • @kami I know, but I think it may be useful for people with the same problem to show that both questions are related. The flag was _not_ to blame you, but to help other people. – m7913d May 04 '17 at 16:16
  • 1
    @m7913d, I see. but usually they put the 'duplicate' label on the newer question. Please read the rules for duplicate label: https://meta.stackexchange.com/questions/10841/how-should-duplicate-questions-be-handled – Kamran Bigdely May 04 '17 at 17:28
  • 1
    @m7913d: from duplicate rule explanation: "Usually a recent question will be closed as a duplicate of an older question." – Kamran Bigdely May 04 '17 at 17:31

6 Answers

100

The problem you're having relates to how floating-point numbers are represented on a computer. A more detailed discussion of floating-point representations appears towards the end of my answer (the "Floating-point representation" section). The TL;DR version: because computers have finite amounts of memory, numbers can only be represented with finite precision. Thus, the accuracy of floating-point numbers is limited to a certain number of significant digits (about 16 for double-precision values, the default used in MATLAB).

Actual vs. displayed precision

Now to address the specific example in the question: while 24.0000 and 24.0000 are displayed in the same manner, it turns out that they actually differ by a very small amount in this case. You don't see it because MATLAB only displays 4 digits after the decimal point by default, keeping the overall display neat and tidy. If you want to see the full precision, you should either issue the format long command or view a hexadecimal representation of the number:

>> pi
ans =
    3.1416
>> format long
>> pi
ans =
   3.141592653589793
>> num2hex(pi)
ans =
400921fb54442d18
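
The same effect can be reproduced for the value in the question. The following is only a sketch: it assumes the stored value is one representable step below 24 (consistent with the 23.999999999999996 reported in the comments on this answer), and the variable names are made up for illustration.

```matlab
% Assumption: x is the double just below 24, built here with eps for illustration.
x = 24 - eps(24);

short_text = sprintf('%.4f', x);    % four decimals, like the default display
full_text  = sprintf('%.17g', x);   % 17 significant digits distinguish any two doubles

disp(short_text)
disp(full_text)
disp(x == 24)                       % exact comparison against 24
```

The short rendering hides the difference that the long rendering and the exact comparison reveal.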

Initialized values vs. computed values

Since there are only a finite number of values that can be represented for a floating-point number, it's possible for a computation to result in a value that falls between two of these representations. In such a case, the result has to be rounded off to one of them. This introduces a small machine-precision error. This also means that initializing a value directly or by some computation can give slightly different results. For example, the value 0.1 doesn't have an exact floating-point representation (i.e. it gets slightly rounded off), and so you end up with counter-intuitive results like this due to the way round-off errors accumulate:

>> a=sum([0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1]);  % Sum 10 0.1s
>> b=1;                                               % Initialize to 1
>> a == b
ans =
  logical
   0                % They are unequal!
>> num2hex(a)       % Let's check their hex representation to confirm
ans =
3fefffffffffffff
>> num2hex(b)
ans =
3ff0000000000000
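
To get a feel for how small this discrepancy is, one option is to measure it in units of the local spacing between doubles. This is a rough sketch, not part of the original example; it uses an explicit loop so the additions happen strictly left to right, and the names gap_in_ulps, is_exact and is_close are illustrative.

```matlab
% Accumulate 0.1 ten times with an explicit loop (left-to-right addition).
a = 0;
for k = 1:10
    a = a + 0.1;
end
b = 1;

gap_in_ulps = (b - a) / eps(a);   % distance measured in spacing units at a
is_exact    = (a == b);           % exact equality
is_close    = abs(a - b) <= eps(b);   % equality within one spacing unit at b

disp([gap_in_ulps, is_exact, is_close])
```

Here the two values turn out to be adjacent doubles, which is why the hex representations above differ only in the last digits.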

How to correctly handle floating-point comparisons

Since floating-point values can differ by very small amounts, any comparisons should be done by checking that the values are within some range (i.e. tolerance) of one another, as opposed to exactly equal to each other. For example:

a = 24;
b = 24.000001;
tolerance = 0.001;
if abs(a-b) < tolerance, disp('Equal!'); end

will display "Equal!".

You could then change your code to something like the following (and likewise for vertex2):

points = points((abs(points(:,1)-vertex1(1)) > tolerance) | ...
                (abs(points(:,2)-vertex1(2)) > tolerance),:);
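
Putting this together for both vertices, a minimal sketch might look like the one below. The data is hypothetical: the second row is deliberately built one representable step below 24 to mimic the question, and the tolerance value 1e-9 is an assumption that should be chosen to suit the scale of your coordinates.

```matlab
% Hypothetical data: the second row is one representable step below 24.
points  = [12 15; 33 24 - eps(24); 33 24];
vertex1 = [12; 15];
vertex2 = [33; 24];
tolerance = 1e-9;   % assumed value; pick it from the scale of your data

% Exact comparison leaves the near-duplicate row behind.
exact_left = points((points(:,1) ~= vertex2(1)) | (points(:,2) ~= vertex2(2)), :);

% Tolerance-based removal of every row close to either vertex.
for v = [vertex1, vertex2]          % each iteration gets one vertex as a column
    keep = (abs(points(:,1) - v(1)) > tolerance) | ...
           (abs(points(:,2) - v(2)) > tolerance);
    points = points(keep, :);
end

disp(size(exact_left))
disp(size(points))
```

The loop keeps a row only if at least one coordinate is farther than the tolerance from the vertex, which mirrors the structure of the original `~=` test.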

Floating-point representation

A good overview of floating-point numbers (and specifically the IEEE 754 standard for floating-point arithmetic) is What Every Computer Scientist Should Know About Floating-Point Arithmetic by David Goldberg.

A binary floating-point number is actually represented by three integers: a sign bit s, a significand (or coefficient/fraction) b, and an exponent e. For double-precision floating-point format, each number is represented by 64 bits laid out in memory as follows:

[Figure: bit layout of a 64-bit double: 1 sign bit, 11 exponent bits, 52 fraction bits]

The real value can then be found with the following formula:

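value = (-1)^s × (1 + b/2^52) × 2^(e - 1023)

(for normalized numbers, where b is the 52-bit fraction read as an integer and e is the 11-bit stored exponent)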

This format can represent normalized magnitudes from roughly 10^-308 to 10^308. In MATLAB you can get these limits from realmin and realmax:

>> realmin
ans =
    2.225073858507201e-308
>> realmax
ans =
    1.797693134862316e+308

Since there are a finite number of bits used to represent a floating-point number, there are only so many finite numbers that can be represented within the above given range. Computations will often result in a value that doesn't exactly match one of these finite representations, so the values must be rounded off. These machine-precision errors make themselves evident in different ways, as discussed in the above examples.
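
To make the sign/exponent/fraction layout concrete, here is a small sketch that takes the hex string of pi apart and rebuilds the value with the standard double-precision formula (-1)^s × (1 + b/2^52) × 2^(e - 1023). It assumes a normalized number; zeros, denormals, Inf and NaN would need special handling that is not shown. The variable names are illustrative.

```matlab
h   = num2hex(pi);            % 16 hex digits = 64 bits
top = hex2dec(h(1:3));        % first 12 bits: sign bit plus 11 exponent bits
s   = floor(top / 2048);      % sign bit
e   = mod(top, 2048);         % stored (biased) exponent
b   = hex2dec(h(4:16));       % 52-bit fraction read as an integer

% Rebuild the value (valid for normalized numbers only).
value = (-1)^s * (1 + b / 2^52) * 2^(e - 1023);

disp([s, e])
disp(value == pi)
```

Because every step here involves only exactly representable quantities, the rebuilt value matches the original bit for bit.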

In order to better understand these round-off errors, it's useful to look at the relative floating-point accuracy provided by the function eps, which quantifies the distance from a given number to the next larger floating-point representation:

>> eps(1)
ans =
     2.220446049250313e-16
>> eps(1000)
ans =
     1.136868377216160e-13

Notice that the precision is relative to the size of a given number being represented; larger numbers will have larger distances between floating-point representations, and will thus have fewer digits of precision following the decimal point. This can be an important consideration with some calculations. Consider the following example:

>> format long              % Display full precision
>> x = rand(1, 10);         % Get 10 random values between 0 and 1
>> a = mean(x)              % Take the mean
a =
   0.587307428244141
>> b = mean(x+10000)-10000  % Take the mean at a different scale, then shift back
b =
   0.587307428244458

Note that when we shift the values of x from the range [0 1] to the range [10000 10001], compute a mean, then subtract the offset again for comparison, we get a value that differs in the last 3 significant digits. This illustrates how an offset or scaling of data can change the accuracy of calculations performed on it, which is something that has to be accounted for with certain problems.
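
Since the spacing between doubles grows with magnitude, a fixed absolute tolerance that works near 24 can be too strict for large values. One possible alternative, sketched here as an assumption rather than a recommendation from the original answer, is to scale the tolerance with eps. The helper is_close and the factor k = 4 are hypothetical choices.

```matlab
% Hypothetical helper: true when a and b are within k spacing units of each other.
is_close = @(a, b, k) abs(a - b) <= k * eps(max(abs(a), abs(b)));

near_24   = is_close(24 - eps(24), 24, 4);          % adjacent doubles near 24
far_24    = is_close(24, 24.000001, 4);             % genuinely different values
big       = 1e10;
near_big  = is_close(big, big + eps(big), 4);       % adjacent doubles near 1e10
fixed_big = abs(big - (big + eps(big))) < 1e-9;     % fixed absolute tolerance

disp([near_24, far_24, near_big, fixed_big])
```

A purely relative test like this becomes very strict near zero, so in practice it is often combined with a small absolute tolerance.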

Kamran Bigdely
gnovice
  • why can't I see that small decimal amount? – Kamran Bigdely Mar 26 '09 at 16:18
  • 2
    you can see it if you view the variable in the matrix view. Right click on variable -> "View selection" or something? I don't have MATLAB here, so I can't check. – atsjoo Mar 26 '09 at 16:20
  • 5
    You can also see small differences by typing "format long" at the command prompt. – gnovice Mar 26 '09 at 16:23
  • matlab has about 16 digits of precision... only displays 5 unless you do the above – jle Mar 26 '09 at 16:26
  • 2
    You are right. With format long: points = [12.000000000000000 15.000000000000000; 33.000000000000000 23.999999999999996; 33.000000000000000 24.000000000000000] – Kamran Bigdely Mar 26 '09 at 20:02
  • 7
    "format hex" can sometimes help even more than format long here. – Sam Roberts Oct 05 '09 at 15:25
  • It may be useful to provide a link to [Best Practice for Float Comparison in Matlab](http://stackoverflow.com/questions/23824577/best-practice-for-float-comparison-in-matlab). – m7913d May 04 '17 at 15:19
23

Look at this article: The Perils of Floating Point. Though its examples are in FORTRAN, it applies to virtually any modern programming language, including MATLAB. Your problem (and the solution for it) is described in the "Safe Comparisons" section.

Rorick
  • 1
    I discovered it some time ago and was very impressed with it =) Now I always recommend it in similar situations. – Rorick Mar 27 '09 at 08:26
  • [Archived version](https://web.archive.org/web/20180712123630/http://www.lahey.com/float.htm) of this excellent resource! – wizclown Jul 12 '18 at 12:37
13

Type

format long g

This command makes MATLAB display about 15 significant digits instead of the short default. The two values are likely to be something like 24.00000021321 != 24.00000123124

KitsuneYMG
7

Try writing

0.1 + 0.1 + 0.1 == 0.3

Warning: You might be surprised about the result!

Andrey Rubshtein
  • I tried it and it returns 0. But I don't see what it has to do with the problem above. Can you please explain it to me? – Max Sep 16 '15 at 08:46
  • 6
    This is because 0.1 comes with some floating point error, and when you add three such terms together, the errors do not necessarily add up to 0. The same issue is causing (floating) 24 to not be exactly equal to (another floating) 24. – Derek Mar 04 '16 at 11:14
2

Maybe the two numbers are really 24.0 and 24.000000001 but you're not seeing all the decimal places.

Jimmy J
1

Check out the MATLAB eps function.

MATLAB uses double-precision floating-point math with about 16 significant digits of precision (only 5 are displayed by default).

Dima Chubarov
jle