124

I'm currently writing some code where I have something along the lines of:

double a = SomeCalculation1();
double b = SomeCalculation2();

if (a < b)
    DoSomething2();
else if (a > b)
    DoSomething3();

And then in other places I may need to do equality:

double a = SomeCalculation3();
double b = SomeCalculation4();

if (a == 0.0)
   DoSomethingUseful(1 / a);
if (b == 0.0)
   return 0; // or something else here

In short, I have lots of floating point math going on and I need to do various comparisons for conditions. I can't convert it to integer math because such a thing is meaningless in this context.

I've read before that floating point comparisons can be unreliable, since you can have things like this going on:

double a = 1.0 / 3.0;
double b = a + a + a;
if ((3 * a) != b)
    Console.WriteLine("Oh no!");

In short, I'd like to know: How can I reliably compare floating point numbers (less than, greater than, equality)?

The number range I am using is roughly from 10E-14 to 10E6, so I do need to work with small numbers as well as large.

I've tagged this as language agnostic because I'm interested in how I can accomplish this no matter what language I'm using.

Mike Bailey
  • There is no way to do this reliably when using floating point numbers. There will always be numbers that for the computer are equal though in reality are not (say 1E+100, 1E+100+1), and you will also usually have calculation results that to the computer are not equal though in reality are (see one of the comments to nelhage's answer). You will have to choose which of the two you desire less. – toochin Feb 06 '11 at 20:21
  • On the other hand, if you, say, only deal with rational numbers, you might implement some rational number arithmetic based on integer numbers and then two numbers are considered equal if one of the two numbers can be cancelled down to the other one. – toochin Feb 06 '11 at 20:24
  • Well, currently I'm working a simulation. The place I'm usually doing these comparisons is related to variable time steps (for solving some ode). There's a few instances where I need to check if the given time step for one object is equal to, less than, or greater than another object's time step. – Mike Bailey Feb 06 '11 at 20:27
  • [Most effective way for float and double comparison](http://stackoverflow.com/q/17333/995714) – phuclv Apr 03 '16 at 09:40

12 Answers

104

TL;DR

  • Use the following function instead of the currently accepted solution to avoid some undesirable results in certain limit cases, while being potentially more efficient.
  • Know the expected imprecision of your numbers and feed it into the comparison function accordingly.
#include <algorithm>  // std::min, std::max
#include <cassert>
#include <cfloat>     // FLT_EPSILON, FLT_MIN
#include <cmath>      // std::abs
#include <limits>     // std::numeric_limits

bool nearly_equal(
  float a, float b,
  float epsilon = 128 * FLT_EPSILON, float abs_th = FLT_MIN)
  // those defaults are arbitrary and could be removed
{
  assert(std::numeric_limits<float>::epsilon() <= epsilon);
  assert(epsilon < 1.f);

  if (a == b) return true;

  auto diff = std::abs(a - b);
  auto norm = std::min(std::abs(a) + std::abs(b), std::numeric_limits<float>::max());
  // or even faster: std::min(std::abs(a + b), std::numeric_limits<float>::max());
  // keeping this commented out until I update figures below
  return diff < std::max(abs_th, epsilon * norm);
}

Graphics, please?

When comparing floating point numbers, there are two "modes".

The first one is the relative mode, where the difference between x and y is considered relative to their combined amplitude |x| + |y|; that is, x and y compare equal when |x - y| < epsilon * (|x| + |y|). When plotted in 2D, it gives the following profile, where green means equality of x and y. (I took an epsilon of 0.5 for illustration purposes.)

[figure: relative-mode comparison; the green equality region widens as |x| + |y| grows]

The relative mode is what is used for "normal" or "large enough" floating point values. (More on that later.)

The second one is an absolute mode, where we simply compare their difference to a fixed number; that is, x and y compare equal when |x - y| < abs_th. It gives the following profile (again with an epsilon of 0.5 and an abs_th of 1 for illustration).

[figure: absolute-mode comparison; the green equality band has constant width]

This absolute mode of comparison is what is used for "tiny" floating point values.

Now the question is: how do we stitch those two response patterns together?

In Michael Borgwardt's answer, the switch is based on the value of diff, which should be below abs_th (Float.MIN_NORMAL in his answer). This switch zone is shown as hatched in the graph below.

[figure: Michael Borgwardt's comparison; the switch zone, based on diff, is hatched]

Because abs_th * epsilon is smaller than abs_th, the green patches do not stick together, which in turn gives the solution a bad property: we can find triplets of numbers such that x < y1 < y2 and yet x == y2 but x != y1.

[figure: the disconnected green patches, with a triplet x < y1 < y2 where x == y2 but x != y1]

Take this striking example:

x  = 4.9303807e-32
y1 = 4.930381e-32
y2 = 4.9309825e-32

We have x < y1 < y2, and in fact y2 - x is more than 2000 times larger than y1 - x. And yet with the current solution,

nearlyEqual(x, y1, 1e-4) == False
nearlyEqual(x, y2, 1e-4) == True
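
(These two results can be reproduced with the short self-contained check below; the C++ transcription of the Java nearlyEqual from Michael Borgwardt's answer, which is what "the current solution" refers to, is mine.)

#include <cfloat>
#include <cmath>
#include <cstdio>

// C++ transcription of the Java nearlyEqual from the accepted answer:
bool nearlyEqual(float a, float b, float epsilon) {
    float diff = std::fabs(a - b);
    if (a == b) return true;                 // shortcut
    if (a == 0 || b == 0 || diff < FLT_MIN)  // near-zero branch
        return diff < epsilon * FLT_MIN;
    return diff / (std::fabs(a) + std::fabs(b)) < epsilon;  // relative branch
}

int main() {
    float x  = 4.9303807e-32f;
    float y1 = 4.930381e-32f;
    float y2 = 4.9309825e-32f;
    std::printf("%d\n", nearlyEqual(x, y1, 1e-4f));  // prints 0: not equal
    std::printf("%d\n", nearlyEqual(x, y2, 1e-4f));  // prints 1: equal
}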

By contrast, in the solution proposed above, the switch zone is based on the value of |x| + |y|, which is represented by the hatched square below. It ensures that both zones connect gracefully.

[figure: the proposed comparison; the switch zone, based on |x| + |y|, is hatched]

Also, the code above contains no branching, which could make it more efficient. Consider that operations such as max and abs, which a priori need branching, often have dedicated assembly instructions. For this reason, I think this approach is superior to another solution that would fix Michael's nearlyEqual by changing the switch from diff < abs_th to diff < eps * abs_th, which would then produce essentially the same response pattern.

Where to switch between relative and absolute comparison?

The switch between those modes is made around abs_th, which is taken as FLT_MIN in the accepted answer. This choice means that the representation of float32 is what limits the precision of our floating point numbers.

This does not always make sense. For example, if the numbers you compare are the results of a subtraction, perhaps something in the range of FLT_EPSILON makes more sense. If they are square roots of subtracted numbers, the numerical imprecision could be even higher.

It is rather obvious when you consider comparing a floating point number with 0. Here, any relative comparison will fail, because |x - 0| / (|x| + 0) = 1. So the comparison needs to switch to absolute mode when x is on the order of the imprecision of your computation -- and rarely is it as low as FLT_MIN.

This is the reason for the introduction of the abs_th parameter above.

Also, by not multiplying abs_th by epsilon, the interpretation of this parameter is simple: it corresponds directly to the level of numerical precision that we expect on those numbers.
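
To make this concrete, here is a small usage sketch (mine, not part of the answer; the tolerance values are illustrative assumptions, not prescriptions):

#include <cfloat>
#include <cstdio>

// nearly_equal as defined above is assumed to be in scope.

int main() {
    // Cancellation between two order-1 values that agree to ~1 part in 1e7:
    // mathematically the result could be 0; numerically it is about 1.2e-7.
    float r = 1.0000001f - 1.0f;

    // With the default abs_th = FLT_MIN the relative mode applies,
    // and r is not "nearly 0":
    std::printf("%d\n", nearly_equal(r, 0.f, 128 * FLT_EPSILON, FLT_MIN));  // 0

    // With abs_th raised to the expected imprecision of the computation
    // (16 * FLT_EPSILON is an assumption for this example), it is:
    std::printf("%d\n", nearly_equal(r, 0.f, 128 * FLT_EPSILON, 16 * FLT_EPSILON));  // 1
}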

Mathematical rambling

(kept here mostly for my own pleasure)

More generally I assume that a well-behaved floating point comparison operator =~ should have some basic properties.

The following are rather obvious:

  • self-equality: a =~ a
  • symmetry: a =~ b implies b =~ a
  • invariance by opposition: a =~ b implies -a =~ -b

(We don't have "a =~ b and b =~ c implies a =~ c"; =~ is not an equivalence relation.)

I would add the following properties that are more specific to floating point comparisons:

  • if a < b < c, then a =~ c implies a =~ b (closer values should also be equal)
  • if a, b, m >= 0 then a =~ b implies a + m =~ b + m (larger values with the same difference should also be equal)
  • if 0 <= λ < 1 then a =~ b implies λa =~ λb (perhaps less obvious to argue for).

Those properties already put strong constraints on possible near-equality functions. The function proposed above satisfies them. Perhaps one or several otherwise obvious properties are missing.

When one thinks of =~ as a family of equality relations =~[Ɛ,t] parameterized by Ɛ and abs_th, one could also add:

  • if Ɛ1 < Ɛ2 then a =~[Ɛ1,t] b implies a =~[Ɛ2,t] b (equality for a given tolerance implies equality at a higher tolerance)
  • if t1 < t2 then a =~[Ɛ,t1] b implies a =~[Ɛ,t2] b (equality for a given imprecision implies equality at a higher imprecision)

The proposed solution also satisfies these.

P-Gn
  • c++ implementation question: can `std::abs(a) + std::abs(b)` ever be greater than `std::numeric_limits<float>::max()`? – anneb May 08 '20 at 04:03
  • @anneb Yes, it can be +INF. – Paul Groke Dec 09 '20 at 21:07
  • The parameter names in your code appear to be reversed. The 'relth' parameter is being used as an absolute threshold, whilst the 'epsilon' parameter is being used as a relative threshold. – andypea Feb 01 '21 at 22:30
  • @andypea Thanks. Actually it's "just" terrible naming -- I switched to a much more meaningful `abs_th`. – P-Gn Feb 22 '21 at 09:51
  • While the code can be translated, you can't compare it with the chosen solution, since they're two different languages; hence you can't say "choose this over that". – L4ZZA Dec 01 '21 at 11:39
  • Nice graphics but FLT_MIN is far too small to be of practical use as an absolute tolerance. Your reltol is typically about 1.5e-5 and abstol 1.2e-38. That means that your abstol will only kick in for values around 4e-34 – Paul Floyd Jan 07 '23 at 15:47
82

Comparing for greater/smaller is not really a problem unless you're working right at the edge of the float/double precision limit.

For a "fuzzy equals" comparison, this (Java code, should be easy to adapt) is what I came up with for The Floating-Point Guide after a lot of work and taking into account lots of criticism:

public static boolean nearlyEqual(float a, float b, float epsilon) {
    final float absA = Math.abs(a);
    final float absB = Math.abs(b);
    final float diff = Math.abs(a - b);

    if (a == b) { // shortcut, handles infinities
        return true;
    } else if (a == 0 || b == 0 || diff < Float.MIN_NORMAL) {
        // a or b is zero or both are extremely close to it
        // relative error is less meaningful here
        return diff < (epsilon * Float.MIN_NORMAL);
    } else { // use relative error
        return diff / (absA + absB) < epsilon;
    }
}

It comes with a test suite. You should immediately dismiss any solution that doesn't, because it is virtually guaranteed to fail in some edge cases, like having one value 0, two very small values on opposite sides of zero, or infinities.

An alternative (see link above for more details) is to convert the floats' bit patterns to integer and accept everything within a fixed integer distance.
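
For illustration, here is a minimal C++ sketch of that bit-pattern idea (my sketch, not the guide's actual code): reinterpret the bits as an integer, remap negative values so the integer order matches the float order, and compare the integer distance.

#include <cmath>
#include <cstdint>
#include <cstring>

// Map a float's bits onto an integer scale where adjacent representable
// floats differ by exactly 1 and the order matches the float order.
static int64_t orderedBits(float f) {
    int32_t i;
    std::memcpy(&i, &f, sizeof i);        // bit copy, no aliasing issues
    return (i >= 0) ? i : INT32_MIN - i;  // remap negatives; +0 and -0 coincide
}

bool nearlyEqualUlps(float a, float b, int64_t maxUlps) {
    if (std::isnan(a) || std::isnan(b)) return false;
    int64_t d = orderedBits(a) - orderedBits(b);
    return (d < 0 ? -d : d) <= maxUlps;
}

Note that this treats +0 and -0 as equal, and counts FLT_MAX and +infinity as one step apart.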

In any case, there probably isn't any solution that is perfect for all applications. Ideally, you'd develop/adapt your own with a test suite covering your actual use cases.

Michael Borgwardt
  • This "two very small values opposite of zero" only applies to denormalized numbers, doesn't it? – toochin Feb 06 '11 at 20:46
  • @toochin: depends on how large a margin of error you want to allow for, but it becomes most obviously a problem when you consider the denormalized number closest to zero, positive and negative - apart from zero, these are closer together than any other two values, yet many naive implementations based on relative error will consider them to be too far apart. – Michael Borgwardt Feb 06 '11 at 20:53
  • Another question, why not `return diff < epsilon*the_smallest_normalized_fp_number` in case `a*b == 0`? – toochin Feb 06 '11 at 21:10
  • @toochin: well, that might be a better choice than mine, but it's still somewhat arbitrary. If it really matters, you should decide according to the actual application's requirements. – Michael Borgwardt Feb 06 '11 at 22:24
  • Hmm. You have a test `else if (a * b == 0)`, but then your comment on the same line is `a or b or both are zero`. But aren't these two different things? E.g., if `a == 1e-162` and `b == 2e-162` then the condition `a * b == 0` will be true. – Mark Dickinson Feb 07 '11 at 07:53
  • @Mark: Very good point, will have to think about whether that makes a difference for the logic. – Michael Borgwardt Feb 07 '11 at 09:09
  • Why don't you use java's Math.ulp function, anyway? – toochin Feb 07 '11 at 17:11
  • @toochin: mainly because the code is supposed to be easily portable to other languages which may not have that functionality (it was added to Java only in 1.5 as well). – Michael Borgwardt Feb 08 '11 at 08:56
  • If that function is used very much (every frame of a video game for example) I'd rewrite it in assembly with epic optimizations. – Nov 09 '11 at 09:47
  • Great guide and great answer, especially considering the `abs(a-b) – Franz D. Mar 05 '15 at 00:45
  • @MichaelBorgwardt I'm trying to port this function to Swift. Which value should I use instead of `Float.MIN_NORMAL`? The [closest constants I found](https://llvm.org/svn/llvm-project/cfe/trunk/lib/Headers/float.h) were: `FLT_MIN` and `FLT_TRUE_MIN`. – Senseful Dec 04 '15 at 06:51
  • @Senseful it says `define FLT_TRUE_MIN __FLT_DENORM_MIN__`, so that's apparently denormalized, so `FLT_MIN` must be the equivalent of `Float.MIN_NORMAL` - but to be certain, just look at the value :) – Michael Borgwardt Dec 04 '15 at 09:01
  • @MichaelBorgwardt: Thanks, I believe you are correct given these results: `FLT_MIN = 1.175494e-38` and `FLT_TRUE_MIN = 1.401298e-45`. Furthermore, `FLT_MIN.isNormal == true` and `FLT_TRUE_MIN.isNormal == false`. – Senseful Dec 06 '15 at 01:52
  • This is not a good answer. In practice, you almost always need to use absolute difference between the numbers, such that `nearlyEqual(a, b, eps)` gives the same answer as `nearlyEqual(a-b, 0, eps)` for all possible `a` and `b`. You need to know the tolerance in the numbers that you are working with, and then use a simple absolute approach such as https://stackoverflow.com/a/19050413/908621. – fishinear Jan 22 '19 at 18:41
  • Out of curiosity, what is the value of `epsilon`, and how I will define the value of `epsilon`? Is this from direct system value? I am using python. So for, `epsilon` I can use `sys.float_info.epsilon` but how can I get the value for `Float.MIN_NORMAL`? – 0Knowledge May 03 '21 at 02:03
  • @MichaelBorgwardt, I would like to know, if I have `(0.5, 30.77) or (-0.5, -30.77)` numbers, should I use the method you proposed? Because they are not very equal. – 0Knowledge May 03 '21 at 02:23
16

I had the problem of comparing floating point numbers A < B and A > B. Here is what seems to work:

if ((A - B < Epsilon) && (fabs(A - B) > Epsilon))
{
    printf("A is less than B");   // i.e. A - B < -Epsilon
}

if ((A - B > Epsilon) && (fabs(A - B) > Epsilon))
{
    printf("A is greater than B");
}

The fabs (absolute value) takes care of the case where they are essentially equal: when |A - B| <= Epsilon, neither branch fires.

tech_loafer
11

We have to choose a tolerance level to compare float numbers. For example,

const float TOLERANCE = 0.00001f;
if (Math.Abs(f1 - f2) < TOLERANCE)
    Console.WriteLine("Oh yes!");

One note. Your example is rather funny.

double a = 1.0 / 3.0;
double b = a + a + a;
if (a != b)
    Console.WriteLine("Oh no!");

Some maths here

a = 1/3
b = 1/3 + 1/3 + 1/3 = 1.

1/3 != 1

Oh, yes..

Do you mean

if (b != 1)
    Console.WriteLine("Oh no!")
nni6
4

An idea I had for floating point comparison in Swift:

// Note: FLT_EPSILON (the single-precision machine epsilon) is used below as
// an absolute tolerance, so these operators only behave sensibly for values
// with magnitude near 1.
infix operator ~= {}   // Swift 2 syntax; Swift 3 and later drop the braces

func ~= (a: Float, b: Float) -> Bool {
    return fabsf(a - b) < Float(FLT_EPSILON)
}

func ~= (a: CGFloat, b: CGFloat) -> Bool {
    return fabs(a - b) < CGFloat(FLT_EPSILON)
}

func ~= (a: Double, b: Double) -> Bool {
    return fabs(a - b) < Double(FLT_EPSILON)
}
Andy Poes
1

Adaptation to PHP from Michael Borgwardt & bosonix's answer:

class Comparison
{
    const MIN_NORMAL = 1.17549435E-38;  //from Java Specs

    // from http://floating-point-gui.de/errors/comparison/
    public function nearlyEqual($a, $b, $epsilon = 0.000001)
    {
        $absA = abs($a);
        $absB = abs($b);
        $diff = abs($a - $b);

        if ($a == $b) {
            return true;
        } else {
            if ($a == 0 || $b == 0 || $diff < self::MIN_NORMAL) {
                return $diff < ($epsilon * self::MIN_NORMAL);
            } else {
                return $diff / ($absA + $absB) < $epsilon;
            }
        }
    }
}
Dennis
1

You should ask yourself why you are comparing the numbers. If you know the purpose of the comparison, then you also know the required accuracy of your numbers. That is different in each situation and each application context. But in pretty much all practical cases there is a required absolute accuracy; it is only very seldom that a relative accuracy is applicable.

To give an example: if your goal is to draw a graph on the screen, then you likely want floating point values to compare equal if they map to the same pixel on the screen. If the size of your screen is 1000 pixels and your numbers are in the 1e6 range, each pixel covers about 1000 units, so you will likely want 100 to compare equal to 200.

Given the required absolute accuracy, then the algorithm becomes:

enum ComparisonResult { UNORDERED, EQUAL, SMALLER, LARGER }

public static ComparisonResult compare(float a, float b, float accuracy)
{
    if (Float.isNaN(a) || Float.isNaN(b))   // if NaN needs to be supported
        return ComparisonResult.UNORDERED;
    if (a == b)                             // short-cut, also handles infinities
        return ComparisonResult.EQUAL;
    if (Math.abs(a - b) < accuracy)         // comparison wrt. the accuracy
        return ComparisonResult.EQUAL;
    if (a < b)                              // larger / smaller
        return ComparisonResult.SMALLER;
    else
        return ComparisonResult.LARGER;
}
fishinear
0

The standard advice is to use some small "epsilon" value (chosen depending on your application, probably), and consider floats that are within epsilon of each other to be equal. e.g. something like

#define EPSILON 0.00000001

if ((a - b) < EPSILON && (b - a) < EPSILON) {
  printf("a and b are about equal\n");
}

A more complete answer is complicated, because floating point error is extremely subtle and confusing to reason about. If you really care about equality in any precise sense, you're probably seeking a solution that doesn't involve floating point.

nelhage
  • What if he is working with really small floating point numbers, like 2.3E-15 ? – toochin Feb 06 '11 at 19:30
  • I'm working with a range of roughly [10E-14, 10E6], not quite machine epsilon but very close to it. – Mike Bailey Feb 06 '11 at 19:40
  • Working with small numbers is not a problem if you keep in mind that you have to work with *relative* errors. If you don't care about relatively large error tolerances, the above would be OK if you'd replace the condition with something like `if ((a - b) < EPSILON/a && (b - a) < EPSILON/a)` – toochin Feb 06 '11 at 19:46
  • The code given above is also problematic when you deal with very large numbers `c`, because once your number is large enough, the EPSILON will be smaller than the machine precision of `c`. E.g. suppose `c = 1E+22; d=c/3; e=d+d+d;`. Then `e-c` may well be considerably greater than 1. – toochin Feb 06 '11 at 19:54
  • For examples, try `double a = pow(8,20); double b = a/7; double c = b+b+b+b+b+b+b; std::cout< – toochin Feb 06 '11 at 20:15
  • Some more information: Even though the range of numbers is fairly large, my code is almost always dealing with numbers that are close to within an order of magnitude. I won't have situations where I have something like a = 1.0, b = 1000000. It'll be more like a = 1.001, b = 2.002. – Mike Bailey Feb 06 '11 at 20:23
0

I tried writing an equality function with the above comments in mind. Here's what I came up with:

Edit: Changed from Math.Max(a, b) to Math.Max(Math.Abs(a), Math.Abs(b))

static bool fpEqual(double a, double b)
{
    double diff = Math.Abs(a - b);
    // Note: Double.Epsilon in C# is the smallest positive denormal
    // (~4.9e-324), not the machine epsilon, so this threshold is so tiny
    // that the test behaves almost like == (see the comments below).
    double epsilon = Math.Max(Math.Abs(a), Math.Abs(b)) * Double.Epsilon;
    return (diff < epsilon);
}

Thoughts? I still need to work out a greater-than and a less-than as well.

Mike Bailey
  • `epsilon` should be `Math.abs(Math.Max(a, b)) * Double.Epsilon;`, or it will always be smaller than `diff` for negative `a` and `b`. And I think your `epsilon` is too small, the function might not return anything different from the `==` operator. Greater than is `a < b && !fpEqual(a,b)`. – toochin Feb 06 '11 at 20:30
  • Fails when both values are exactly zero, fails for Double.Epsilon and -Double.Epsilon, fails for infinities. – Michael Borgwardt Feb 06 '11 at 20:38
  • The case of infinities isn't a concern in my particular application, but is duly noted. – Mike Bailey Feb 06 '11 at 21:18
0

I came up with a simple approach to adjusting the size of epsilon to the size of the numbers being compared. So, instead of using:

iif(abs(a - b) < 1e-6, "equal", "not")

if a and b can be large, I changed that to:

iif(abs(a - b) < (10 ^ -abs(7 - log(a))), "equal", "not")

I suppose that doesn't satisfy all the theoretical issues discussed in the other answers, but it has the advantage of being one line of code, so it can be used in an Excel formula or an Access query without needing a VBA function.
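
To see what tolerance the formula actually yields, assume log here is the base-10 logarithm (as in an Excel LOG formula; a must be positive for the log to be defined):

a = 1e6:   10 ^ -abs(7 - 6)  =  10^-1
a = 1:     10 ^ -abs(7 - 0)  =  10^-7
a = 1e-3:  10 ^ -abs(7 + 3)  =  10^-10  (= 10^-7 * a)

So for a up to about 10^7 this is a relative tolerance of 10^-7 * a, i.e. roughly seven significant digits of agreement; beyond 10^7 the abs() flips the sign of the exponent and the tolerance shrinks again as a grows.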

I did a search to see if others have used this method and I didn't find anything. I tested it in my application and it seems to be working well. So it seems to be a method that is adequate for contexts that don't require the complexity of the other answers. But I wonder if it has a problem I haven't thought of since no one else seems to be using it.

If there's a reason the test with the log is not valid for simple comparisons of numbers of various sizes, please say why in a comment.

NewSites
-1

The best way to compare doubles for equality/inequality is by taking the absolute value of their difference and comparing it to a small enough (depending on your context) value.

double eps = 0.000000001; //for instance

double a = someCalc1();
double b = someCalc2();

double diff = Math.abs(a - b);
if (diff < eps) {
    //equal
}
pnt
-1

You need to take into account that the truncation error is a relative one. Two numbers are about equal if their difference is about as large as their ulp (Unit in the last place).

However, if you do floating point calculations, your error potential goes up with every operation (be especially careful with subtractions!), so your error tolerance needs to increase accordingly.
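
One cheap (if slow) way to express "within n ulps" in code is the following sketch (mine, using the standard nextafter function; fine for small n):

#include <cmath>

// True if b can be reached from a in at most n representable steps,
// i.e. the two numbers are within n ULPs of each other.
bool within_n_ulps(double a, double b, int n) {
    for (int i = 0; i < n && a != b; ++i)
        a = std::nextafter(a, b);   // step one representable value towards b
    return a == b;
}

The tolerance n would then grow with the number of operations performed, as described above.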

toochin