
To compare two floats (float64) for equality in Go, my superficial understanding of IEEE 754 and the binary representation of floats makes me think that this is a good solution:

func Equal(a, b float64) bool {
    ba := math.Float64bits(a)
    bb := math.Float64bits(b)
    // math.Float64bits returns uint64, so compute the absolute
    // difference without a signed comparison
    diff := ba - bb
    if bb > ba {
        diff = bb - ba
    }
    // accept one bit difference
    return diff < 2
}

The question is: Is this a more generic, more precise, and more efficient way to compare two arbitrarily large or small floats for "almost equalness" than the old abs(diff) < epsilon hack? My reasoning is that if one allows only one bit of difference in the binary representation, then the compared numbers certainly could not be any more equal, apart from strict equality, which (as pointed out in the comments) can be checked with == for floats.
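
For concreteness, this is what a one-bit difference looks like (a small illustrative snippet; math.Nextafter returns the next representable value in the direction of its second argument):

package main

import (
    "fmt"
    "math"
)

func main() {
    a := 1.0
    b := math.Nextafter(a, 2.0) // next representable float64 above 1.0
    fmt.Printf("%064b\n%064b\n", math.Float64bits(a), math.Float64bits(b))
    fmt.Println(math.Float64bits(b) - math.Float64bits(a)) // prints 1
}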

Note: I have edited the question to make it clearer.

augustzf
  • Can't you simply do `a == b`? Note that fixed-size floating point values are prone to not being able to accurately represent the value you *want* them to represent, such as the value `0.1` or `0.3`. Your best option is probably, as it is in most programming languages, to subtract one from the other, do an `ABS` operation and compare to a threshold value that you consider "close enough to be equal for my purpose". – Lasse V. Karlsen Dec 25 '17 at 14:16
  • But again, what is wrong with simply `a == b` ? Please explain why that is not an option for you. – Lasse V. Karlsen Dec 25 '17 at 14:20
  • The problem with a threshold value is that with really small or really large numbers the threshold value will be either smaller or larger than the numbers you're actually comparing. – augustzf Dec 25 '17 at 14:21
  • That is correct, but it is atypical in a given domain to have very very low values combined with very very large values in a fixed-size floating point type without being aware of it. **But again, since you've tried to implement bit-wise equality of floating point values, what is wrong with `a == b`?** – Lasse V. Karlsen Dec 25 '17 at 14:24
  • Because a==b gives the wrong answer: `a := 0.1; b := 0.2; fmt.Println(a+b == 0.3) // false` – augustzf Dec 25 '17 at 14:52
  • Please google what every programmer should know about floating point. 0.1+0.2 is a classical example. – Volker Dec 25 '17 at 15:23
  • Yes, and that is a common problem with floating point values **in every language that uses fixed-size floating point**. – Lasse V. Karlsen Dec 25 '17 at 17:25

3 Answers


Don't use the bit representation of float64, as it doesn't make sense in a lot of cases. Just subtract the two numbers to find out how much they differ:

package main

import (
    "fmt"
    "math"
)

const float64EqualityThreshold = 1e-9

func almostEqual(a, b float64) bool {
    return math.Abs(a - b) <= float64EqualityThreshold
}

func main() {
    a := 0.1
    b := 0.2
    fmt.Println(almostEqual(a + b, 0.3))
}
Arman Ordookhani
  • @augustzf http://floating-point-gui.de/errors/comparison/#look-out-for-edge-cases – kostix Dec 25 '17 at 15:18
  • @augustzf You could use something like `math.Abs(a - b) <= float64EqualityThreshold * (math.Abs(a) + math.Abs(b))` (see the sketch after this comment thread). @kostix's link seems even better. – Arman Ordookhani Dec 25 '17 at 15:30
  • @augustzf: There is no good generic comparison, and [no good generic comparison is possible](https://stackoverflow.com/questions/17040813/f-how-to-compare-floats/17047601#17047601). – Eric Postpischil Dec 25 '17 at 16:17
  • @EricPostpischil I understand what you're saying, but "no good generic" seems a bit much to me. Simple almostEqual functions can be used in most situations. If somebody can't tolerate aggregated tiny errors, he/she should not use floating-point calculation at all. – Arman Ordookhani Dec 25 '17 at 20:30
  • @ArmanOrdookhani: They can be used, but they are not good. Most Stack Overflow questions about comparing for equality in floating-point arise when somebody is experimenting with floating-point as a learning exercise. They do not have a real use. The only others I recall showing an actual need are unit-testing. (I answered one [here](https://stackoverflow.com/a/21120104/298225).) In that case, we can bound the errors due to floating-point rounding and trust that it is lower than errors due to bugs, so there is a clear value we can use for the tolerance to make a useful comparison. – Eric Postpischil Dec 25 '17 at 20:41
  • @ArmanOrdookhani: Other than that, answers that show “methods” to do floating-point comparison generally use bad examples. They use arbitrary tolerances that are not generally applicable, sometimes using `FLT_EPSILON` or `DBL_EPSILON` improperly (e.g., not scaled for the magnitudes of the numbers involved, which is only one of the errors made with them), arbitrarily switching between absolute or relative error because the author worked out something that kind of worked, not due to any theory of operation, and so on. They are kludges, not good generic solutions. – Eric Postpischil Dec 25 '17 at 20:44

No, this is not the correct way to compare floating-point values.

You have not actually stated your real problem—there is some reason you are trying to compare two floating-point numbers, but you have not said what it is.

Floating-point arithmetic is designed to perform approximate arithmetic. It is normal that there will be an accumulation of rounding errors in floating-point operations. These errors will generally be different when values are calculated in different ways, so floating-point arithmetic should not be expected to produce equal results.

In your example, these operations occurred:

  • The decimal numeral “0.1” was converted to float64 (IEEE-754 64-bit binary floating-point). This produced the value 0.1000000000000000055511151231257827021181583404541015625, which is the closest float64 value to 0.1.

  • The decimal numeral “0.2” was converted to float64. This produced 0.200000000000000011102230246251565404236316680908203125, which is the closest float64 value to 0.2.

  • These were added. This produced 0.3000000000000000444089209850062616169452667236328125. In addition to the rounding errors that occurred when 0.1 and 0.2 were rounded to the nearest values in float64, this contains some additional rounding error because the exact sum cannot be represented in float64.

  • The decimal numeral “0.3” was converted to float64. This produced 0.299999999999999988897769753748434595763683319091796875, which is the closest float64 value to 0.3.

As you can see, the result of adding 0.1 and 0.2 has accumulated different rounding errors from 0.3, so they are unequal. No correct test for equality will report they are equal. And, importantly, the errors that occurred in this example are specific to this example—different sequences of floating-point operations will have different errors, and the accumulated errors are not limited to the low bits of the numbers.
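
For illustration, these exact values can be reproduced in Go, whose formatting routines print a correctly rounded decimal expansion at whatever precision you request:

package main

import "fmt"

func main() {
    // 55 fractional digits is enough to show the exact decimal
    // expansions of the float64 values discussed above
    fmt.Printf("%.55f\n", 0.1)
    fmt.Printf("%.55f\n", 0.2)
    fmt.Printf("%.55f\n", 0.1+0.2)
    fmt.Printf("%.55f\n", 0.3)
    fmt.Println(0.1+0.2 == 0.3) // false
}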

Some people try to compare by testing whether the difference is less than some small value. This can be okay in some applications, but is it okay in your application? We do not know what you are trying to do, so we do not know what problems will occur. Tests that allow for a small error sometimes report incorrect results, either false positives (because they accept as equal numbers that would not be equal if computed with exact mathematics) or false negatives (because they reject equality for numbers that would be equal if computed with exact mathematics). Which of these errors is worse for your application? Will one of them cause a machine to break or a person to be harmed? Without knowing that, nobody can advise which incorrect result is acceptable, or even if either is.

Additionally, how large should the tolerance for error be? The total error that can occur in a calculation depends on the sequence of operations performed and the numbers involved. Some applications will have only a small final rounding error, and some applications can have massive errors. Nobody can give a recommendation about what value to use for the tolerance without knowing more about your specific sequence of operations. Also, the solution might not be to accept a tolerance in comparing numbers but to redesign your calculations to avoid error, or at least to reduce it.

No general solution for comparing floating-point values for “equality” exists because it is impossible for any such solution to exist.

Eric Postpischil
  • I have edited the question now. I agree that for concrete applications, one probably knows what tolerance is acceptable and should define equality with that in mind. But my question is less application-specific and more about what can be achieved by inspecting the underlying bit representation of the numbers. – augustzf Dec 25 '17 at 16:37
  • @augustzf: Essentially nothing useful in this situation can be achieved by inspecting the bits. Generally, any useful tests can be performed with normal floating-point operations. If you are just wondering how floating-point works and should be used, rather than trying to solve a specific application problem, then the answer is that floating-point is not designed for this. – Eric Postpischil Dec 25 '17 at 16:56
  • Yes, I've come to the same conclusion now after reading [this article](https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/) on the same topic. – augustzf Dec 25 '17 at 16:58

In C++, I use a function (search for `nearly_equal()`), which in Go would look like this:

func nearlyEqual(a, b, epsilon float64) bool {

    // shortcut: handles exact equality, including infinities
    if a == b {
        return true
    }

    diff := math.Abs(a - b)
    if a == 0.0 || b == 0.0 || diff < math.SmallestNonzeroFloat64 {
        // a or b is zero, or the difference is extremely small:
        // relative error is meaningless here, so use an absolute test
        return diff < epsilon*math.SmallestNonzeroFloat64
    }

    // otherwise compare the relative error against epsilon
    return diff/(math.Abs(a)+math.Abs(b)) < epsilon
}

This computes the difference between inputs a and b, then makes sure the difference is under the given epsilon. The division removes the magnitude, so the comparison is simple.
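
For example, with the function above:

fmt.Println(nearlyEqual(0.1+0.2, 0.3, 1e-9)) // true: the relative difference is tiny
fmt.Println(nearlyEqual(1e-10, 2e-10, 1e-9)) // false: the relative difference is 1/3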


The problem is that floating-point numbers store a fixed number of bits in their mantissa, and what the mantissa represents depends on the exponent. So if you just compare the bits, you first need to "shift them properly". The last line of the function above does that: the division is equivalent to a shift.
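
The mantissa/exponent split can be seen with math.Frexp, which decomposes a float64 into a fraction in [0.5, 1) and a power of two:

frac, exp := math.Frexp(6.0)
fmt.Println(frac, exp) // 0.75 3, since 6.0 == 0.75 × 2³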

Assuming we had very large integers, we could have fixed-point numbers of about 700 bits, with about 350 bits before the binary point and another 350 bits after it:

700             350               0
XXX ... XXXXXXX '.' XXXXXXX ... XXX

The float64 mantissa is 53 bits (52 stored explicitly, plus an implicit leading bit), so in the representation above most bits would be zeroes except for those 53 bits. If one number has its 53 bits far on the left and another has its 53 bits far on the right, then those two numbers are very different, and the comparison above detects that.

With these fixed point numbers we could do something more or less like this:

// assume "fp" numbers are fixed point numbers and the shift works on those
fp_a := a << (350 + a.exponent)
fp_b := b << (350 + b.exponent)

fp_diff := math.Abs(fp_a - fp_b)

return fp_diff < fp_epsilon

The problem is that it would not be practical to implement 700-bit numbers.

Alexis Wilke