What operation can I apply to float to ensure the same outcome up to a certain precision?

Question

I have two floating points, x, y. They are "supposed to be" the same, but with floating-point, you never know, so x == y may return False.

So I am looking for an operation O() which ensures that if x and y really are "supposed to be the same" (where that supposedness is indicated by a tolerance), then O(x) == O(y). I am NOT looking to just compare if abs(x - y) <= tol, that is of course easily done, but it is not what I want, I want two representations O(x), O(y) that I need to later use.

Is there such an operation?

I first thought about rounding as an obvious solution, but it doesn't actually work. For example, x = 1, y = 0.999999999999 are "clearly supposed to be the same", but with rounding x = 1, y = 0.999999, and so that operation clearly does not satisfy O(x) == O(y).

I don't think it's possible. No matter how you define your `O()` function, there will be some value `x` such that `O(x) != O(x+e)` for some small epsilon `e`. That will prevent it from behaving as you desire. — Tom Karzes, May 10 '21 at 14:51
See my answer. You can't define such a function, because it violates the notion of transitivity which *defines* equality. — chepner, May 10 '21 at 15:21
@namesake: Proof : assume that O(x)=O(x+e) for all x. Then also O(x+e) = O(x+2e) for all x, and O(x) = O(x+2e). If we define `e_max` as the largest value of e for which the property holds, it follows that e_max=2*e_max, thus e_max=0. And indeed O(x)=O(x+0). — MSalters, May 10 '21 at 15:35
@namesake: You're the one asking for a function `O(x)` such that `O(x)==O(y)`. Of course that function depends on `x`, `O(x) = 7` won't do you any good. But for the general case, see my comment on chepner's answer. You're asking for non-transitive equivalence relations among a finite set of normal (excluding NaN) floating point numbers. That's just not going to work. You don't even need numbers for it. You can even prove that for equivalence classes of colors, without using a single epsilon. — MSalters, May 10 '21 at 16:05

chepner · Answer 1 · 2021-05-10T15:39:19.267

3

Consider a tolerance t = 0.01, and x = 0.99, y = 1.00, and z = 1.01. Clearly, x and y are "equal", as are y and z, but x and z are not.

Now, how should O be defined? Clearly, O(x) != O(z), so that x and z aren't considered equal. But that means that O(y) would need to return either O(x) or O(z), depending on the "context", and a function simply cannot return two different values for the same argument.

The idea of two values being equal within some tolerance is necessarily dependent on both values, not just one of them.

edited May 10 '21 at 15:39

answered May 10 '21 at 15:19

chepner

497,756
71
530
681

This is of course a bit simplified, but the idea is correct. `O()` defines equivalence classes, each of which will be a few floating-point values out of the 4 billion possible values. So you have many millions of these equivalence classes, and they have millions of borders. A pair of floats straddling such a border will by definition be in different equivalence classes. – MSalters May 10 '21 at 15:31
@namesake: Say `{x',y'} = {x+y/2, x+y/2}` if `|x-y| < epsilon` ? You can do that. But note that this may cause problems with O(a,b), O(b,c) and O(c,d). The problem now is that `b' != b'`, hardly an improvement. – MSalters May 10 '21 at 16:18

What operation can I apply to float to ensure the same outcome up to a certain precision?

1 Answers1