Float comparisons failing without any obvious reason (32-bit X86 on Linux)

Question

I have stumbled upon an interesting case when comparing float values with == and !=. I ran into it while porting my own software from Windows to Linux. It's a bit of a bummer. The relevant code is the following:

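#include <cmath>

template<class T> class PCMVector2 {
public:
   T x, y;

public:
   // Constructors (assumed; omitted from the original excerpt)
   PCMVector2() : x(), y() {}
   PCMVector2( T x_, T y_ ) : x(x_), y(y_) {}

   // Exact comparison of both components
   bool operator == ( const PCMVector2<T>& a ) const {
       return x == a.x && y == a.y;
   }
   bool operator != ( const PCMVector2<T>& a ) const {
       return x != a.x || y != a.y;
   }
   // Mutable normalization: scales this vector in place
   PCMVector2<T>& Normalize() {
      const T l = 1.0f / Length();
      x *= l;
      y *= l;
      return *this;
   }
   // Immutable normalization: returns a scaled copy
   PCMVector2<T> Normalized() const {
      const T l = 1.0f / Length();
      return PCMVector2<T>(x*l,y*l);
   }
   // Vector length
   T Length() const { return sqrt(x*x+y*y); }
};

// Assumed alias, matching the name used in the test case
typedef PCMVector2<float> vec2f;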

Before porting to Linux I wrote unit tests that exercise all the functionality of these classes. Unlike the MSVC build, the g++ build compiles without complaint but gives incorrect results at runtime.

I was stumped, so I did some additional logging, type punning, memcmp calls, etc., and they all showed that the memory is 1:1 the same! Does anyone have any ideas about this?

My flags are: -Wall -O2 -j2

Thanks in advance.

EDIT2: The failed test case is:

vec2f v1 = vec2f(2.0f,3.0f);
v1.Normalize(); // mutable normalization
if( v1 != vec2f(2.0f,3.0f).Normalized() ) //immutable normalization
    // report failure

Note: Both normalizations are the same, and yield same results (according to memcmp).

RESOLUTION: It turns out that you should never trust the compiler with floating-point numbers, no matter how sure you are about the memory you compare. Once data goes into registers it can change, and you have no control over that. After some digging regarding registers, I found this neat source of information. Hope it's useful to someone in the future.

You may find [this](http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html) relevant. — vsoftco, May 08 '15 at 17:59
I **strongly** suggest you don't compare floating point values for equality. — Cory Kramer, May 08 '15 at 17:59
possible duplicate of [Most effective way for float and double comparison](http://stackoverflow.com/questions/17333/most-effective-way-for-float-and-double-comparison) — Cory Kramer, May 08 '15 at 18:00
Sometimes floating point registers are larger than the floating point type you've specified, and comparisons can depend on whether the value was rounded for storage or not. — Mark Ransom, May 08 '15 at 18:00
So, what are the values? The exact values, if you can find them. — harold, May 08 '15 at 18:00
What does `Normalize()` do? Does it, for instance, divide each value by `1.0`? — Max Lybbert, May 08 '15 at 18:06
The `vec2f` or `Normalized` functions are undefined. Please make sure your test case is complete and self contained. — that other guy, May 08 '15 at 18:06
@Cyber, interesting that the link you gave doesn't contain the best method that I've found: http://stackoverflow.com/questions/6837007/comparing-float-double-values-using-operator — Mark Ransom, May 08 '15 at 18:11
I guarantee that this part is defined and self-contained. I shared only the relevant details. However, my question has been updated, so there are no more discrepancies. — Dimo Markov, May 08 '15 at 18:11
A double floating point may be converted to an extended floating point, or not (which depends on unpredictable compiler optimization/conversion). — , May 08 '15 at 18:17
@DimoMarkov `Prior to anyone informing me that one can not simply compare floats,` Why do you think that reason can be thrown away? You know that you should not compare floats for equality, so you did just that and now you have results that surprise you. — PaulMcKenzie, May 08 '15 at 18:25
Yes, I agree with you. The registers just flew out of my mind. Sorry for the misdirection, and thanks for the help. — Dimo Markov, May 08 '15 at 18:30

score 4 · Accepted Answer · answered May 08 '15 at 18:17

Floating-point CPU registers can be wider than the floating-point type you're working with. This is especially true of float, which is typically only 32 bits; the x87 registers used on 32-bit x86, for example, are 80 bits wide. A calculation is carried out using all the bits of the register, and the result is rounded to the nearest representable value of the declared type only when it is stored in memory.

Depending on inlining and compiler optimization flags, the generated code may compare one value loaded from memory with another still held in a register. Those can compare as unequal even though, once both are stored, their representations in memory are bit-for-bit identical.
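To make the rounding step concrete, here is a minimal sketch, not taken from the question's code. The helper names `StoreAsFloat`, `NormalizedX` and `EqualAfterStore` are hypothetical. The idea is that writing through a `volatile float` forces the value out of any wider register and rounds it to 32 bits, so both operands are compared at the same precision. Whether a plain `==` on the unrounded values actually fails depends on the compiler, flags and target (assumption: x87 code generation); on targets that compute in single precision the two comparisons agree.

```cpp
#include <cmath>

// Hypothetical helper: the write to a volatile float cannot be optimized
// away, so the value is rounded to float precision before it is read back.
inline float StoreAsFloat(float v) {
    volatile float stored = v;
    return stored;
}

// Same arithmetic as the question's normalization, x component only.
inline float NormalizedX(float x, float y) {
    const float l = 1.0f / std::sqrt(x * x + y * y);
    return x * l;
}

// Compares two values only after both have been rounded to float.
inline bool EqualAfterStore(float a, float b) {
    return StoreAsFloat(a) == StoreAsFloat(b);
}
```

Calling `EqualAfterStore(NormalizedX(2.0f, 3.0f), NormalizedX(2.0f, 3.0f))` compares two results that have both been rounded to float. GCC also documents the `-ffloat-store` and `-mfpmath=sse` options, which affect where intermediate results are kept; whether they suit a given project is a separate question.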

This mismatch between register and memory precision is only one of the many reasons why comparing floating-point values for equality is not recommended, especially when, as in your case, it appears to work some of the time.
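A common alternative to exact equality is a tolerance check. The following is a rough sketch under stated assumptions, not a drop-in replacement for the operators in the question. The name `NearlyEqual` and both default tolerances are made up for illustration and would need tuning for real data.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical helper: true when a and b differ by no more than absTol, or
// by no more than relTol scaled to the larger magnitude. The absolute floor
// covers values near zero, where a purely relative test is too strict.
// Both default tolerances are illustrative assumptions.
inline bool NearlyEqual(float a, float b,
                        float relTol = 4.0f * std::numeric_limits<float>::epsilon(),
                        float absTol = 1e-6f) {
    const float diff = std::fabs(a - b);
    if (diff <= absTol) return true;
    return diff <= relTol * std::max(std::fabs(a), std::fabs(b));
}
```

With something like this, `operator ==` could compare `x` and `y` via `NearlyEqual` instead of `==`. Note that such a comparison is no longer transitive, so it behaves differently from true equality in containers and sorting. If either argument is NaN the function returns false.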

I see. It seems that zero optimisation has fixed the problem! However, if I would want to build WITH optimisations, what would you suggest will be best practice while keeping the single precision floats? If possible, of course... — Dimo Markov, May 08 '15 at 18:21
Nevermind, I guess it comes down to ULP or tolerance checking. Thanks for the help. I simply didn't expect the register stuff. — Dimo Markov, May 08 '15 at 18:27
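For reference, the ULP check mentioned in the comment above can be sketched roughly as follows. This is an illustration, not a vetted routine: it assumes `float` is a 32-bit IEEE 754 type, and the names `OrderedBits` and `WithinUlps` and the default of 4 ULPs are hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Maps a float's bit pattern to an integer that increases monotonically with
// the float's value, so adjacent floats map to adjacent integers.
// Assumes a 32-bit IEEE 754 float. +0.0f and -0.0f both map to 0.
inline std::int64_t OrderedBits(float v) {
    std::uint32_t bits;
    std::memcpy(&bits, &v, sizeof bits);  // well-defined way to read the bits
    const std::int64_t magnitude = bits & 0x7FFFFFFFu;
    return (bits & 0x80000000u) ? -magnitude : magnitude;
}

// Hypothetical helper: true when a and b are at most maxUlps representable
// floats apart. NaN never compares as close to anything.
inline bool WithinUlps(float a, float b, std::int64_t maxUlps = 4) {
    if (std::isnan(a) || std::isnan(b)) return false;
    const std::int64_t d = OrderedBits(a) - OrderedBits(b);
    return (d < 0 ? -d : d) <= maxUlps;
}
```

A ULP distance scales with the magnitude of the values, which makes it convenient for results like normalized vector components. Near zero it becomes very strict, so it is often combined with a small absolute tolerance.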
