30

I'm working on various memory-block manipulation functions, and during benchmarks I noticed that my implementation of IsEqualRange(double* begin1, double* end1, double* begin2, double* end2) is much faster than std::equal(...) on both MSVC and GCC. Further investigation showed that doubles and floats are not block-compared with memcmp, but compared one by one in a for loop.
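For readers following along, here is a minimal sketch of what a memcmp-based version of such a function might look like. This is not the asker's actual code: the body, the `const` qualifiers, and the length check are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical memcmp-based range comparison (illustrative only).
// Compares the raw bytes of [begin1, end1) and [begin2, end2).
bool IsEqualRange(const double* begin1, const double* end1,
                  const double* begin2, const double* end2) {
    assert(begin1 <= end1 && begin2 <= end2);
    const std::size_t n1 = static_cast<std::size_t>(end1 - begin1);
    const std::size_t n2 = static_cast<std::size_t>(end2 - begin2);
    if (n1 != n2) return false;  // ranges of different length are never equal
    if (n1 == 0) return true;    // skip memcmp for empty (possibly null) ranges
    // Bitwise comparison: this is where the result can diverge from ==.
    return std::memcmp(begin1, begin2, n1 * sizeof(double)) == 0;
}
```

A call such as `IsEqualRange(a, a + n, b, b + n)` compares bit patterns, not numeric values.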

In what situations does a binary comparison of floats lead to an incorrect result? When is it OK to binary-compare (for equality) an array of floats/doubles? Are there other fundamental types for which I shouldn't use memcmp?

Zoltan Tirinda
  • 769
  • 8
  • 25
  • 1
    Related: https://stackoverflow.com/questions/25808445/why-cant-the-floating-point-types-compare-by-using-memcmp-function – DeiDei Jan 03 '19 at 10:49
  • 1
    This [answer](https://stackoverflow.com/a/54008873/841108) to a related question is relevant to yours – Basile Starynkevitch Jan 03 '19 at 10:52
  • 3
    @BasileStarynkevitch That is related to the nature of the floating points where not every number can be represented. Doesn't say much about the equality in binary form. – Zoltan Tirinda Jan 03 '19 at 10:57
  • 2
    Similar: https://stackoverflow.com/q/8044862/560648 – Lightness Races in Orbit Jan 03 '19 at 11:25
  • @ZoltanTirinda But this does affect equality, because any non-trivial calculation will run into these non-representable numbers during processing, and that will affect the final result. Whenever someone uses `==`, it is good to mention that it will not always do what you want (because of previous operations). You may know this, but others may not. – Yankes Jan 03 '19 at 12:15
  • @Yankes: That is true, but unrelated to this problem. – Mooing Duck Jan 03 '19 at 18:48

3 Answers

50

The first thing I would do if I were you is to check your optimisation settings.

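It's fine to use memcmp on an array of floating-point values, but note that you could get different results from an element-by-element ==. In particular, for IEEE754 floating point:

  1. +0.0 is defined to compare equal to -0.0, although their bit patterns differ, so memcmp reports them as different.

  2. NaN is defined to compare not-equal to NaN, even when the bit patterns are identical, so memcmp can report two NaNs as equal.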
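The two cases above can be reproduced with a small sketch. The helper names `equal_by_value` and `equal_by_bytes` are made up for this illustration, and the code assumes IEEE 754 doubles (checked by the `static_assert`).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <limits>

// Assumption: the platform uses IEEE 754 (IEC 559) doubles.
static_assert(std::numeric_limits<double>::is_iec559,
              "this sketch assumes IEEE 754 doubles");

// Element-by-element comparison using ==.
bool equal_by_value(const double* a, const double* b, std::size_t n) {
    return std::equal(a, a + n, b);
}

// Bitwise comparison of the underlying bytes.
bool equal_by_bytes(const double* a, const double* b, std::size_t n) {
    assert(n > 0);
    return std::memcmp(a, b, n * sizeof(double)) == 0;
}
```

With `double pz[] = {0.0}` and `double nz[] = {-0.0}`, `equal_by_value(pz, nz, 1)` is true while `equal_by_bytes(pz, nz, 1)` is false. With an array holding a NaN compared against itself, the results flip: the value comparison is false and the byte comparison is true.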

Bathsheba
  • 231,907
  • 34
  • 361
  • 483
  • 3
    Also note that, technically, the C and C++ standards do not require the use of IEEE 754 to implement `float` and `double`. Most "reasonable" (modern) platforms will do so in practice, of course. But if you're uncertain that IEEE 754 is in use, then you have very few guarantees of how the `==` operator will behave on arbitrary bit patterns. A "weird" architecture could for example decide that `((float)0x00000000) == ((float)0x12345678)`, but `memcmp` will obviously consider those bytes different. – Kevin Jan 04 '19 at 01:25
  • @Kevin: Indeed you're correct and I've narrowed the answer. I wonder if you fancy answering this question in more detail yourself; it seems to have attracted a lot of attention? – Bathsheba Jan 04 '19 at 08:18
  • Thanks for the vote of confidence, but I don't really think I can add much to your answer as it now stands. – Kevin Jan 04 '19 at 14:18
14

The main issue is NaN values, as these are never equal to themselves. There are also two representations of 0 (+0 and -0) that are equal but not binary equal.

So strictly speaking, you cannot use memcmp for them, as the answer would be mathematically incorrect.

If you know that you don't have NaN or 0 values, then you can use memcmp.
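One way to make that precondition explicit is a validation pass, sketched below. The helper name `memcmp_is_safe` is hypothetical, and it conservatively rejects any zero, matching the condition stated above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: returns true if the range contains no NaN and no
// zero of either sign, i.e. bitwise and numeric equality should agree
// for this data on IEEE 754 platforms.
bool memcmp_is_safe(const double* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i) {
        // data[i] == 0.0 is true for both +0.0 and -0.0.
        if (std::isnan(data[i]) || data[i] == 0.0) {
            return false;
        }
    }
    return true;
}
```

Both ranges would need to pass the check, and the scan costs a full pass over the data, so it probably only pays off when the data is validated once (for example at load time) and compared many times.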

Matthieu Brucher
  • 21,634
  • 7
  • 38
  • 62
  • 5
    Zero has two representations in IEEE-754 binary floating-point, denoted as −0 and +0. They represent the same mathematical value and should be reported as equal, which `==` does but `memcmp` does not. – Eric Postpischil Jan 03 '19 at 13:10
  • 1
    Yes indeed, that's why I said the main issue, of course, there is 0 as well. – Matthieu Brucher Jan 03 '19 at 13:30
  • 1
    There's another way to get two numbers that compare numerically equal but not binary equal: if a denormalized float somehow leaks into your system. That's not supposed to happen (IEEE-754 specifies which representation should be used if a number can be represented in more than one way), but it's something to keep in mind if you're operating on, say, data files provided by a hostile user. – Mark Jan 04 '19 at 00:39
-6

Binary comparison works with too much precision for many real applications. For example, in JavaScript, 0.2 / 10.0 != 0.1 * 0.2. If you have two variables that both end up with very, very slight rounding errors, they will be unequal despite representing the "same" number.

David Rice
  • 208
  • 2
  • 4
  • 5
    Yes, but in that case both `==` and `memcmp(…)` (and, presumably, `std::equal(…)`) would agree, so that's not addressing the question.  (Worth being aware of, though.) – gidds Jan 03 '19 at 17:44
  • 1
    They are not the "same" number if their binary representation is different. If you want both numbers to be the same with two decimal precision then you round them. Then you define how you want to round them. You document that and have a reliable way of testing logic. What do you call the same number? Is this good enough for you 0.099998 == 0.99997? Is this good enough 0.099 == 0.098? Is this good enough 0.9 == 1.0? – FCin Jan 03 '19 at 18:34
  • @FCin So, are you saying that 0.2*0.1 != 0.2/10.0? They aren't the same number because of rounding errors, not because the numbers they should represent are different. If your code only uses hardcoded numbers, then it's fine, but if any of the numbers are random or user-generated, then using == is a bad decision for floating point numbers and likely to introduce bugs. – David Rice Jan 03 '19 at 18:42
  • 1
    As @gidds said, that's completely irrelevant to the question. The *only* time `==` differs from `memcmp(...)` occurs with `NaN` and +0 and -0. Also, there is a difference you're missing here: C floating point isn't mathematically perfect. If you put in `0.2 / 10.0` and `0.1 * 0.2` and expect them to evaluate to the same value, you need to read more about floating point. – user124 Jan 03 '19 at 20:16
  • @user124 You're right - in the abstract, in a white paper, comparing a floating point number with == is fine. But if you're trying to actually write a functioning, working program, there's good reason to avoid that. I'm not "missing" the point that in C fp numbers aren't mathematically perfect - I'm saying that because of that it's not a good candidate for comparing in that manner. – David Rice Jan 03 '19 at 20:21
  • @DavidRice I think the point being made here is that **IF** `==` or `std::equal` is acceptable, then `memcmp` is acceptable iff there are no zeros or NaNs in the array. – The Great Java Jan 03 '19 at 20:24
  • @TheGreatJava The questions asked were "In what situation does binary comparison of floats lead to incorrect result? When is it ok to binary compare (equality) array of floats/doubles?". My answer is "most of the time when you're building any kind of production application because floating point numbers shouldn't be compared that way". – David Rice Jan 03 '19 at 21:06
  • 1
    @DavidRice is correct. A more complete answer referencing epsilon can be found [here](https://stackoverflow.com/a/77735/398460). – jacknad Jan 03 '19 at 21:14
  • @DavidRice you're right, I got hung up on the text of the other answer. – The Great Java Jan 03 '19 at 22:05
  • @DavidRice If your program can't tolerate the inexactness of floating point numbers, it's usually better to use exact rationals or exact decimal precision. Since the OP chose floats, most likely, they can tolerate the imprecision. If you're not convinced of that, a clarifying comment asking about it would have been better than an answer. Note that Basile left a comment raising the issue. Answers coming from an interpreted language perspective (like JS or Ruby) rarely have all that much relevance to native C and C++ code. – jpmc26 Jan 04 '19 at 00:57
  • @jpmc26 The question is whether this is about math and programming in the abstract, or if this is about someone trying to build a toaster. I'm not an academic, I'm not writing white papers. I write code that runs and supports my users, and I approach code questions from that perspective. "When is it ok to binary compare floats/doubles?" in my experience - it's not. It's not about the language, it's about the simple, practical nature of floating point number comparisons. – David Rice Jan 04 '19 at 16:02