24

This is not a duplicate of the famous Is floating point math broken, even if it looks like one at first sight.

I'm reading a double from a text file using fscanf(file, "%lf", &value); and comparing it with the == operator against a double literal. If the string is the same as the literal, will the comparison using == be true in all cases?

Example

Text file content:

7.7

Code snippet:

double value;
fscanf(file, "%lf", &value);     // reading "7.7" from file into value

if (value == 7.7)
   printf("strictly equal\n");

The expected and actual output is

strictly equal

But this assumes that the compiler converts the double literal 7.7 into a double exactly the way the fscanf function does, and the compiler may or may not use the same library for converting strings to double.

Put another way: does the conversion from string to double result in a unique binary representation, or may there be slight implementation-dependent differences?

Live demonstration

YSC
Jabberwocky
  • Why not use the [std::strtod](http://en.cppreference.com/w/cpp/string/byte/strtof) in the first place, because C++? – Ron Oct 18 '17 at 12:50
  • @ron yes, std::strtod could be used, but the question remains the same. And it applies to C and C++. – Jabberwocky Oct 18 '17 at 12:51
  • Try reading some of these (they will start to explain some the floating point and library issues): https://randomascii.wordpress.com/category/floating-point/ especially: https://randomascii.wordpress.com/2013/07/16/floating-point-determinism/ – Richard Critten Oct 18 '17 at 12:52
  • For inf and NaN there are no literals, I suppose you want to exclude those? – Baum mit Augen Oct 18 '17 at 12:54
  • 9
    Throwing [Is floating point math broken?](https://stackoverflow.com/q/588004/4389800) or [What Every Computer Scientist Should Know About Floating-Point Arithmetic](http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html) in the face for any question on floating point math is like throwing the C standard for any C question. – P.P Oct 18 '17 at 12:54
  • @BaummitAugen yes `inf` and `NaN` are excluded. – Jabberwocky Oct 18 '17 at 12:54
  • @MichaelWalz> did you try changing the floating point environment to see whether the test still succeeds? Most notably the rounding direction? – spectras Oct 18 '17 at 12:56
  • 2
    I don't think there's a definitive answer. This is a QOI issue IMO. I think it most likely will not result in the same binary representation *in general*. – StoryTeller - Unslander Monica Oct 18 '17 at 12:56
  • @spectras I did a few tests on a few different platforms, the outcome was always the same. – Jabberwocky Oct 18 '17 at 12:57
  • If you are *ever* testing two floating point numbers for exact equality, then you are doing something wrong. You should always at least be checking that the difference is less than some epsilon value... – Sean Burton Oct 18 '17 at 13:10
  • @MichaelWalz In your platform testing, consider reporting the value of `FLT_EVAL_METHOD`. When 0 or 1, I expect code will report `value == 7.7`. When 2, I expect `value != 7.7`. – chux - Reinstate Monica Oct 18 '17 at 21:53

4 Answers

18

From the C++ standard:

[lex.fcon]

... If the scaled value is in the range of representable values for its type, the result is the scaled value if representable, else the larger or smaller representable value nearest the scaled value, chosen in an implementation-defined manner...

emphasis mine.

So you can only rely on equality if the value is strictly representable by a double.

Richard Hodges
  • 1
    @YSC I'm surprised cppreference mentions it. That site is improving every day. – Richard Hodges Oct 18 '17 at 13:03
  • 1
    You can actually join the effort: [cppreference is a wiki](http://en.cppreference.com/w/Cppreference:FAQ#What.3F_This_is_a_wiki.3F_Can_I_change_stuff.3F)! – YSC Oct 18 '17 at 13:06
  • 2
    @RichardHodges What is _[lex.fcon]_? – Jabberwocky Oct 18 '17 at 13:10
  • 2
    That certainly sounds like the safest assumption to make. My experience has been that it's easy to get burned by things like this. For one thing, I'm not sure how much I trust the conversion done by `fscanf` to match the conversion done by the compiler at compile time. Another is that values sometimes end up in registers, with higher precision than one might expect. But if the representation is exact in a double, then it seems like it should be safe. – Tom Karzes Oct 18 '17 at 13:11
  • 1
    @MichaelWalz it's the name of a section in the c++ standard. – Richard Hodges Oct 18 '17 at 13:17
  • @Tom Yeah relying on that seems incredibly, incredibly fragile. "This code works because X can be represented perfectly by a IEEE-754 double which our compiler happens to guarantee is used for doubles".. – Voo Oct 18 '17 at 18:55
17

About C++, from cppreference one can read:

[lex.fcon] (§6.4.4.2)

The result of evaluating a floating constant is either the nearest representable value or the larger or smaller representable value immediately adjacent to the nearest representable value, chosen in an implementation-defined manner (in other words, default rounding direction during translation is implementation-defined).

Since the rounding of a floating literal is implementation-defined, you cannot draw any conclusion about how it compares with a scanf result.


About C11 (standard ISO/IEC 9899:2011):

Floating constants (§6.4.4.2)

Recommended practice

7 The translation-time conversion of floating constants should match the execution-time conversion of character strings by library functions, such as strtod, given matching inputs suitable for both conversions, the same result format, and default execution-time rounding.

So clearly for C11, this is not guaranteed to match.
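The recommended practice quoted above can be probed on a given toolchain with a small check. This is only a sketch: `literal_matches_strtod` and `CHECK_LITERAL` are hypothetical names introduced here, and a passing result says something about one compiler and one C library under the default rounding mode, not about what the standard guarantees.

```c
#include <stdlib.h>

/* Returns 1 when strtod() on `text` produces the same double as `literal`.
   Both values are stored in volatile doubles so that any extra register
   precision is dropped before the comparison. */
int literal_matches_strtod(const char *text, double literal)
{
    volatile double parsed = strtod(text, NULL);  /* execution-time conversion */
    volatile double compiled = literal;           /* translation-time conversion */
    return parsed == compiled;
}

/* Stringize the token so the compiler and strtod() see the same characters. */
#define CHECK_LITERAL(x) literal_matches_strtod(#x, x)
```

For example, `CHECK_LITERAL(7.7)` evaluates to 1 if both conversions agree for 7.7 on the platform being tested.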

YSC
  • This sounds convincing. So I'll replace the `==` with the call to some `AmostEqual` function. – Jabberwocky Oct 18 '17 at 13:09
  • 2
    @MichaelWalz> you tagged both C and C++. I'm wondering whether they agree on that one. In [C99](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf), normative annex F, subsection 7.2 reads: “During translation the IEC 60559 default modes are in effect: — The rounding direction mode is rounding to nearest [...]”. So it would appear your sample code is guaranteed to work in C99, as long as you don't change the floating point environment of your code. – spectras Oct 18 '17 at 13:33
  • So, @YSC, your point with respect to C11 is that it is only a recommendation, not a requirement, that the translation-time and runtime conversions agree? – John Bollinger Oct 18 '17 at 16:55
  • @YSC> the recommendation is about the fact that they should match. However, annex F, which is normative gives the exact rules for translation-time, so by explicitly configuring the floating-point environment with those, the match should be guaranteed. – spectras Oct 18 '17 at 17:04
  • 1
    @Michael One simple reason this could fail on x86 is that reading a value from memory only gets you 64-bit precision, while a value can have 80-bit precision when in a register. Loading a constant might very well be done directly via registers, while the fscanf-loaded value might have been stored in memory somewhere. For that reason alone it's pretty much never safe to assume equality between floats. – Voo Oct 18 '17 at 18:50
  • @spectras a standard that covers translation-time rules but says nothing about the execution-time library's string functions does not guarantee a match ;) – YSC Oct 19 '17 at 07:35
2

If the string is the same as the literal, will the comparison using == be true in all cases?

A common consideration not yet explored: FLT_EVAL_METHOD

#include <float.h>
...
printf("%d\n", FLT_EVAL_METHOD);

2 evaluate all operations and constants to the range and precision of the long double type.

If this prints 2, then the math used in value == 7.7 is long double and 7.7 is treated as 7.7L. In OP's case, this may evaluate to false.

To account for this wider precision, assign the values to double variables, which removes all extra range and precision.

fscanf(file, "%lf", &value);
double seven_seven = 7.7;
if (value == seven_seven)
  printf("strictly equal\n");

IMO, this is a more likely occurring problem than variant rounding modes or variations in library/compiler conversions.


Note that this case is akin to the below, a well known issue.

float value;
fscanf(file, "%f", &value);
if (value == 7.7)
   printf("strictly equal\n");
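Both widening effects can be sketched without any file input. The function names below are hypothetical, and the sketch assumes that float is narrower than double, as it is with IEEE-754 binary32 and binary64. The volatile variables force each value to be stored at its declared precision.

```c
#include <float.h>

/* 1 when a float holding 7.7 still equals the double constant 7.7.
   The float keeps fewer significand bits, so this is expected to be 0. */
int float_equals_double_literal(void)
{
    volatile float narrow = 7.7f;   /* rounded to float precision */
    return (double)narrow == 7.7;
}

/* 1 when long double carries more significand bits than double. */
int long_double_is_wider(void)
{
    return LDBL_MANT_DIG > DBL_MANT_DIG;
}

/* 1 when a double holding 7.7 equals the long double constant 7.7L.
   This mimics what value == 7.7 does when FLT_EVAL_METHOD is 2. */
int double_equals_long_double_literal(void)
{
    volatile double stored = 7.7;   /* rounded to double precision */
    return (long double)stored == 7.7L;
}
```

On a platform where long double is wider than double, the last function is expected to return 0, because 7.7 has no finite binary expansion and the two roundings land on different values.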

Demonstration

#include <stdio.h>
#include <float.h>
int main() {
  printf("%d\n", FLT_EVAL_METHOD);
  double value;
  sscanf("7.7", "%lf", &value);
  double seven_seven = 7.7;
  if (value == seven_seven) {
    printf("value == seven_seven\n");
  } else {
    printf("value != seven_seven\n");
  }
  if (value == 7.7) {
    printf("value == 7.7\n");
  } else {
    printf("value != 7.7\n");
  }
  return 0;
}

Output

2
value == seven_seven
value != 7.7

Alternative Compare

To compare 2 double values that are "near" each other, we need a definition of "near". A useful approach is to consider all the finite double values sorted into an ascending sequence and then compare their sequence numbers with each other. double_distance(x, nextafter(x, 2*x)) --> 1

The following code makes various assumptions about double layout and size.

#include <assert.h>

unsigned long long double_order(double x) {
  union {
    double d;
    unsigned long long ull;
  } u;
  assert(sizeof(double) == sizeof(unsigned long long));
  u.d = x;
  if (u.ull & 0x8000000000000000) {
    u.ull ^= 0x8000000000000000;
    return 0x8000000000000000 - u.ull;
  }
  return u.ull + 0x8000000000000000;
}

unsigned long long double_distance(double x, double y) {
  unsigned long long ullx = double_order(x);
  unsigned long long ully = double_order(y);
  if (x > y) return ullx - ully;
  return ully - ullx;
}

....
printf("%llu\n", double_distance(value, 7.7));                       // 0
printf("%llu\n", double_distance(value, nextafter(value,value*2)));  // 1
printf("%llu\n", double_distance(value, nextafter(value,value/2)));  // 1

Or just use

// nextafter() and INFINITY are declared in <math.h>
if (nextafter(7.7, -INFINITY) <= value && value <= nextafter(7.7, +INFINITY)) {
  puts("Close enough");
}
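As a variation on the ordering idea above, the bits can be copied with memcpy instead of being read through a union, which is well defined in both C and C++. This is a sketch under the assumption that double is an IEEE-754 binary64 value with the same byte order as a 64-bit integer; `ordered_bits`, `ulp_distance` and `nearly_equal` are hypothetical names, and NaN inputs are not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Map a finite double to an integer that preserves numeric ordering. */
uint64_t ordered_bits(double x)
{
    const uint64_t sign = UINT64_C(0x8000000000000000);
    uint64_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);    /* copy the object representation */
    if (bits & sign)
        return sign - (bits ^ sign);   /* negative: mirror below the midpoint */
    return bits + sign;                /* non-negative: shift above the midpoint */
}

/* Number of representable doubles between x and y (0 when they are equal). */
uint64_t ulp_distance(double x, double y)
{
    uint64_t ox = ordered_bits(x);
    uint64_t oy = ordered_bits(y);
    return ox > oy ? ox - oy : oy - ox;
}

/* 1 when x and y are at most max_ulps representable steps apart. */
int nearly_equal(double x, double y, uint64_t max_ulps)
{
    return ulp_distance(x, y) <= max_ulps;
}
```

A call such as `nearly_equal(value, 7.7, 1)` then accepts the literal's value and its two immediate neighbours.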
chux - Reinstate Monica
  • I'd almost upvote, but the ``union`` type punning breaks C++ aliasing restrictions (which are a recurring topic also). A well-formed solution would e.g. use ``memcpy``. – Arne Vogel Oct 18 '17 at 16:24
  • @ArneVogel Fair enough about C++. Yet post is also tagged C which this answer does not break aliasing restrictions. Given `fscanf()`, OP code smells more like C than C++. – chux - Reinstate Monica Oct 18 '17 at 16:28
2

There's no guarantee.

You can hope that the compiler uses a high quality algorithm for the conversion of literals, and that the standard library implementation uses a high quality conversion as well, and two high quality algorithms should agree quite often.

It's also possible that both use the exact same algorithm (for example, the compiler converts the literal by putting the characters into a char array and calling sscanf).

By the way, I once had a bug caused by a compiler that didn't convert the literal 999999999.5 exactly. Replacing it with 9999999995 / 10.0 fixed it.
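The workaround can be checked on a given compiler with a sketch like the one below. It assumes IEEE-754 doubles with correctly rounded division, and `workaround_matches_literal` is a hypothetical name. 999999999.5 is exactly representable as a double (it is 1999999999 / 2), so a correct literal conversion and the division should yield the same value.

```c
/* 1 when the literal and the integer-divided-by-ten form give the same double. */
int workaround_matches_literal(void)
{
    volatile double direct = 999999999.5;          /* converted by the compiler */
    volatile double divided = 9999999995 / 10.0;   /* exact integer, then one division */
    return direct == divided;
}
```

A result of 0 would point to an inexact literal conversion like the one described above.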

gnasher729