
According to this post, when comparing a float and a double, the float should be treated as double. The following program does not seem to follow this statement, and its behaviour looks quite unpredictable. Here is my program:

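#include <cstdio>
#include <iostream>
using namespace std;

int main()
{
    double a = 1.1;  // 1.5
    float b = 1.1;   // 1.5
    // Note: %X expects an unsigned int, so passing a double here is undefined behaviour.
    printf("%X  %X\n", a, b);
    if (a == b)
        cout << "success" << endl;
    else
        cout << "fail" << endl;
}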
  • When I run this program, I get "fail" displayed.
  • However, when I change a and b to 1.5, it displays "success".

I have also printed the hex notations of the values. They differ in both cases. My compiler is Visual Studio 2005.

Can you explain this output? Thanks.

user1414696
  • Welcome to the world of floating point. – Jun 27 '13 at 12:28
  • It's a rounding issue, and this is exactly the reason why you generally don't want to use `==` for comparing floating point numbers. – Violet Giraffe Jun 27 '13 at 12:28
  • "when comparing a float and a double, the float should be treated as double" **You misunderstand.** The literal value `1.23` -- with no suffix -- is interpreted by the compiler as being a `double`. To specify a `float` literal, you must use the `f` suffix, as with `1.23f`. – John Dibling Jun 27 '13 at 13:00
  • @JohnDibling - when the value `1.23` is stored in a `float` (as in `float b = 1.1;`), it gets converted to `float`, pretty much as if the constant had been written as `1.23f`. The problem in the question has nothing to do with this suffix. – Pete Becker Jun 27 '13 at 13:36
  • @JohnDibling You misunderstand. 1.1 (or 1.5) is converted to float when b is initialized; making it 1.1f or 1.5f wouldn't change anything. The problem is that 1.1 is not exactly expressible, so 1.1f (or (float)1.1) != 1.1 – Jim Balter Jun 27 '13 at 13:36
  • @PeteBecker: I'm well aware that the value is converted to a `float`. I'm also well aware that this has nothing to do with why OP is having problems. Maybe that's why I posted this a comment. It seems quite clear to me that OP did misunderstand the meaning of the linked post. – John Dibling Jun 27 '13 at 13:40
  • I suspect Pete Becker and Jim Balter may be addressing the initialization of objects with numerals while John Dibling is speaking to the fact that an answer of the question referred to in the question used `0.7` rather than `0.7f` in the code. Thus, the OP of this question misunderstood the answer, which states that writing `f == 0.7f` could have obtained the behavior desired in that case. – Eric Postpischil Jun 27 '13 at 13:42
  • @JohnDibling - your comment says "You misunderstand" in reference to the statement: "when comparing a float and a double, the float should be treated as double". The latter is absolutely correct. It is not a misunderstanding. – Pete Becker Jun 27 '13 at 13:42
  • @JohnDibling You're wrong; Pete and I are right. – Jim Balter Jun 27 '13 at 13:43
  • @JimBalter: OK, good for you. – John Dibling Jun 27 '13 at 13:44
  • @EricPostpischil The cited post has exactly the same problem as here ... a double value that cannot be exactly represented is converted to float and then compared to the original double value. – Jim Balter Jun 27 '13 at 13:45
  • @EricPostpischil - that's an interesting reading, but I do not think it's warranted by the words in the actual question. – Pete Becker Jun 27 '13 at 13:46
  • @EricPostpischil There's no point of contention, just some confused people. – Jim Balter Jun 27 '13 at 13:47
  • @PeterBecker: Let me try again. There is a misunderstanding. The OP of this question interpreted the other answer as stating that “when comparing a float and a double, the float should be treated as double” (their words). But the other answer states that “`0.7` is treated as a double” (its words, my markup to distinguish source text), not that the float should be treated as a double. John Dibling is correct to explain that the source of the error, in the other problem, is essentially that `0.7` was written instead of `0.7f`. – Eric Postpischil Jun 27 '13 at 13:50
  • @JimBalter: You will not succeed at contending there is no contention. – Eric Postpischil Jun 27 '13 at 13:50
  • You're confused about what counts as success of an action. – Jim Balter Jun 27 '13 at 13:51
  • @EricPostpischil *the other answer states that “0.7 is treated as a double"* -- Unlike JD, you've actually pointed out the relevant context. I suppose this is what JD meant, but it was nearly inscrutable as expressed. (Not entirely inscrutable, since you managed to figure it out.) JD quoted the OP's own words, which happen to be correct, even though they are a misstatement of other words. – Jim Balter Jun 27 '13 at 13:57
  • @eric: thanks for helping me to clarify. I thought my comment was clear, but obviously people had some difficulty with it. – John Dibling Jun 27 '13 at 14:19
  • The post http://blog.frama-c.com/index.php?post/2011/11/08/Floating-point-quiz contains more examples in the vein of `d == f`, `d == 0.7f`, … – Pascal Cuoq Jun 27 '13 at 17:57
  • @PeteBecker: The main reason for my lack of understanding was that I was not aware that certain floats/doubles are not representable in computers, and that the computer makes some approximations. Now, it is clarified. Thanks. – user1414696 Jun 27 '13 at 18:11

4 Answers

float f = 1.1;
double d = 1.1;
if (f == d)

In this comparison, the value of f is promoted to type double. The problem you're seeing isn't in the comparison, but in the initialization. 1.1 can't be represented exactly as a floating-point value, so the values stored in f and d are the nearest value that can be represented. But float and double are different sizes, so have a different number of significant bits. When the value in f is promoted to double, there's no way to get back the extra bits that were lost when the value was stored, so you end up with all zeros in the extra bits. Those zero bits don't match the bits in d, so the comparison is false. And the reason the comparison succeeds with 1.5 is that 1.5 can be represented exactly as a float and as a double; it has a bunch of zeros in its low bits, so when the promotion adds zeros the result is the same as the double representation.

Pete Becker
  • Just stumbled over this, thought it fits, why not share: http://randomascii.wordpress.com/2012/06/26/doubles-are-not-floats-so-dont-compare-them/ – x29a Jan 30 '14 at 12:22

I found a decent explanation of the problem you are experiencing as well as some solutions.

See How dangerous is it to compare floating point values?

Just a side note: remember that some values cannot be represented exactly in IEEE 754 floating point. Your same example using a value of, say, 1.5 would compare as you expect because 1.5 has an exact representation with no loss of data. However, the 32-bit and 64-bit representations of 1.1 are in fact different values because IEEE 754 binary floating point cannot represent 1.1 exactly.

See http://www.binaryconvert.com

double a = 1.1 --> 0x3FF199999999999A

Approximate representation = 1.10000000000000008881784197001

float  b = 1.1 --> 0x3f8ccccd

Approximate representation = 1.10000002384185791015625

As you can see, the two values are different.

Also, unless you are working in some limited memory type environment, it's somewhat pointless to use floats. Just use doubles and save yourself the headaches.

If you are not clear on why some values cannot be accurately represented, consult a tutorial on how to convert a decimal to floating point.

Here's one: http://class.ece.iastate.edu/arun/CprE281_F05/ieee754/ie5.html

ffhaddad
  • I'll update the answer to reflect the point. Thanks! – ffhaddad Jun 27 '13 at 13:50
  • I find the hexadecimal notation partly helpful, because these are almost the same bits, but the 30 last ones... For example, float f=1.1f and double d=1.1f would compare pretty much equal whatever their different bits 0x3FF19999A0000000 and 0x3F8CCCCD. The only problem here is that 1.1 (the double) cannot be represented exactly in single precision float because the last (53-24=29) bits of the significand are not zero. – aka.nice Jun 27 '13 at 18:10

I would regard code which directly performs a comparison between a float and a double without a typecast to be broken; even if the language spec says that the float will be implicitly converted, there are two different ways that the comparison might sensibly be performed, and neither is sufficiently dominant to really justify a "silent" default behavior (i.e. one which compiles without generating a warning). If one wants to perform a conversion by having both operands evaluated as double, I would suggest adding an explicit type cast to make one's intentions clear. In most cases other than tests to see whether a particular double->float conversion will be reversible without loss of precision, however, I suspect that comparison between float values is probably more appropriate.

Fundamentally, when comparing floating-point values X and Y of any sort, one should regard comparisons as indicating that X or Y is larger, or that the numbers are "indistinguishable". A comparison which shows X is larger should be taken to indicate that the number that Y is supposed to represent is probably smaller than X or close to X. A comparison that says the numbers are indistinguishable means exactly that. If one views things in such fashion, comparisons performed by casting to float may not be as "informative" as those done with double, but are less likely to yield results that are just plain wrong. By comparison, consider:

double x, y;
float f = x;

If one compares f and y, it's possible that what one is interested in is how y compares with the value of x rounded to a float, but it's more likely that what one really wants to know is whether, knowing the rounded value of x, one can say anything about the relationship between x and y. If x is 0.1 and y is 0.2, f will have enough information to say whether x is larger than y; if y is 0.100000001, it will not. In the latter case, if both operands are cast to double, the comparison will erroneously imply that x was larger; if they are both cast to float, the comparison will report them as indistinguishable. Note that comparison results when casting both operands to double may be erroneous not only when values are within a part per million; they may be off by hundreds of orders of magnitude, such as if x=1e40 and y=1e300. Compare f and y as float and they'll compare indistinguishable; compare them as double and the smaller value will erroneously compare larger.

supercat

The rounding error occurs with 1.1 and not with 1.5 because of the number of bits required to represent a number like 0.1 in binary floating-point format. In fact, an exact representation is not possible with any finite number of bits.

See How To Represent 0.1 In Floating Point Arithmetic And Decimal for an example, particularly the answer by @paxdiablo.

Adrian G