
Code

#include <stdio.h>
#include <limits.h>
#include <float.h>

int f(double x, double y, double z) {
    return (x + y) + z == x + (y + z);
}

int ff(long long x, long long y, long long z) {
    return (x + y) + z == x + (y + z);
}

int main()
{
    printf("%d\n", f(DBL_MAX, DBL_MAX, -DBL_MAX));
    printf("%d\n", ff(LLONG_MAX, LLONG_MAX, -LLONG_MAX));
    return 0;
}

Output

0
1

I am unable to understand why both functions work differently. What is happening here?

dazzieta
    Both functions cause undefined behavior. – 2501 Aug 08 '16 at 09:53
    @2501 but why undefined behavior? – dazzieta Aug 08 '16 at 09:55
    overflow of signed vars is UB. – LPs Aug 08 '16 at 09:56
    Your additions are going beyond the numeric limits: what you do is undefined behavior. The binary encodings of double and long long are different, so it can be considered "normal" to get different results here; but again, your code has undefined behavior and should be considered unpredictable – rocambille Aug 08 '16 at 09:57
  • @utkarsh13 What would you expect the result to be if you add the largest possible number and the largest possible number? – molbdnilo Aug 08 '16 at 09:57
  • @utkarsh13 Minor correction, the function f will be defined if you're using IEEE 754. – 2501 Aug 08 '16 at 10:00
  • Also even if you used smaller doubles you might still get a false result when calling f() because order matters when it comes to how double expressions are rounded. You generally can't == compare doubles. – chasep255 Aug 08 '16 at 10:00
  • @molbdnilo when I print it out, it shows `inf`. – dazzieta Aug 08 '16 at 10:00
  • @chasep255 Yes, you are right I might get false, but I tried it for some small values and result is true. – dazzieta Aug 08 '16 at 10:03
  • try things like `f(1.1, 2.345, 4.22)`, not small values that can be represented exactly in a binary floating-point type – phuclv Aug 08 '16 at 11:06

1 Answer


In the eyes of both the C++ and the C standard, the integer version definitely invokes Undefined Behavior, and the floating point version potentially does, because the result of the computation x + y is not representable in the type the arithmetic is performed on. So both functions may yield anything, or even do anything.

However, many real world platforms offer additional guarantees for floating point operations and implement integers in a certain way that lets us explain the results you get.

Considering f, we note that many popular platforms implement floating point math as described in IEEE 754. Following the rules of that standard, we get for the LHS:

DBL_MAX + DBL_MAX = INF

and

INF - DBL_MAX = INF.

The RHS yields

DBL_MAX - DBL_MAX = 0

and

DBL_MAX + 0 = DBL_MAX

and thus LHS != RHS.
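
For concreteness, here is a minimal sketch of the two evaluation orders. It assumes IEEE 754 doubles (so both additions are actually well defined) and is purely illustrative:

#include <stdio.h>
#include <float.h>

int main()
{
    double x = DBL_MAX, y = DBL_MAX, z = -DBL_MAX;

    double lhs = (x + y) + z; // DBL_MAX + DBL_MAX -> inf, then inf + (-DBL_MAX) -> inf
    double rhs = x + (y + z); // DBL_MAX + (-DBL_MAX) -> 0, then DBL_MAX + 0 -> DBL_MAX

    printf("lhs = %g\n", lhs); // inf
    printf("rhs = %g\n", rhs); // 1.79769e+308, i.e. DBL_MAX
    return 0;
}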

Moving on to ff: many platforms perform signed integer computation in two's complement. Two's complement addition is associative, so the comparison will yield true as long as the optimizer does not turn it into something that contradicts two's complement rules.

The latter is entirely possible (for example, see this discussion), so you cannot rely on signed integer overflow doing what I described above. However, the compiler apparently "was nice" in this case.
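
To illustrate how an optimizer may exploit that freedom, here is a small sketch; whether the comparison is actually folded away depends entirely on the compiler and flags, so treat the comments as typical behavior, not a guarantee:

#include <limits.h>
#include <stdio.h>

// "Is x + 1 still greater than x?" For signed int this is UB when x == INT_MAX.
// With optimizations, many compilers assume signed overflow cannot happen and
// fold this to a constant 1, which contradicts plain two's complement wraparound
// (where INT_MAX + 1 would be INT_MIN, and INT_MIN > INT_MAX is false).
int no_overflow(int x)
{
    return x + 1 > x;
}

int main()
{
    printf("%d\n", no_overflow(INT_MAX)); // often 1 with -O2, often 0 without
    return 0;
}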


Note that none of this applies to unsigned integer arithmetic. In C++, unsigned integers implement arithmetic modulo 2^NumBits, where NumBits is the number of bits of the type. In this arithmetic, every integer is represented by picking the representative of its equivalence class in [0, 2^NumBits - 1], so this arithmetic can never overflow.
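
As a contrast to ff, a hypothetical unsigned counterpart (call it ffu; it is not part of the original question) has no undefined behavior and returns 1 for every input:

#include <limits.h>
#include <stdio.h>

// Unsigned arithmetic wraps modulo 2^NumBits, and modular addition is
// associative, so this comparison is well defined and always true.
int ffu(unsigned long long x, unsigned long long y, unsigned long long z)
{
    return (x + y) + z == x + (y + z);
}

int main()
{
    printf("%d\n", ffu(ULLONG_MAX, ULLONG_MAX, 1)); // 1
    return 0;
}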

For those doubting that the floating point case is potential UB: N4140 5/4 [expr] says

If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined.

which is the case here. The inf and NaN behavior is allowed, but not required, in C++ and C floating point math. It is only required if std::numeric_limits<T>::is_iec559 is true for the floating point type in question (or, in C, if the implementation defines __STDC_IEC_559__; otherwise the Annex F rules need not apply). If either of these IEC indicators guarantees us IEEE semantics, the behavior is well defined to do what I described above.
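
If you want to check which case you are in, a quick probe (nothing more than that) is:

#include <stdio.h>
#include <limits>

int main()
{
    // Prints 1 if double follows IEC 60559 / IEEE 754; in that case
    // DBL_MAX + DBL_MAX is defined to yield +infinity and the analysis
    // of f above is guaranteed rather than merely typical.
    printf("%d\n", std::numeric_limits<double>::is_iec559);
    return 0;
}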

Baum mit Augen
    Surely the floating point arithmetic is not undefined behaviour? Dealing with infinity and NaN values is well defined in floating point types, is it not? – pmdj Aug 08 '16 at 10:38
  • @pmdj [It is](https://stackoverflow.com/questions/17588419/when-a-float-variable-goes-out-of-the-float-limits-what-happens), which allows `-ffast-math` for example. `INF` and `NaN` come from IEEE floating point, not standard C++, which allows but doesn't require them. – Baum mit Augen Aug 08 '16 at 10:39
  • "not even *really* reliable UB" - what does that mean? – anatolyg Aug 08 '16 at 10:53
  • @anatolyg As I explained, the optimizer may and in some cases will cause results that differ from strict twos complement, like optimizing out overflow asserts. – Baum mit Augen Aug 08 '16 at 10:56
  • I'm not really understanding this. In C99 at least, it says in F.3: "C operators and functions provide IEC 60559 required and recommended facilities as listed below. — The +, −, *, and / operators provide the IEC 60559 add, subtract, multiply, and divide operations." which suggests to me that it is in fact not undefined behavior despite the same "representable" quote existing in §6.5/5. I would also remove the integer overflow stuff from the answer because I really think it distracts from the point. – uh oh somebody needs a pupper Aug 08 '16 at 11:09
    @uhohsomebodyneedsapupper *"An implementation that defines __STDC_IEC_559__ shall conform to the specifications in this annex."* in F1. If it doesn't, it need not follow IEEE. (I hope n1256 is a reasonable doc, just found that by googling.) – Baum mit Augen Aug 08 '16 at 11:17
  • @uhohsomebodyneedsapupper For the integer stuff: *"I am unable to understand why both functions work differently."* So I have to explain how they work to explain why they are different. (The meta-comment about my footnotes I wrote before was dumb, sry) – Baum mit Augen Aug 08 '16 at 11:25
  • C11 draft standard n1570 `5.2.4.2.2 Characteristics of floating types ` does not require `__STDC_IEC_559__` as far as I can tell, and makes the overflow not undefined if `double` can represent infinities. – EOF Aug 08 '16 at 11:54
  • @BaummitAugen I feel that you are trying to say something important here, but fail to understand what. You say that signed integer overflow is not even *really* reliable UB. There is a bunch of adjectives that don't make sense to me. Signed integer overflow is UB, that's OK. But *reliable* UB? *Really* reliable UB? *Not really* reliable UB? Please explain what these expressions mean. Please take into account that I (like some other people here) am not a native English speaker and don't always understand modern jargon. – anatolyg Aug 08 '16 at 12:01
    @EOF *"if `double` can represent infinities"* Which it may or may not if we don't have `__STDC_IEC_559__`. – Baum mit Augen Aug 08 '16 at 12:05
  • @anatolyg I'm not a native speaker either, maybe that's why it's unclear. I'll try to improve it. – Baum mit Augen Aug 08 '16 at 12:06
  • Floating point overflow isn't undefined behavior in C, see the [C89](http://port70.net/~nsz/c/c89/c89-draft.html#4.5.1) and [C99](http://port70.net/~nsz/c/c99/n1256.html#7.12.1p4) spec. – nwellnhof Aug 08 '16 at 12:38
  • @nwellnhof Those quotes don't apply here because the operator `+` is not a function in `<math.h>`. Also note that `HUGE_VAL` etc. (which are returned on overflow by a `<math.h>` function) only need to expand to some infinity if `__STDC_IEC_559__` is defined. In general, they expand to some positive constant expression. (See 7.12/2 in N1256 and note again that Annex F only applies if the aforementioned macro is defined.) – Baum mit Augen Aug 08 '16 at 12:47
  • I agree with most of the answer, except the reliable ub. I think the answer would be better without this concept. – 2501 Aug 08 '16 at 15:27
  • @2501 The problem is probably in my English. If you can find a better wording for what's going on, please tell me. Maybe "explainable" instead of "reliable"? – Baum mit Augen Aug 08 '16 at 15:35
  • In my opinion any type of ub is ub, and for example, one shouldn't count on dereferencing NULL to segfault. It may happen, but a future version of the compiler or the optimizer may do whatever they want if they detect it. – 2501 Aug 08 '16 at 15:40
  • @2501 Well yeah, I explicitly mentioned this in my answer. Still, one can explain the effects of UB and often enough learn a thing or two about the implementations. – Baum mit Augen Aug 08 '16 at 15:44
  • @BaummitAugen i agree. Perhaps change the wording to explainable ub or something like that. – 2501 Aug 08 '16 at 15:46