11

I'm currently reading through the excellent paper *Library for Double-Double and Quad-Double Arithmetic*, and in the first few lines I notice they perform a sum in the following way:

#include <utility>

// TwoSum (due to Knuth/Møller): s is the correctly rounded sum and e is
// the exact rounding error, so that a + b == s + e (barring overflow).
std::pair<double, double> TwoSum(double a, double b)
{
    double s = a + b;
    double v = s - a;                   // the part of b actually absorbed into s
    double e = (a - (s - v)) + (b - v); // what was lost from a, plus what was lost from b
    return std::make_pair(s, e);
}

The calculation of the error, e, relies on the fact that the calculation follows that order of operations exactly because of the non-associative properties of IEEE-754 floating point math.

If I compile this within a modern optimizing C++ compiler (e.g. MSVC or gcc), can I be ensured that the compiler won't optimize out the way this calculation is done?

Secondly, is this guaranteed anywhere within the C++ standard?

Mike Bailey
  • 12,479
  • 14
  • 66
  • 123
  • 2
    This ain't Javascript. If something as simple as that weren't well-defined by the spec, C++ programmers would scream bloody murder. But since compilers and their optimizers are written by humans (for the moment), I'd do calculations a few isomorphic ways and ensure it came out the same before feeding the output to a neurosurgery robot. For most other purposes, just assume it will behave predictably. – HostileFork says dont trust SE Oct 05 '11 at 02:36
  • @HostileFork: Well thankfully this won't be used in any critical applications such as that :) I'm reading through this for my own learning. – Mike Bailey Oct 05 '11 at 02:39
  • I'm going to test that code using different compiler optimization settings and see. :) But my bet is that the results will be the same regardless. –  Oct 05 '11 at 03:26

10 Answers

7

You might like to look at the g++ manual page: http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Optimize-Options.html#Optimize-Options

Particularly `-fassociative-math`, `-ffast-math` and `-ffloat-store`.

According to the g++ manual it will not reorder your expression unless you specifically request it.

Alexis Wilke
  • 19,179
  • 10
  • 84
  • 156
teambob
  • 1,994
  • 14
  • 19
  • I do not see any compiler option that guarantees consistent FP behaviour. – curiousguy Oct 05 '11 at 03:16
  • 1
    "The following options control compiler behavior regarding floating point arithmetic. These options trade off between speed and correctness. **All must be specifically enabled.**" – teambob Oct 05 '11 at 03:22
  • 1
    @teambob: Only true for gcc. The Intel C++ compiler defaults to `-fp-model fast=1`. – Ben Voigt Oct 05 '11 at 03:26
  • @BenVoigt What is "true of gcc"? – curiousguy Oct 05 '11 at 03:30
  • @curiousguy: That value-changing optimizations are disabled unless you specifically request them. That's true for gcc and MSVC, but ICC defaults to optimizations on. – Ben Voigt Oct 05 '11 at 03:34
  • @BenVoigt Are you saying that as long as you don't specifically request these specific value-changing optimizations, FP behaviour of gcc and MSVC will not change depending on optimisation level and the phase of the moon? – curiousguy Oct 05 '11 at 03:41
  • @curiousguy: Won't change based on phase of the moon. On those compilers, these optimizations are only performed if you turn them on. It seems self-evident that optimizations will be enabled by increasing optimization level. Are you referring to the optimization bundle flags `-O1`, `-O2`, `-Os`, etc? That linked gcc doc page says that the `-Ofast` bundle enables `-ffast-math`. – Ben Voigt Oct 05 '11 at 03:44
  • 1
    @BenVoigt: Yes, the information was specifically for gcc/g++. Thanks for the interesting information about ICC. – teambob Oct 05 '11 at 03:52
  • @BenVoigt As you say, the default on GCC is that **`-ffloat-store` is OFF**. So the default is **not safe**. "_Won't change based on phase of the moon._" Are you saying that the exact same C++ statements with the exact same input values will have the exact same output values in different context in the program? That the `inline` keyword will not change program behaviour? "_It seems self-evident that optimizations will be enabled by increasing optimization level._" It seems self-evident that we only care about optimisations that might change program behaviour. – curiousguy Oct 05 '11 at 04:05
  • @curiousguy: By "won't change based on phase of the moon", I mean that if you run the program today with a set of inputs and get certain results, then you run it tomorrow with the same inputs you will get the same results -- the passage of time and other events happening concurrently don't change behavior. And arguably `-ffloat-store` is *safe*, since it preserves the limits on rounding error. OTOH `-ffast-math` is unsafe, because rounding error is relative, and reordering expression evaluation can definitely increase the magnitude of error in the result. – Ben Voigt Oct 05 '11 at 04:13
  • @BenVoigt "And arguably -ffloat-store is safe, since it preserves the limits on rounding error." That's (almost) my point: `-ffloat-store` is **less unsafe** than `-fno-float-store`. And `-ffloat-store` is **not** the default. – curiousguy Oct 05 '11 at 04:27
  • @BenVoigt "_By "won't change based on phase of the moon", I mean that if you run the program today with a set of inputs and get certain results, then you run it tomorrow with the same inputs you will get the same results -- the passage of time and other events happening concurrently don't change behavior._" Is it formally guaranteed by the ISO standard? Is this still true if I recompile the program? If I recompile with/without inlining? (I recompile without changing any specifically FP-related options.) – curiousguy Oct 05 '11 at 04:31
  • @curiousguy: I mean that `-fno-float-store` is also arguably safe, since rounding is still within 1 *ulp*. And no, phase of the moon means (1) passage of time and/or (2) concurrent events. It doesn't mean recompiling with different options. Recompiling with different options definitely could change behavior. And I already commented that at least the `-Ofast` optimization bundle changes this behavior. Inlining should only affect behavior if you've chosen one of the "fast" models. – Ben Voigt Oct 05 '11 at 04:37
  • @BenVoigt "_And no, phase of the moon means (1) passage of time and/or (2) concurrent events. It doesn't mean recompiling with different options._" To me it means unrelated changes, like `-O2` or marking a function `inline`. "_Inlining should only affect behavior if you've chosen one of the "fast" models._" Do you agree that if `inline` can change FP output the compiling mode is FP unsafe? – curiousguy Oct 05 '11 at 04:56
  • @curiousguy: I would say that if `inline` can cause the FP output to change beyond the precision of the type, the model is unsafe. If the result just changes from the "nearest above" representation to the "nearest below", or vice versa, that's not unsafe. – Ben Voigt Oct 05 '11 at 05:01
  • @BenVoigt, @curiousguy: Seriously, both of you. Turn on SSE2 codegen and stop worrying about `-ffloat-store`. – Stephen Canon Oct 05 '11 at 14:37
6

Yes, that is safe (at least in this case). You only use two kinds of "operator" there: the parenthesized primary expression `( something )` and the binary additive `something + something` / `something - something`.

Section 1.9 Program execution (of C++0x N3092) states:

Operators can be regrouped according to the usual mathematical rules only where the operators really are associative or commutative.

In terms of the grouping, 5.1 Primary expressions states:

A parenthesized expression is a primary expression whose type and value are identical to those of the enclosed expression. ... The parenthesized expression can be used in exactly the same contexts as those where the enclosed expression can be used, and with the same meaning, except as otherwise indicated.

I believe the use of the word "identical" in that quote requires a conforming implementation to guarantee that it will be executed in the specified order unless another order can give the exact same results.

And for adding and subtracting, section 5.7 Additive operators has:

The additive operators + and - group left-to-right.

So the standard dictates the results. If the compiler can ascertain that the same results can be obtained with different ordering of the operations then it may re-arrange them. But whether this happens or not, you will not be able to discern a difference.

paxdiablo
  • 854,327
  • 234
  • 1,573
  • 1,953
  • Unfortunately, I've seen many times different compiler behavior w.r.t. floating-point code optimization and I would not want to rely on bugs or absence thereof in the compiler implementation. And I'm not sure we have a single widely spread compiler that fully conforms to the standard. Standards are good. Bugs are bad. – Alexey Frunze Oct 05 '11 at 03:14
  • 1
    "_So the standard dictates the results._" If so, then gcc does not provides any option to turn on standard behaviour. – curiousguy Oct 05 '11 at 03:18
  • Pretty sure that's wrong. The compiler is allowed to generate any code that yields within +/- 1 LSB of the correct result, IIRC. – Ben Voigt Oct 05 '11 at 03:23
  • @Ben, if you can find a citation for that, I'd accept it. I have quoted the standard and believe that to be the definitive reference. – paxdiablo Oct 05 '11 at 04:11
  • @BenVoigt "_+/- 1 LSB of the correct result_" Not sure what this means. Is what are the allowed results of `((1.+double_epsilon/2)-1.)`? – curiousguy Oct 05 '11 at 04:12
  • @paxdiablo Which quote defines the result of FP arithmetic? – curiousguy Oct 05 '11 at 04:14
  • @curiousguy: epsilon in the C++ standard is NOT mathematical epsilon related to rounding error. In your expression, `double_epsilon/2` yields zero, by definition of epsilon as used in the standard. Remember that floating-point values are stored as exponent and mantissa. If you change the mantissa by +/-1, that's what I'm talking about. – Ben Voigt Oct 05 '11 at 04:16
  • @curious, none of those quotes specify the accuracy of the result, only the ordering, which is what the question asked. – paxdiablo Oct 05 '11 at 04:16
  • @paxdiablo: Then your last sentence would mean "not permitted, if the results would differ *by more than the required accuracy*"? In that case I agree 100%. – Ben Voigt Oct 05 '11 at 04:21
  • @paxdiablo No. The question is: "If I compile this within a modern optimizing C++ compiler (e.g. MSVC or gcc), can I be ensured that the compiler won't optimize out the way this calculation is done?" The question could be rephrased: "Is this code working correctly without any special concern on most real world compilers?" and the answer is definitely no. – curiousguy Oct 05 '11 at 04:24
  • @BenVoigt "_In your expression, double_epsilon/2 yields zero, by definition of epsilon as used in the standard._" Hug? Anyway, you understand what I meant. – curiousguy Oct 05 '11 at 04:25
  • @curiousguy: Sorry, the usual notation is "+/-1 ulp" (unit in the last place). – Ben Voigt Oct 05 '11 at 04:27
  • @paxdiablo: The definition of a "rounding function", found in section 4.2 of ISO/IEC 10967 (and included by reference into C++) requires that "if $u \in R$ is between two adjacent values in X, `rnd(u)` selects one of those adjacent values". Which places an upper bound of 1 *ulp* on the error. – Ben Voigt Oct 05 '11 at 04:30
  • @paxdiablo: And wikipedia suggests that's also the accuracy required in IEEE-754: "The IEEE 754 specification—followed by all modern floating-point hardware—requires that the result of an elementary arithmetic operation (addition, subtraction, multiplication, division, and square root) be within 0.5 ULP of the mathematically exact result" http://en.wikipedia.org/wiki/Unit_in_the_last_place – Ben Voigt Oct 05 '11 at 04:33
  • @BenVoigt Do you agree that this means that only **two** possible results are allowed? – curiousguy Oct 05 '11 at 04:36
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/4011/discussion-between-curiousguy-and-ben-voigt) – curiousguy Oct 05 '11 at 04:37
  • Sorry, @Ben, I still have to maintain that my answer is correct, at least for this case (though other more complex cases may have problems, so I'll adjust the answer). There is _zero_ scope for ambiguity in the expression `(a - (s - v)) + (b - v)` and the text in my second quote (the parentheses one) explicitly uses the word "identical". – paxdiablo Oct 05 '11 at 04:45
  • @paxdiablo: The value resulting from each step allows an error of +/-1 ulp. – Ben Voigt Oct 05 '11 at 04:57
  • @paxdiablo What kind of "ambiguity"? – curiousguy Oct 05 '11 at 04:59
  • 1
    @curious et al, by ambiguity, I meant a compiler is free to evaluate `a+b-c` as either `(a+b)-c` or `a+(b-c)` provided the result is within bounds (exact for integers, not quite exact for floating point). It is _not_ allowed to evaluate the specific `(a+b)-c` in any way other than `(a+b)` first, then `-c`. From what I understand, that is the effect of the word "identical" in my second quote from the standard. And the errors at each step are irrelevant because of that - the compiler is not permitted to re-order _in this case_ so the errors will accumulate in the same way no matter what. – paxdiablo Oct 05 '11 at 05:05
  • I think we've pretty well done this discussion to death by the way. It may be best to leave it, or ask a more targeted question elsewhere on SO. – paxdiablo Oct 05 '11 at 05:06
  • @paxdiablo Do you mean that `a+b-c` gives more freedom to the compiler than `(a+b)-c`? – curiousguy Oct 05 '11 at 05:26
5

This is a very valid concern, because Intel's C++ compiler, which is very widely used, defaults to performing optimizations that can change the result.

See http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/lin/compiler_c/copts/common_options/option_fp_model.htm#option_fp_model

Ben Voigt
  • 277,958
  • 43
  • 419
  • 720
  • Because it's not an answer? –  Oct 25 '17 at 06:42
  • 1
    @Tibo: Look closer. I said *which* compiler will not preserve the result, *what command-line option* controls it, and that the default is potentially problematic. All of that is right there in my answer, you only need to click the link if you need greater detail. – Ben Voigt Oct 25 '17 at 14:35
3

I would be quite surprised if any compiler wrongly assumed associativity of arithmetic operators with default optimising options.

But be wary of extended precision of FP registers.

Consult compiler documentation on how to ensure that FP values do not have extended precision.

curiousguy
  • 8,038
  • 2
  • 40
  • 58
1

If you really need to, I think you can make a `noinline` function `no_reorder(float x) { return x; }` and then use it instead of parentheses. Obviously, it's not a particularly efficient solution, though.

Anton Knyazyev
  • 464
  • 4
  • 7
0

In general, you should be able to -- the optimizer should be aware of the properties of the actual floating-point operations.

That said, I'd test the hell out of the compiler I was using.

Charlie Martin
  • 110,348
  • 25
  • 193
  • 263
0

Yes. The compiler will not change the order of your calculations within a block like that.

  • In the specifications for the compiler. If you write `x = (1 + 2) * 3;` the result will be 9 on any worthy compiler. Period. To build on the OP's question: if you have a block of statements and one is `a = b * c; d = a * e;` the compiler will do them in that order. What if `a` wasn't initialized and it tried to compile the second statement first? –  Oct 05 '11 at 02:49
  • 3
    @ephaitch: But what about `x = 1 + 1e25 - 1e25;` What do you expect from your "any worthy compiler"? – Ben Voigt Oct 05 '11 at 02:53
  • "_What if a wasn't initialized_" We are not talking about that. We are talking specifically about FP behaviour. – curiousguy Oct 05 '11 at 02:55
  • @Ben Voigt, that will be dependent on the compiler because of decimal precision. You may get 0 you may get 0.9...... or 1.0.......1 or something like that. But isn't that a different question? –  Oct 05 '11 at 02:58
  • 1
    @ephaitch: No, this question is, in fact, precisely about the behavior of decimal precision in arithmetic. – GManNickG Oct 05 '11 at 03:02
  • 1
    Well, then I'm confused, so I'll stay out of this conversation. Because in re-reading the OP's post, he is asking if he can be assured that "the calculation follows that order of operations" once compiled. –  Oct 05 '11 at 03:06
  • "_he is asking if he can be assured that_ (...)" He wants to know if the given code is guaranteed to work correctly. – curiousguy Oct 05 '11 at 03:08
  • @epihaitch: *decimal* precision? Where'd decimal come into this? And the possible results are: 0.0 (exactly), 1.0 (exactly), or 2.0 (exactly). And it depends on the order of operations. `1 + 1e25 - 1e25` quite possibly gives a different result than `1e25 - 1e25 + 1`, the latter is quite definitely 1.0 (exactly). – Ben Voigt Oct 05 '11 at 03:08
  • One last comment.. I was trying to say if the programmer wrote x = 1 + (1e25 - 1e25), that it will be compiled as written (as far as the order of operations). The result of that will be 1.0. If I am wrong, I apologize. Within every set of parenthesis, the OP has only two variables. The compiler WILL NOT change a - b to b - a. That would become a different value. –  Oct 05 '11 at 03:19
  • @epihatch: Of course it won't change `a-b` to `b-a`, that's not a valid transformation under normal mathematical rules. But transforming `(a+b)-c` to `a+(b-c)` is valid under mathematics, and the compiler might make that transformation if `a` is a value just being calculated, to allow better pipelining. – Ben Voigt Oct 05 '11 at 03:32
  • @BenVoigt "_And the possible results are: 0.0 (exactly), 1.0 (exactly), or 2.0 (exactly)._" How do you derive that? – curiousguy Oct 05 '11 at 03:34
  • @curiousguy: Just knowing how binary number systems work. `(1+x)` (where `x` is the rounded representation of the literal `1e25`) might be exactly representable. If not, then if and only if there's one binary digit too few, then both `x` and `x+2` are representation, being equidistant either one can be chosen. Otherwise `x` is the unique closest representable value. – Ben Voigt Oct 05 '11 at 03:40
0

Between compiler optimizations and out-of-order execution on the processor, it is almost a guarantee that things will not happen exactly as you ordered them.

HOWEVER, it is also guaranteed that this will NEVER change the result. C++ follows standard order of operations and all optimizations preserve this behavior.

Bottom line: Don't worry about it. Write your C++ code to be mathematically correct and trust the compiler. If anything goes wrong, the problem was almost certainly not the compiler.

riwalk
  • 14,033
  • 6
  • 51
  • 68
  • 1
    Actually, because of the extended precision of the registers of the fp unit (notably on x86), a compiler would have to store intermediate results in memory to get correctly rounded fp results. Many don't! – curiousguy Oct 05 '11 at 02:45
  • @curiosguy, can you elaborate? I've always been under the assumption that all FP operations simply use the guard bit when shifting the operands in order to reduce rounding error (as is specified in the IEEE standard), but the extra information in the guard bit is discarded after the final rounding. – riwalk Oct 05 '11 at 02:48
  • 4
    @Stargazer712, there are many examples of programs that give different results depending on whether their intermediate results were kept in an 80-bit floating point register or if they were stored to a 64-bit double. See http://stackoverflow.com/questions/7517588/is-this-an-g-optimization-bug for one example. – Mark Ransom Oct 05 '11 at 02:53
  • @curiousguy, Stargazer712: The extra precision of the FPU does reduce rounding error. And it will not be what is called out by the IEEE-754 spec. But that does not make it "incorrect", because the C++ spec does not require exact IEEE-754 results. Instead it refers to ISO/IEC 10967. – Ben Voigt Oct 05 '11 at 03:02
  • Basic operations should be done with **infinite precision**, then the result rounded, here to double precision, according to current rounding mode. But x86 FP registers have **excess precision** (excess mantissa bits and excess exponent bits). By default, x86 FP is non conformant. You have to store/load the results to get the wanted precision. – curiousguy Oct 05 '11 at 03:04
  • @BenVoigt "_the C++ spec does not require exact IEEE-754 results_" Where does C++ allow non deterministic FP results? – curiousguy Oct 05 '11 at 03:05
  • @curiousguy: Does any of this change for a specific compiler if you change the optimization level? That is the OP's question! – jman Oct 05 '11 at 03:07
  • 1
    "_Does any of this change for a specific compiler if you change the optimization level?_" I don't know any compiler that is not broken by default on x86. – curiousguy Oct 05 '11 at 03:11
  • @curiousguy: More *implementation-defined* than *non-deterministic*, I should think. – Ben Voigt Oct 05 '11 at 03:12
  • It's made implementation defined here: "There are three floating point types: `float`, `double`, and `long double`. The type `double` provides at least as much precision as `float`, and the type `long double` provides at least as much precision as `double`. The set of values of the type `float` is a subset of the set of values of the type `double`; the set of values of the type `double` is a subset of the set of values of the type `long double`. The value representation of floating-point types is implementation-defined." section 3.9.1 `[basic.fundamental]` – Ben Voigt Oct 05 '11 at 03:14
  • @skjaidev: Absolutely yes. See [MSVC `/fp`](http://msdn.microsoft.com/en-us/library/e7s85ffb.aspx) and the [Intel options](http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/lin/compiler_c/fpops/common/fpops_fp_arch.htm) and gcc's `-ffast-math` and `-ffloat-store` – Ben Voigt Oct 05 '11 at 03:17
  • Thanks Ben, You and / or curiousguy should have written a detailed answer, and then pointed to that in the comments on other answers ;-) – jman Oct 06 '11 at 21:17
-1

As per the other answers, you should be able to rely on the compiler doing the right thing -- most compilers allow you to compile and inspect the assembler (use `-S` for gcc) -- you may want to do that to make sure you get the order of operations you expect.

Different optimization levels (in gcc, `-O1`, `-O2`, etc.) allow code to be re-arranged (however, sequential code like this is unlikely to be affected) -- but I would suggest you isolate that particular part of the code into a separate file, so that you can control the optimization level for just that calculation.

Soren
  • 14,402
  • 4
  • 41
  • 67
  • 1
    "_As per the other answers you should be able to rely on the compiler doing the right thing_" As per the other comments, you should not. – curiousguy Oct 05 '11 at 03:23
-1

The short answer is: the compiler will probably change the order of your calculations, but it will never change the behavior of your program (unless your code makes use of expressions with undefined behavior: http://blog.regehr.org/archives/213)

However, you can still influence this behavior by deactivating all compiler optimizations (option `-O0` with gcc). If you still need the compiler to optimize the rest of your code, you may put this function in a separate `.cpp` file which you compile with `-O0`. Additionally, you can use some hacks. For instance, if you interleave your code with extern function calls, the compiler may consider it unsafe to re-order your code, as the functions may have unknown side effects. Calling `printf` to print the values of your intermediate results will lead to similar behavior.

Anyway, unless you have a very good reason (e.g. debugging), you typically don't want to care about that, and you should trust the compiler.

Antoine Trouve
  • 1,198
  • 10
  • 21
  • "_it will never change the behavior of your program_" This is false. **FP behaviour** is not consistent on most compiler by default. And apparently GCC does not even provide a standard conformant mode! – curiousguy Oct 05 '11 at 03:21