74

The C++ Programming Language, 4th edition, page 225 reads: "A compiler may reorder code to improve performance as long as the result is identical to that of the simple order of execution." Some compilers, e.g. Visual C++ in release mode, will reorder this code:

#include <time.h>
...
auto t0 = clock();
auto r  = veryLongComputation();
auto t1 = clock();

std::cout << r << "  time: " << t1-t0 << std::endl;

into this form:

auto t0 = clock();
auto t1 = clock();
auto r  = veryLongComputation();

std::cout << r << "  time: " << t1-t0 << std::endl;

which guarantees a different result than the original code (zero vs. greater-than-zero time reported). See my other question for a detailed example. Is this behavior compliant with the C++ standard?

Paul Jurczak
  • 2
    No it is not. The compiler should expect functions to have some side effects. Some compilers have language extensions for "pure constant" functions. You could ask your compiler to show the assembler code (e.g. `g++ -O2 -S -fverbose-asm your-code.cc` with [GCC](http://gcc.gnu.org/)...) – Basile Starynkevitch Oct 04 '14 at 06:43
  • 7
    *"which guarantees different result than original code (zero vs. greater than zero time reported)"* -- It doesn't guarantee that, as far as the standard is concerned, since the standard makes no mention of how long any particular operation should take (aside from calls to [sleep](http://en.cppreference.com/mwiki/index.php?title=Special%3ASearch&search=sleep) functions). The call to `veryLongComputation()` could very well be instantaneous. – Benjamin Lindley Oct 04 '14 at 07:25
  • 1
    @BenjaminLindley `The call to veryLongComputation() could very well be instantaneous` - I disagree. There is a great number of algorithms which, given sufficiently large data, are guaranteed not to complete before the end of the solar system using any computing hardware known at the time the C++ standard was written. – Paul Jurczak Oct 04 '14 at 07:32
  • 1
    You could prevent this reordering (if we end up deciding that it is legal) by making `r` volatile. – M.M Oct 04 '14 at 07:36
  • 1
    @MattMcNabb `volatile` was the solution suggested in the question he referenced. – Michael Petch Oct 04 '14 at 07:39
  • @PaulJurczak: I fail to see how what you said disagrees with what I said. However, perhaps this will be an interesting read for you: http://stackoverflow.com/questions/3592557/optimizing-away-a-while1-in-c0x – Benjamin Lindley Oct 04 '14 at 07:40
  • @BenjaminLindley It is an interesting read, but it illustrates a special rule for an empty, infinite loop. My case involves a function call with an I/O side effect. I agree that the standard doesn't define the time cost of computations, but it defines the concept of a system clock and related utilities. Additionally, it uses _big O_ algorithmic complexity in many places (see page 894). Executing my code example sequentially will produce an arbitrarily large number, depending on my choice of algorithm, which can always be much bigger than the one produced by executing the reordered version of the code. – Paul Jurczak Oct 04 '14 at 08:01
  • @PaulJurczak: It's not a special rule for an infinite empty loop. It's a rule for any loop that satisfies certain conditions (which are listed in the question). Does your function call involve an I/O side effect? You didn't mention that in the question. – Benjamin Lindley Oct 04 '14 at 08:06
  • @BenjaminLindley Call to `veryLongComputation` returns a value, which is sent to `cout` - I thought it qualified as an I/O. – Paul Jurczak Oct 04 '14 at 08:28
  • 3
    @Paul what qualifies as I/O is the sending-to-cout part, not the computing-r part. What you are seeing is perfectly compliant. Whatever non-negative difference you see between the clock calls, possibly small enough to be rounded to 0, is fine. Imagine a program where the difference between the 2 calls to clock is such that you get 0 or 1 with roughly 50% probability, depending on when you start and what else is running. Should optimizations preserve this exact probability? – Marc Glisse Oct 04 '14 at 17:16
  • 1
    @MarcGlisse _Should optimizations preserve this exact probability_ - I agree that proper treatment of time as a side effect is a complex issue, but it can't be swept under the rug as if it doesn't exist. This behavior is user unfriendly, to say the least. The intent expressed in the code is obvious and the compiler contravenes it. It is not difficult for the optimizer to figure out that the value of `t1-t0` depends on code order and leave it alone. That's what gcc and clang do. VC++ does it for some functions and reorders for some others without an obvious pattern. – Paul Jurczak Oct 04 '14 at 19:40
  • @Paul "that's what gcc and clang do". Is it? I believe that if they don't move code across `clock`, it isn't because they think it would be wrong, it is simply that their optimization heuristics didn't find any particular reason to do it in this particular testcase, but they would happily do it on small variations. Benchmarking is not the main goal of the language, so it isn't surprising that it requires some extra work. – Marc Glisse Oct 04 '14 at 19:53
  • 2
    How could you tell the difference between the two versions from the standpoint of the abstract machine? There is no guarantee whatsoever how long a computation will take. (Practical example: the compiler could fully evaluate your computation.). This reordering would be invalid if your computation performed a side-effect with a guaranteed minimum duration such as a sleep. An IO is not enough because, again, you cannot tell the difference between the reordering and a really fast IO. The as-if rule applies. – usr Oct 04 '14 at 19:55
  • This is horrible! Is there any way to prevent the compiler from re-ordering certain calls or skipping seemingly redundant ones (such as the 2nd call to clock())? There is a **volatile** keyword for variables to prevent the compiler from "optimizing" variable access; is there an analogue of **volatile** for functions? – Michael Oct 04 '14 at 20:46
  • @Michael volatile does not prevent reorderings. It prevents reorderings among volatile variables, but other computations can be rearranged. volatile is meant for side-effecting memory addresses. – usr Oct 04 '14 at 20:53
  • @usr _compiler could fully evaluate your computation_ - this is the best explanation/excuse for this behavior presented so far, thanks. – Paul Jurczak Oct 04 '14 at 21:06
  • @MarcGlisse `Benchmarking is not the main goal of the language` I think that misses the point. It may be benchmarking in this case, it might be auto-tuning in another. Assuming benchmarking and then dismissing it as unimportant violates the principle of least surprise – Basic Oct 05 '14 at 19:00
  • @Basic: It's not very surprising at all, really. Neither function depends on any side effects of the other. They're candidates for reordering. What should disqualify them? How should the compiler know to keep the same order here, as opposed to everywhere else? Or would you propose outlawing a whole very useful class of optimizations so that benchmarks work better? – cHao Oct 06 '14 at 02:24
  • 1
    @cHao _outlawing a whole very useful class of optimizations so that benchmarks work better_ that's far from what he's suggesting. Only a small number of cases would be affected by not reordering code across a `clock()` call. This would require a minor constraint on the optimizer, easy to implement, as there are already constraints of this sort imposed for other reasons. This would remove the violation of the principle of least surprise. Better yet, make the C++ abstract machine aware of the non-zero time cost of non-elided computations. – Paul Jurczak Oct 06 '14 at 03:16
  • @PaulJurczak: The constraints imposed are mostly, if not entirely, to keep code from breaking the rules about observable behavior. There's no such violation here, and the only reason it is even an issue is because you're timing something. How is the compiler supposed to know that? Special knowledge about `clock()` can only do so much -- there are probably a half dozen functions just in Win32 to get the time in various formats and precisions...and the built-in inability to use one to write one's own equivalent of `clock()` would be pretty freaking surprising too. – cHao Oct 06 '14 at 03:57
  • About the only worthwhile idea would be some keyword, pragma, or the like to disable certain optimizations in a section of code. Other solutions are too prone to weirdness, IMO. – cHao Oct 06 '14 at 04:12
  • And in fact, there kinda sorta almost is such a thing. `#pragma optimize("", off)` allegedly disables all optimizations in the function(s) following it. You can reenable them with `#pragma optimize("", on)`. – cHao Oct 06 '14 at 04:49
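As an aside, a minimal sketch of the `#pragma optimize` approach mentioned in the comment above; the pragma is MSVC-specific, the function names are hypothetical, and whether it suppresses this particular reordering in a given compiler version is an assumption worth verifying:

#include <time.h>
#include <iostream>

long long veryLongComputation();    // hypothetical: defined elsewhere

#pragma optimize("", off)           // MSVC: disable optimizations for the functions that follow
void timedRun() {
    auto t0 = clock();
    auto r  = veryLongComputation();
    auto t1 = clock();
    std::cout << r << "  time: " << t1 - t0 << std::endl;
}
#pragma optimize("", on)            // restore the optimization settings from the command line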

7 Answers

18

Well, there is something called Subclause 5.1.2.3 of the C Standard [ISO/IEC 9899:2011] which states:

In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).

Therefore I really suspect that this behaviour - the one you described - is compliant with the standard.

Furthermore, the reorganization indeed has an impact on the computed result, but look at it from the compiler's perspective: the program lives in its int main() world, and when doing a time measurement it peeps out, asks the kernel to give it the current time, and goes back into the main world, where the actual time of the outside world doesn't really matter. The clock() call itself won't affect the program and its variables, and the program's behaviour won't affect that clock() function.

The clock values are used to calculate the difference between them; that is what you asked for. Whatever happens between the two measurements is not relevant from the compiler's perspective, since what you asked for was the clock difference, and the code between the measuring won't affect the measuring as a process.

This however doesn't change the fact that the described behaviour is very unpleasant.

Even though inaccurate measurements are unpleasant, it could get much worse, and even dangerous.

Consider the following code taken from this site:

void GetData(char *MFAddr) {
    char pwd[64];
    if (GetPasswordFromUser(pwd, sizeof(pwd))) {
        if (ConnectToMainframe(MFAddr, pwd)) {
              // Interaction with mainframe
        }
    }
    memset(pwd, 0, sizeof(pwd));
}

When compiled normally, everything is OK, but if optimizations are applied, the memset call will be optimized out, which may result in a serious security flaw. Why does it get optimized out? It is very simple: the compiler again thinks in its main() world and considers the memset to be a dead store, since the variable pwd is not used afterwards and won't affect the program itself.
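For illustration, a minimal sketch of one common workaround, clearing the buffer through a volatile-qualified pointer so the stores cannot be treated as dead stores (the function name is hypothetical; where available, C11's memset_s or Windows' SecureZeroMemory serve the same purpose):

#include <stddef.h>

void clearSensitive(char *buf, size_t len) {
    volatile char *p = buf;          // each store is an access through a volatile lvalue
    for (size_t i = 0; i < len; ++i)
        p[i] = 0;                    // in practice, compilers do not elide these stores
}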

Jendas
  • On the other hand it actually does affect the function. If it is called in a different place it returns a different value. And if that's not a side effect caused by the long computation, I don't know what is. "code between the measuring won't affect the measuring as a process" - that is true. But it very much affects the values. It's like saying "The value of variables does not change summation as a process". – sharpneli Oct 04 '14 at 10:38
  • I think that summation is not the same. The variables you are summing are pretty much exact and deterministic. The clock time, on the other hand, is not. Actually, the result of measuring a time difference is pretty much random in some sense... – Jendas Oct 04 '14 at 10:46
  • The clock time is random in the sense that it can vary from execution to execution. However it's also definitely not so random that we could just swap it with rand() calls. And it's quite easy to prove that "no needed side effects are produced" doesn't apply because there is a clear side-effect: The time difference is larger. – sharpneli Oct 04 '14 at 10:52
  • C11 standard 5.1.2.3:2 states the things which are to be counted as a side effect. Accessing a volatile object is one of them. A clock function eventually either uses volatile (otherwise the compiler can just optimize it into a single read, thus time never changes) or it's handwritten asm which effectively does the same thing. Or it's a syscall (we can say it's equivalent to a function call) and eventually the kernel does the exact same thing, basically reads a volatile object. Unless the compiler can prove there is no access to a volatile ever in that chain it cannot change the order. – sharpneli Oct 04 '14 at 11:04
  • 2
    @sharpneli "Unless the compiler can prove [...]" but that's exactly what the compiler is doing here. If you hide the body of veryLongComputation from it, I strongly doubt it reorders. But here it looks at veryLongComputation in details, proves that there are no side effects in it, and then reorders. And no, a different running time (waving arms to say it is very different) does not count as a side effect. – Marc Glisse Oct 04 '14 at 17:35
  • `memset` will of course not be optimized out in the example if `pwd` is `volatile`. IOW, if you want the password clearing to be observable, you must make it so. – MSalters Oct 04 '14 at 22:47
  • 1
    @MSalters: If `pwd` is volatile, passing it to `memset` has undefined behavior. You cannot access a volatile object through a non-volatile-qualified lvalue, which is what `memset` would do. – R.. GitHub STOP HELPING ICE Oct 05 '14 at 01:10
  • 5
    Point. `std::fill` would work. A volatile pointer is still an iterator. So much for C++ being less predictable than C. – MSalters Oct 05 '14 at 13:14
  • "when compiled normally" is not how I'd describe an un-optimized build. Optimization is normal; you only make a braindead anti-optimized build with `gcc -O0` or something when you need fully consistent debugging. `gcc -O0` never keeps anything in registers between statements so you can modify variables at run-time with a debugger, or even `jump` to a new source line in GDB and have the program work as if you'd jumped in the C abstract machine. These extra stores/reloads have no purpose except consistent debugging, so it's far more than just "not optimizing". – Peter Cordes Jun 19 '18 at 13:06
  • But AFAIK there's not much middle ground where it keeps things in registers but doesn't throw away "dead stores". Maybe with `gcc -Og` ("optimize for debugging"). Not that it matters, because you want to compile with full optimization for production use. – Peter Cordes Jun 19 '18 at 13:11
15

The compiler cannot exchange the two clock calls. t1 must be set after t0. Both calls are observable side effects. The compiler may reorder anything between those observable effects, and even over an observable side effect, as long as the observations are consistent with possible observations of an abstract machine.

Since the C++ abstract machine is not formally restricted to finite speeds, it could execute veryLongComputation() in zero time. Execution time itself is not defined as an observable effect. Real implementations may match that.

Mind you, a lot of this answer depends on the C++ standard not imposing restrictions on compilers.
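As an illustration of the zero-time point, a hypothetical variant (assuming C++14 constexpr rules) where the computation is a constant expression, so an implementation may legitimately fold it at compile time and leave nothing at all between the two clock calls:

#include <ctime>
#include <iostream>

constexpr long long veryLongComputation() {    // hypothetical compile-time-foldable workload
    long long sum = 0;
    for (long long i = 0; i < 1000; ++i)
        sum += i * i;
    return sum;
}

int main() {
    auto t0 = std::clock();
    constexpr auto r = veryLongComputation();  // evaluated at compile time: zero run-time cost
    auto t1 = std::clock();
    std::cout << r << "  time: " << (t1 - t0) << std::endl;
}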

MSalters
8

Yes, it is legal - if the compiler can see the entirety of the code that occurs between the clock() calls.
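For illustration, a minimal sketch of one way to keep the result pinned between the two clock() calls even when the compiler sees everything, using an empty GCC/Clang-style asm statement as an optimization barrier (a compiler extension, not standard C++; the helper name and workload are made up for the example):

#include <ctime>
#include <iostream>

long long veryLongComputation() {              // stand-in workload for the example
    long long sum = 0;
    for (long long i = 0; i < 100000000LL; ++i)
        sum += i % 7;
    return sum;
}

template <class T>
void doNotOptimizeAway(T &value) {
    // Empty asm that claims to read and modify `value` and to clobber memory,
    // so the value must actually have been computed by this point.
    asm volatile("" : "+r"(value) : : "memory");
}

int main() {
    auto t0 = std::clock();
    auto r  = veryLongComputation();
    doNotOptimizeAway(r);                      // intended to keep the work between the clock() calls
    auto t1 = std::clock();
    std::cout << r << "  time: " << (t1 - t0) << std::endl;
}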

o11c
  • 1
    Do you have a standard quote or reference that corroborates this? –  Oct 04 '14 at 07:14
  • 2
    @remyabel It's allowed by the quote in the original question. The key is recognizing that the call to `clock` is not what's being reordered - it's the computation that is. By the as-if rule, any transparent computation can be computed at any time. – o11c Oct 04 '14 at 07:17
  • 1
    More to the point, if the compiler can guarantee that `veryLongComputation()` causes no observable behaviour. – M.M Oct 04 '14 at 07:35
  • @o11c _It's allowed by the quote in the original question_ - but the result is different, which is a common sense contradiction to that quote. Are time related quantities exempt from this rule? – Paul Jurczak Oct 04 '14 at 08:34
  • @MattMcNabb _if the compiler can guarantee that veryLongComputation() causes no observable behaviour_ - so computation time is not an observable quantity? – Paul Jurczak Oct 04 '14 at 08:36
  • 2
    @PaulJurczak that's correct; observable behaviour is writes of volatile variables, and calls to library functions – M.M Oct 04 '14 at 08:51
  • @MattMcNabb: I don't even see where calls to library functions count. The minimum requirements listed in C++11 are accesses to volatile variables, file I/O, and that, in interactive cases, output is flushed before input can begin. – cHao Oct 05 '14 at 00:45
  • @cHao any call to a function that is not fully seen must be assumed to be that though. – o11c Oct 05 '14 at 00:49
  • @o11c: "Must" is a pretty strong word. :) All the compiler would theoretically need in order to optimize a library call away, is the code of the library being used (and, of course, proof that the call wouldn't affect observable behavior). The biggest reason it doesn't happen in most cases, is that it's pretty complicated to do. – cHao Oct 05 '14 at 00:52
4

If veryLongComputation() internally performs any opaque function call, then no, because the compiler cannot guarantee that its side effects would be interchangeable with those of clock().

Otherwise, yes, it is interchangeable.
This is the price you pay for using a language in which time isn't a first-class entity.

Note that memory allocation (such as new) can fall into this category, since the allocation function can be defined in a different translation unit that is not compiled until the current translation unit has already been compiled. So, if you merely allocate memory, the compiler is forced to treat the allocation and deallocation as worst-case barriers for everything -- clock(), memory barriers, and everything else -- unless it already has the code for the memory allocator and can prove that this is not necessary. In practice I don't think any compiler actually looks at the allocator code to try to prove this, so these types of function calls serve as barriers in practice.
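A minimal sketch of the opaque-call case described above, assuming veryLongComputation is defined in a separately compiled translation unit and link-time optimization is not enabled (file names and the workload are hypothetical):

// compute.h (hypothetical)
long long veryLongComputation();

// compute.cpp (hypothetical), compiled separately; main.cpp never sees this body
long long veryLongComputation() {
    long long sum = 0;
    for (long long i = 0; i < 2000000000LL; ++i)
        sum += i % 3;
    return sum;
}

// main.cpp: without LTO the call is opaque, so the compiler must assume it could
// touch global state or do I/O, and cannot move it across the clock() calls.
#include <ctime>
#include <iostream>
#include "compute.h"

int main() {
    auto t0 = std::clock();
    auto r  = veryLongComputation();
    auto t1 = std::clock();
    std::cout << r << "  time: " << (t1 - t0) << std::endl;
}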

user541686
2

At least by my reading, no, this is not allowed. The requirement from the standard is (§1.9/14):

Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.

The degree to which the compiler is free to reorder beyond that is defined by the "as-if" rule (§1.9/1):

This International Standard places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.

That leaves the question of whether the behavior in question (the output written by cout) is officially observable behavior. The short answer is that yes, it is (§1.9/8):

The least requirements on a conforming implementation are:
[...]
— At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.

At least as I read it, that means the calls to clock could be rearranged compared to the execution of your long computation if and only if it still produced identical output to executing the calls in order.

If, however, you wanted to take extra steps to ensure correct behavior, you could take advantage of one other provision (also §1.9/8):

— Access to volatile objects are evaluated strictly according to the rules of the abstract machine.

To take advantage of this, you'd modify your code slightly to become something like:

auto volatile t0 = clock();
auto volatile r  = veryLongComputation();
auto volatile t1 = clock();

Now, instead of having to base the conclusion on three separate sections of the standard, and still having only a fairly certain answer, we can look at exactly one sentence and have an absolutely certain answer: with this code, re-ordering uses of clock vs. the long computation is clearly prohibited.

Jerry Coffin
  • 5
    By this logic, even optimising `1 + 1` to `2` is invalid, because code *might* be timing the calculation. I don't think the logic is correct. –  Oct 04 '14 at 15:56
  • You seem to be reading incorrectly. "Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated." That says precisely nothing about constant folding. – Jerry Coffin Oct 04 '14 at 15:59
  • Your answer claims that if code is timing the calculation, and printing the time, the precise printed time must not change because of optimisations, because the precise printed time is observable. –  Oct 04 '14 at 16:04
  • Not so. My answer claims that it *can* make some changes, such as changing the order of evaluation of subexpressions within an expression, but not others, such as the order of evaluation of full expressions. – Jerry Coffin Oct 04 '14 at 16:44
  • This just doesn't make sense. The abstract semantics allow **any** amount of time to elapse for **any** statement (short of a `sleep` of course). The abstract machine sets the range of allowed outcomes, not some "unoptimized" build. – MSalters Oct 04 '14 at 17:35
  • @MSalters: Who said anything about an "'unoptimized' build?" – Jerry Coffin Oct 04 '14 at 18:10
  • Unoptimized in the sense that no calls have been reordered, which is an action performed by the optimizer. – MSalters Oct 04 '14 at 18:44
  • @MSalters: So your position is that: "Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated." ...is basically non-binding advice, not really a requirement at all. – Jerry Coffin Oct 04 '14 at 18:48
  • @JerryCoffin That's merely what makes `i = 1; i = 2;` have the net effect of setting `i` to `2`, instead of allowing the two stores to overlap and setting `i` to some garbage value. –  Oct 04 '14 at 20:26
  • @hvd: Quite the contrary--it's much more than that. It means that when you call `clock` it has to give you the value from the clock *after* preceding full expressions, and *before* subsequent ones. – Jerry Coffin Oct 04 '14 at 20:45
  • @JerryCoffin: You still have to show that there's a link between order of observable effects and sequencing. While that may appear to be intuitively right, nobody has come up with the relevant standardese proving that. – MSalters Oct 04 '14 at 22:44
  • @MSalters: No, I don't. You seem to be mistaking "side effects" in §1.9/14 with "observable side effects" in §1.9/1 and §1.9/8. As you seem to be interpreting things, `clock_t clock() { return (clock_t)rand(); }` would apparently be a legitimate definition of `clock()`, because there's no circumstance under which you could depend on its output being predictable at all. – Jerry Coffin Oct 04 '14 at 23:15
  • As explained in the comments of the earlier question, your final fix (adding volatile on each line) is not sufficient. `auto volatile r=fun()` is equivalent to `auto tmp=fun(); auto volatile r=tmp;` and the compiler may move the first part before `clock`. – Marc Glisse Oct 05 '14 at 07:41
  • @MarcGlisse: Citation needed. – Jerry Coffin Oct 05 '14 at 11:14
  • @JerryCoffin: My reading of C++11 §1.9/1 (and the intent as well, considering the footnote) is that the compiler can ignore every rule in the standard except for §1.9/8 (ie: as long as observable behavior is maintained). Since `veryLongComputation` has no side effects that are visible to `clock` (and vice versa), sequencing need not be respected -- the compiler can call functions wherever it pleases -- as long as any I/O, and the reads/writes to `t0`, `r`, and `t1`, happen exactly as the abstract machine could have done them. – cHao Oct 06 '14 at 01:38
  • @cHao: So that puts us back at `clock` being equivalent to `rand()`--all output from it is basically random numbers. – Jerry Coffin Oct 06 '14 at 04:42
  • @JerryCoffin: Inasmuch as it relates to timing the execution of code with no visible side effects? Basically, yeah. Can't be trusted. – cHao Oct 06 '14 at 04:56
  • IMO, that leads to only one reasonable conclusion: you're overreading the permissiveness of the "as-if" rule to allow a great deal that it really doesn't, and was never intended to either. Don't get me wrong: games like this can be fun, and 20 years ago (or so) I'd probably have been playing along. I won't try to say whether my growing reluctance is wisdom or boredom, but they don't strike me as nearly as useful or fun any more. – Jerry Coffin Oct 06 '14 at 05:08
  • I'm reading it as written -- and as expressly stated in the footnotes, which read: "This provision is sometimes called the “as-if” rule, because an implementation is free to disregard any requirement of this International Standard as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program." – cHao Oct 06 '14 at 05:10
  • @cHao: I don't think you are. It seems clear to me that in this case the observable behavior *is* affected. – Jerry Coffin Oct 06 '14 at 05:12
  • It's not, though. From C++'s perspective, time is not observable behavior. If `veryLongComputation` actually *did* something -- say, if it wrote to a volatile global variable itself -- you could more or less ensure that stuff ran in the sequence it shows in the code. But since it doesn't actually affect anything, C++ is free to run it whenever it's convenient and write the result later. – cHao Oct 06 '14 at 05:18
  • As for the language-lawyer games, i'll admit i'm having fun looking for loopholes. :) But questions like this one are precisely why you pretty much have to read the standard with a lawyer mindset. – cHao Oct 06 '14 at 05:51
  • @cHao: "And yet it moves." – Jerry Coffin Oct 06 '14 at 06:02
  • 2
    @JerryCoffin At any rate, this isn't a typical language lawyer issue: this isn't about wording where some hypothetical implementation might possibly misuse the improper wording, this is where a real-world compiler optimises based on the exact wording of the standard, and where pretty much every optimising compiler there is has similar optimisations even if they don't show for this particular test program. Even if you conclude (whether rightly or wrongly) that the standard doesn't intend to allow these optimisations, code that expects to be portable to current implementations must deal with it. –  Oct 06 '14 at 09:02
1

Let's suppose that the sequence is in a loop, and veryLongComputation() randomly throws an exception. Then how many t0s and t1s will be calculated? Does it pre-calculate the random variables and reorder based on the precalculation, sometimes reordering and sometimes not?

Is the compiler smart enough to know that a mere memory read might actually be a read from shared memory? Say the read is a measure of how far the control rods have moved in a nuclear reactor, and the clock calls are used to control the speed at which they are moved.

Or maybe the timing is controlling the grinding of a Hubble telescope mirror. LOL

Moving clock calls around seems too dangerous to leave to the decisions of compiler writers. So if it is legal, perhaps the standard is flawed.

IMO.

Fred Mitchell
  • If the timing is important, you should be using a more trustworthy mechanism than timing do-nothing code. And/or disable all optimizations -- most of what the optimizer does involves reordering instructions. – cHao Oct 09 '14 at 03:55
-1

It is certainly not allowed, since it changes, as you have noted, the observable behavior (different output) of the program (I won't go into the hypothetical case that veryLongComputation() might not consume any measurable time; given the function's name, that is presumably not the case, but even if it were, it wouldn't really matter). You wouldn't expect that it is allowable to reorder fopen and fwrite, would you?

Both t0 and t1 are used in outputting t1-t0. Therefore, the initializer expressions for both t0 and t1 must be executed, and doing so must follow all standard rules. The result of the function is used, so it is not possible to optimize out the function call, but since it doesn't directly depend on t1, or vice versa, one might naively be inclined to think that it's legal to move it around, say after the initialization of t1, which doesn't depend on the calculation.
Indirectly, however, the result of t1 does of course depend on side effects of veryLongComputation() (notably the computation taking time, if nothing else), which is exactly one of the reasons that there exists such a thing as a "sequence point".

There are three "end of expression" sequence points (plus three "end of function" and "end of initializer" SPs), and at every sequence point it is guaranteed that all side effects of previous evaluations will have been performed, and no side effects from subsequent evaluations have yet been performed.
There is no way you can keep this promise if you move the three statements around, since the possible side effects of all the functions called are not known. The compiler is only allowed to optimize if it can guarantee that it will keep the promise. It can't, since the library functions are opaque and their code isn't available (nor is the code within veryLongComputation necessarily known in that translation unit).

Compilers do, however, sometimes have "special knowledge" about library functions, such as that some functions will not return or may return twice (think exit or setjmp).
However, since every non-empty, non-trivial function (and veryLongComputation is quite non-trivial, judging from its name) will consume time, a compiler having "special knowledge" about the otherwise opaque clock library function would in fact have to be explicitly disallowed from reordering calls around this one, knowing that doing so not only may, but will, affect the results.

Now the interesting question is: why does the compiler do this anyway? I can think of two possibilities. Maybe your code triggers a "looks like a benchmark" heuristic and the compiler is trying to cheat, who knows. It wouldn't be the first time (think of SPEC2000/179.art, or SunSpider, for two historic examples). The other possibility would be that somewhere inside veryLongComputation() you inadvertently invoke undefined behavior. In that case, the compiler's behavior would even be legal.

Damon
  • 1
    I think you're mistakenly assuming that MSVC does not see the body of `veryLongComputation`. MSVC has Link Time Code Generation, which essentially means it can see that body. – MSalters Oct 04 '14 at 17:45
  • @MSalters: That may be so, but it is irrelevant insofar as it does not see the code of `clock`. In order for the optimization to be legal, it must be able to prove that there are no side effects occurring in _any_ of the functions, and it can't do that (since it needs to see _each and every_ function for that). Also, it is necessary that the observable behavior is identical. Which, too, is demonstrably not the case. – Damon Oct 06 '14 at 07:00
  • The call to `clock` **is** observable (all library calls are). No need to prove anything further. And no, the behavior does not need to be identical to the non-optimized version. The standard merely requires that the outcome is identical to one of the possible outcomes of the abstract machine, and running within one unit of `time_t` is definitely possible and thus a valid outcome. – MSalters Oct 06 '14 at 08:50
  • Nitpick: Library I/O calls are side effects. (§1.9/12) Library calls that don't do I/O aren't. And even the I/O functions don't necessarily have to be called, unless the call affects observable behavior. – cHao Oct 07 '14 at 01:43