41

Why do compilers seems to be polite toward loops that do nothing and do not eliminate them?

Does the C standard require loops to take some time?

Example, the following code:

void foo(void) {
    while(1) {
        for(int k = 0; k < 1000000000; ++k);
        printf("Foo\n");
    }
}

runs slower than this one:

void foo(void) {
    while(1) {
        for(int k = 0; k < 1000; ++k);
        printf("Foo\n");
    }
}

even with -O3 optimization level. I would expect removing empty loops allowed and thus get the same speed on both codes.

Is "time spent" a side effect that should be preserved by a compiler?

Thomas Matthews
  • 56,849
  • 17
  • 98
  • 154
Nonyme
  • 1,220
  • 1
  • 11
  • 22
  • 6
    What is an rude compiler? – Ed Heal Apr 23 '16 at 17:39
  • 4
    I don't think the C/C++ standards have any notion of "running time". –  Apr 23 '16 at 17:40
  • @EdHeal, the one that spits errors onto you, I guess :) – ForceBru Apr 23 '16 at 17:41
  • 9
    A rude compiler is a compiler which pays great attention to turning any "undefined behavior" in your code into your worst nightmare. – BitTickler Apr 23 '16 at 17:41
  • 19
    This is a decent question about the as-if rule and what counts as observable behaviour. – Lightness Races in Orbit Apr 23 '16 at 17:41
  • 1
    The first one runs slower? I cannot reproduce on gcc. [1](https://godbolt.org/g/caZ8BS) [2](https://godbolt.org/g/BQkvjx). Clang does not optimize out apparently though. The difference between those two compilers hints that there is no guarantee by the standard in any case. – Banex Apr 23 '16 at 17:43
  • 2
    You just measuring startup overhead or other overhead which are hardly predictable. Basically random numbers. You should consult disassembly before asking such questions. – Ivan Aksamentov - Drop Apr 23 '16 at 17:46
  • Ah, I did not see that gcc optimize it out. I was stupid to try it on OS X where gcc is actually clang in disguise. – Nonyme Apr 23 '16 at 17:48
  • Re "I would expect removing empty loops", you have to remember that what you expect is not necessarily what you get. As Drop says, you have to look at the resulting code to see what the compiler actually did. – jamesqf Apr 23 '16 at 18:20
  • Fun - incredibly sad - fact: That code inhibits undefined behavior. So if one wanted to nitpick, they might say that it's not a valid c++ program to begin with. This might not even be pure nitpicking, because this could very well influence the optimizer of different compilers. – Voo Apr 23 '16 at 21:09
  • 1
    Possible duplicate of [Can an ANSI C compiler remove a delay loop?](http://stackoverflow.com/questions/14449098/can-an-ansi-c-compiler-remove-a-delay-loop) – detly Apr 24 '16 at 00:06
  • @Voo Where's the undefined behaviour in these snippets? – marcelm Apr 24 '16 at 17:45
  • 1
    @marcelm The infinite loop without observable side effects (writes to global variables btw doesn't count) causes undefined behavior in C++ and is a grey area in C. Yes it's insane – Voo Apr 24 '16 at 17:53
  • Busy waiting with an empty loop is an offense against threadkind. –  Apr 26 '16 at 18:13
  • A ; is a statement in C++ – Damian Apr 26 '16 at 19:22
  • @Voo Well, I learned a thing or two about UB (mainly from [here](http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html)). But, wouldn't the `printf()` in the infinite loop qualify as a side effect? Or rather, since `printf()` defers to `write()` which calls kernel code, wouldn't it be impossible for the compiler to prove the loop _doesn't_ have side effects, forcing it to treat it as if it does? – marcelm Apr 27 '16 at 19:41
  • @marcelm I somehow missed the printf(), don't ask me how. – Voo Apr 27 '16 at 20:18
  • Could you please give timing figures and the compiler(s) you have tested with? Probably also that you mean to include the C++ standard in your question (the body only mentions the C standard). – PJTraill Jun 16 '16 at 08:38

4 Answers4

43

No, time spent does not count as observable behaviour to be protected by the as-if rule:

[C++14: 1.8/5]: A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input. However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).

[C++14: 1.5/8]: The least requirements on a conforming implementation are:

  • Access to volatile objects are evaluated strictly according to the rules of the abstract machine.
  • At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.
  • The input and output dynamics of interactive devices shall take place in such a fashion that prompting output is actually delivered before a program waits for input. What constitutes an interactive device is implementation-defined.

These collectively are referred to as the observable behavior of the program. [ Note: More stringent correspondences between abstract and actual semantics may be defined by each implementation. —end note ]

Those loops can be legally optimised out and, indeed, there are scenarios in which the standard makes deliberate attempts to make doing so even easier:

[C++14: 1.10/24]: The implementation may assume that any thread will eventually do one of the following:

  • terminate,
  • make a call to a library I/O function,
  • access or modify a volatile object, or
  • perform a synchronization operation or an atomic operation.

[ Note: This is intended to allow compiler transformations such as removal of empty loops, even when termination cannot be proven. —end note ]

Your compiler may in fact be being "polite" in noticing that the intent of the loop in these programs appears to be in slowing down the emission of repeated text output. :)

Lightness Races in Orbit
  • 378,754
  • 76
  • 643
  • 1,055
  • 4
    Annoyingly, the question body itself stipulates "The C standard". – kfsone Apr 23 '16 at 19:01
  • I wish the rule had instead specified that the time required to execute a loop, *even if infinite*, is not in and of itself considered a side-effect. Such a rule would not negate the laws of causality the way the above cited rule can. For example, given `do {x=expr_that_might_never_yield_1(...);} while (x != 1); printf("Done..."); if (x==1) printf("Hooray!");` I can see usefulness to allowing the compiler to move the `printf("Done...")` before the loop, but not to allowing the compiler to assume that `x` will somehow equal 1. – supercat Apr 26 '16 at 07:05
  • I think it would be more "polite" to optimize the loops out to notify the programmer that this is brittle code that might break at any time. – usr Apr 26 '16 at 19:52
  • @usr: A compiler's job is not to handhold/spoonfeed. The programmer is responsible for his own learning, nobody else. – Lightness Races in Orbit Apr 27 '16 at 08:54
28

You did not specify the compiler, but let's assume it's gcc.

gcc does not remove empty loops, at least not according to the documentation. It contains the following text:

Historically, GCC has not deleted “empty” loops under the assumption that the most likely reason you would put one in a program is to have a delay, so deleting them will not make real programs run any faster.

However, it can remove empty loops if they are "emptied" by the optimizer, that is, if the loop contains code that the optimizer can move outside the loop, and the resulting loop is empty.

It is not clear from the documentation if this is still true in the most recent version. The manual mentions "historically" without specifying why. If you update your question with information about your exact platform and compiler, maybe a better answer can be given.

pipe
  • 657
  • 10
  • 27
  • Can you specify what you mean by "emptied by the optimizer"? Do you mean with optimizations enabled? – Banex Apr 23 '16 at 18:49
  • @Banex The optimizer wouldn't be active unless optimizations are enabled, so yes, with optimizations enabled. :) If a loop contains code that can be moved out of the loop or removed completely because the compiler can prove it is never used, then gcc can remove the loop completely. – pipe Apr 23 '16 at 18:51
  • Very good point. How fast would the OP's code run if he put "int i = 0;" inside the loop? – Mike Apr 23 '16 at 19:20
  • 8
    An example might help, like for `int main(void) { int i, k; for (i = 0; i < 1000000; ++i) ++k; puts("Done"); }`, the value of `k` is never used, so `k` can be optimized out. Because the increment of `k` will never happen, the loop isn't even necessary to increment `k` since that was its only job. As a result, the compiler can optimize out the loop. The value of `i` is now unused as well, so it can also be optimized out, leaving `int main(void){puts("Done");}` as the resulting optimized program. This occurs for me compiling to 64-bit Windows code in GCC 5.2.0. –  Apr 23 '16 at 19:26
  • 2
    @ChronoKitsune And even further, let's say you were returning `k` rather than ignoring it. In this specific case, the compiler could *still* remove the loop, and would simply optimize the program to something like `int main(void){return 0xf4240;}` – Ryan Pendleton Apr 26 '16 at 19:22
7

There is no minimal execution time for a C or C++ executable because execution time depends on many platform specific issues such as:

  1. Processor clock rate.
  2. Clock cycles per instruction.
  3. Internal processor execution optimizations.
  4. Interruptions.
  5. Processor instruction set / capabilities.

Some processors support multiplication, others don't. The processors that don't support multiplication would take longer to execute a program than a process that has multiplication instructions. Same with floating point.

The internal operating speed of a processor varies. There is a common unit of time measurement called a "clock cycle". Most processor vendors specify the duration of an instruction in clock cycles. This measurement may be difficult due to internal support, such as cache management.

Some processors have logic that can optimize the execution of instructions, or instruction patterns. One optimization is branch prediction.

Many platforms have interrupts. For example, there may be a "system tick" interrupt which allows the operating system to know when to switch execution to another program. Some are not so periodic, such as when I/O occurs. A minimum execution time cannot be guaranteed when the program gets interrupted.

Stating a minimum execution time would play havoc with the C and C++ language portability. Some platforms would want to execute code faster than the minimum time. Other platforms may not be able to achieve a minimum execution time (but they could benefit from a high level language like C).

Also, how would the time be measured?

Does the minimum execution time apply for delay loops or polling?

Thomas Matthews
  • 56,849
  • 17
  • 98
  • 154
  • Adding to that is varying clock frequencies on many OS, depending on load. Even the term "clock cycle" in relative with regard to that. – BitTickler Apr 23 '16 at 19:25
4

No, there is no guarantee: (quotation from N1570, 5.1.2.3 Program execution)

1 The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant.

Anyway, the C standard only specifies the behaviour of your program when it is executed on an abstract machine, which can have infinite memory and/or CPUs.

nalzok
  • 14,965
  • 21
  • 72
  • 139
  • 4
    As a corollary, a "compiler" would be strictly conforming if it just translated every input to an infinite loop; arguing that it is doing the right behaviour , just extremely slowly – M.M Apr 24 '16 at 06:17
  • @M.M I think if the input terminates the output must terminate as well. There is a difference between "extremely slowly" and "never". – kasperd Apr 24 '16 at 12:56
  • @M.M No, such a compiler would not be strictly conforming. Do not confuse the inability to demonstrate that any given program will terminate (the halting problem) with the inability to do it for any particular program. If it can be shown that (for example) the compiler's translated output for `Hello, world!` will never print `Hello, world!`, the compiler is not conforming, and we can be as thorough and ingenious as we like with the proof (whereas the compiler is presumably limited in how well it can obfuscate the nontermination). – Jeroen Mostert Apr 24 '16 at 19:14
  • @JeroenMostert `find_collatz_counterexample(); printf("Hello world");` – mbrig Jun 13 '16 at 16:45
  • @mbrig: by `Hello, world!` I meant the canonical program, of course, not an arbitrary one of which we ourselves can't tell if it terminates or not. We couldn't use such a program to conclude anything about a compiler. – Jeroen Mostert Jun 13 '16 at 18:08
  • @JeroenMostert, ah, I meant that the compiler could output an equivalent of the collatz (instead of the provably infinite loop), and could then claim to be conforming (although useless, hah). – mbrig Jun 13 '16 at 19:15
  • @mbrig: well, you're not wrong. Output where the correctness is dependent on unproven conjectures is a quality-of-implementation issue. :-P – Jeroen Mostert Jun 13 '16 at 20:10