4

I am trying to understand what optimization process causes the following code to produce an infinite loop when compiled with the -O3 optimization flag. To get it out of the way, I understand that the real root cause of the issue is the lack of a return in this non void function, I happened on this interesting behavior while part way through implementing this code on an embedded system and had not yet added the return as I was not using the return value at that point.

My question is more about the optimization process and how whatever it's doing could help in other cases/what the 'optimized' logic looks like.

For a little more context, I see this behavior when using both the c++ compiler in ubuntu (c++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0) as well as the aarch64-linux-gnu-g++ compiler shipped with Xilinx Vitis 2020.2 (and running on their respective platforms of course).

Minimum reproducible example (that I have so far created):

#include <iostream>

int broken_for_loop(){
    for (int i = 0; i < 10000; i+= 1000){
        std::cout << i << std::endl;
    }
}

int main(int argc, char const *argv[]){
    broken_for_loop();
}

When compiling with c++ ./broken_loop_test.cpp -o test_local -O3 or the ARM equiv. the output of the loop is infinite and I have run it until the 32b int wraps around. Without optimization, it works as I would expect. If I simply return 0 after the for loop, it also works with optimization.

My naive suspicion is that because there is no return outside the loop, the compiler expects that I will return from or break out from inside the loop, and therefor removes the check/branch that would test the loop condition, but I was wondering what I could look into to get more information on this specific topic (and optimization in general, it's been a while since my last course in compiler design) and I am not comfortable enough with ASM to confidently identify the issue there.

Any help would be appreciated, thank you!

Because this section is required, I will note that I have tried declaring volatile i as well as using different types of integer as well as casting the constant value and doing more/less in the loop. All lead to the same behavior without a return statement.

Douglas B
  • 585
  • 2
  • 13
  • 3
    Missing to return some int in your function would make code UB... – Jarod42 Nov 29 '22 at 15:15
  • 6
    You break the language rules, the language rules break you back ;) – NathanOliver Nov 29 '22 at 15:15
  • 2
    Without a valid `return` statement, the code is ill-formed so (presumably) all bets are off. – Adrian Mole Nov 29 '22 at 15:15
  • ... but maybe the compiler is using the register that *should* have the return value also as the loop index. Or some such. – Adrian Mole Nov 29 '22 at 15:17
  • If you add a valid `return` statement after the loop, and the number of loop iterations becomes finite, then that will indicate the compiler/optimiser is detecting the presence of undefined behaviour (missing return), deeming that the behaviour cannot be undefined, and making the function loop infinite [which it is permitted to do when behaviour is actually undefined]. If the behaviour is the *same* with a valid return statement, then you have a compiler bug. I'm prepared to bet it is not a compiler bug. – Peter Nov 29 '22 at 15:29
  • 2
    Without a `return`, g++ and clang both remove the loop boundary check, and does not return from the function at all. A not-unreasonable line of reasoning could be "this function never returns, so the loop must never terminate, so there is no need to check the loop condition". – molbdnilo Nov 29 '22 at 15:36
  • @molbdnilo Thank you for reading and actually attempting to answer my question, this is more the kind of the clarification/confirmation of suspicion that I was hoping for. – Douglas B Nov 29 '22 at 15:40
  • Next time you compile something allow your gcc compiler to tell you some warnings. Even open source code editors ... like for example Qt creator warn about the actual issue. – Öö Tiib Nov 29 '22 at 15:48
  • Key term: [The As-If Rule](https://en.cppreference.com/w/cpp/language/as_if). – user4581301 Nov 29 '22 at 16:21

1 Answers1

6

Undefined behaviour results in time travel.

Your code is a good example of it.

You have undefined behaviour after the loop (a function that must return a value doesn't). Since the compiler is allowed to assume that UB never happens, it assumes that the loop never terminates, and compiles it accordingy.

n. m. could be an AI
  • 112,515
  • 14
  • 128
  • 243
  • Is it possible you could point me towards any information on the optimization side of things? I understand, as stated, that the UB/lack of a return is the root issue, but my question is more about what optimization process is causing the UB to become problematic only WITH optimization in my case, and where it would appear when following the language rules to a T. I have already long since solved the issue and am trying to satisfy my curiosity at this point. – Douglas B Nov 29 '22 at 15:51
  • 1
    UB is *always* problematic because the compiler is *always* allowed to do *anything*. Are you asking why that *anything* happens to coincide with "works as I would expect" when not optimising? Because you got lucky that one time. The police looked the other way. The compiler did not notice that all execution paths through the end of the loop result in UB, because the logic that detects such things is only on when optimisations are on. – n. m. could be an AI Nov 29 '22 at 16:08
  • "works as ...expect" etc were a bad choice of words borne of often testing functions like this mid impl. (which return a value that I am not using yet, so I expect it based on experiences to return garbage which would be ok for the time being). I would say I'm curious what specific optimization process/flag cause it to detect and run with the UB in this case, and if/where those flags would normally increase performance in clean code. After having time to read the linked text; I understand it as "it removes the ability to reach UB" but am not fully sure why that's only in the optimized version – Douglas B Nov 29 '22 at 16:33
  • "what specific optimization process/flag" If you want to open gcc source code and find the line that is responsible for triggering this, then I'm afraid I can't help you. Do it alone or find a gcc developer who is familiar with it and is willing to explain. "not fully sure why that's only in the optimized version" The standard does not prescribe a single behaviour for a program. It allows a range of behaviours (sometimes, like here, *any* behaviour). The compiler is free to choose between those based on whatever it wants, including optimisation settings, or phase of the moon. – n. m. could be an AI Nov 29 '22 at 16:47
  • 1
    @DouglasB: `-O0` compiles each C statement separately, with no constant-propagation. Single-stepping with a debugger, exiting the loop *is* a visible effect, and `gcc -O0` gives consistent debugging. [Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?](https://stackoverflow.com/q/53366394) - Even in a debugger, though, there's no real way to get out of this function without UB, since there's no return statement in it anywhere you could `j` to, or value you could set with `set i = - 1234` that would help. – Peter Cordes Nov 29 '22 at 20:23
  • 1
    @DouglasB: `gcc -O0` might just have a different default for how to handle this instance of compile-time-visible UB, since it does emit a `ret` instruction instead of letting execution fall into whatever function comes next (i.e. not emitting any `ret` instruction, like in `int foo() {}` which compiles to literally zero instructions, not even a `ret`, with optimization). https://godbolt.org/z/Kh64fqo3v – Peter Cordes Nov 29 '22 at 20:30
  • What's funny is that if a program knows (e.g. based on a passed argument) that the return value from a particular invocation will be ignored, machine code that returns without setting a return value can often be more efficient than machine code that sets a meaningless value. If a language specified behavior in such cases, a program that exploited that could be more efficient than would be possible absent that behavioral specification. Not a huge performance difference, but I would think people who are more interested in actual performance than in breaking people's code would ensure... – supercat Nov 30 '22 at 22:00
  • ...that programmers had ways to avoid instructions the programmers know aren't necessary. – supercat Nov 30 '22 at 22:05