
I wrote the well-known swap function in C and looked at the assembly output using `gcc -S`, then did the same again with optimizations enabled (`-O2`).
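
For reference, a minimal sketch of the kind of function and commands involved (the swap body here is illustrative, not necessarily the exact one that was used):

// swap.c -- an illustrative swap function.
//
// Inspect the generated assembly with:
//   gcc -S swap.c         (unoptimized, writes swap.s)
//   gcc -S -O2 swap.c     (optimized)
void swap(int *a, int *b) {
    int tmp = *a;   // classic three-step swap through a temporary
    *a = *b;
    *b = tmp;
}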

The difference was pretty big: with optimizations I saw only 5 lines of assembly compared to 20.

My question is: if optimisation really helps, why not use it all the time? Why do we ever compile code without optimizations?

An extra question for those working in the industry: when you release the final version of your program after testing it, do you compile with optimizations on?

I am responding to all your comments, please read them.

  • Optimized code is harder to debug. Also there are different types of optimizations that can be selected, so the compiler won't enforce a specific one. – Eugene Sh. Jul 05 '21 at 18:28
  • Optimizing usually also takes longer time. – Ted Lyngmo Jul 05 '21 at 18:29
  • Also, sometimes there are bugs in optimizers. (But then you could argue that optimizations could be turned on by default, with the ability to disable.) – ikegami Jul 05 '21 at 18:35
  • @TedLyngmo to make your code faster? who doesn't want that –  Jul 05 '21 at 18:41
  • @ikegami Personally never heard of such bug, do you have some examples, links to share maybe –  Jul 05 '21 at 18:42
  • @EugeneSh. ok we can use un-optimized code for debugging and when I'm sure everything is fine use optimized for release version of my program. Plus I didn't get what is harder? you run the debugger on the high language like C not assembly... –  Jul 05 '21 at 18:43
  • @daniel Optimizing doesn't always optimize for speed. You can optimize for size too. It's often hard to combine those. Sometimes you just want to compile. If it takes 30 minutes longer to optimize it - you may not want that. – Ted Lyngmo Jul 05 '21 at 18:43
  • @TedLyngmo agree, but I'm talking more specifically about speed optimizations –  Jul 05 '21 at 18:44
  • @daniel, None recently, but I've encountered a number of such bugs in the last 30 years. The worst thing is that these are usually [heisenbugs](https://en.wikipedia.org/wiki/Heisenbug) until you realize it's a problem with the optimized code (since debugging is often done without the optimizations). – ikegami Jul 05 '21 at 18:44
  • @daniel You seem to only be considering the benefits of optimization. You need to consider whether the benefits outweigh the cost. – user3386109 Jul 05 '21 at 18:45
  • @daniel *to make your code faster -- who doesn't want that?* I honestly don't care about optimizations, most of the time. Computers are insanely fast these days. Ordinary (not specially optimized) code is almost always fast enough for me. – Steve Summit Jul 05 '21 at 18:46
  • @SteveSummit In big programs like an OS or Microsoft Word it makes a huge difference, and taking care of speed, especially on old PCs, may be worth millions of dollars in extra income –  Jul 05 '21 at 18:50
  • @daniel That's exactly the flow. You debug the unoptimized code, then turn on the optimization. Then you notice that something got broken due to the optimization :) That's a sign of a bug causing undefined behavior. As for what is hard - optimizations rearrange the code, remove variables and whole code sections and so on, so the resulting assembly does not have a one-to-one correspondence with the C code. So you might find yourself trying to examine a non-existent variable value, or will notice out-of-order source code execution. – Eugene Sh. Jul 05 '21 at 18:50
  • > you run the debugger on the high language like C not assembly... No, you do not. If you compile a debugging binary, the binary contains debugging symbols, that, roughly said, contain information of what instructions correspond to what source code lines. What if the optimizer optimizes half your source code away or makes a bunch of structural changes? – lulle2007200 Jul 05 '21 at 18:50
  • @lulle I am studying CS and we never debugged assembly... what's the reason for that at all. you just debug your code on CLion and that's it –  Jul 05 '21 at 18:53
  • @EugeneSh. you missed my point, debugging is on higher level languages not assembly. –  Jul 05 '21 at 18:53
  • @daniel *In big programs like OS or Microsoft Word* I didn't say nobody cared about optimization. Clearly lots of people care about optimization. But you had said, "who doesn't want that?", as if it's obvious to you that everyone must care about optimization -- and I'm here to tell you that I don't. – Steve Summit Jul 05 '21 at 18:54
  • @daniel When you compile with optimization enabled (depending on what exact optimizations the optimizer performs), source level debugging becomes/can become effectively useless. You don't have a 1 to 1 correspondence between high level source code and the produced machine code anymore. – lulle2007200 Jul 05 '21 at 18:55
  • @daniel Your mind seems to be made up that optimization should be turned on all the time. Your time might be better spent making this suggestion to compiler vendors. I'm not sure why you're asking about it here, because you don't seem to be that interested in the answers you're getting. – Steve Summit Jul 05 '21 at 18:55
  • Also, if you look into compiler internals, there are *tons* of different optimizations. The optimization compiler options enable a subset of those. You can optimize for different things, speed, size, "debuggability", a mix of everything. Without knowing what your end goal is, it is hard to tell you what to enable. – lulle2007200 Jul 05 '21 at 18:57
  • @daniel As lulle explained - you do not debug the "high-level" source code, you debug the compiled program, which the compiler might have tried its best to cross-reference with the source code using the debugging symbols. But as I explained it won't work well for a strongly optimized code as some portions are simply too different. – Eugene Sh. Jul 05 '21 at 19:00
  • Re. "_I am responding to all your comments, please read them._": Anything relevant to the question should be edited into the question, anything that constitutes an answer should be posted as such, and no one should be required to read the comments other than you and anyone you are directly responding to. SO is not a discussion forum, and comments are best used to develop the question and facilitate an answer. Not to actually ask or answer questions. – Clifford Jul 05 '21 at 19:21
  • Debugging with highly-optimized code is indeed very misleading. With OO languages, for example, this/self can show up as null in the debugger until you have stepped a few lines into a method. There are enough problems, dead-ends and red-herrings when debugging non-optimized builds without adding extra FUD:( – Martin James Jul 05 '21 at 19:55

4 Answers

6

There are a few reasons.

1. Compilation takes longer

For small and even medium-sized projects, this is rarely an issue today. Modern computers are VERY fast. If it takes five or ten seconds, it usually does not matter. But for larger projects it does matter, especially if the build process is not set up properly. I remember when I was trying to add a feature to the game The Battle for Wesnoth. Compilation took around ten minutes. It's easy to see how much you would want to reduce that to five minutes or lower if you could.

2. Optimized code is harder to debug

The reason optimization makes code harder to debug is that the debugger does not really run the program line by line. That's just an illusion. Here is an example of where it might be a problem:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

int main(void) {
    char str[] = "Hello, World!";

    int number_of_capital_letters = 0;

    for(int i=0; i<strlen(str); i++) {
        if(isupper(str[i]))
            number_of_capital_letters++;
    }

    printf("%s\n", str);

    // Commented out for debugging reasons
    // printf("%d\n", number_of_capital_letters);
}

You fire up your debugger and wonder why it does not keep track of number_of_capital_letters. And then you find out that since you have commented out the last printf statement, the variable is not used for any observable behavior, so the optimizer changes your code to:

int main(void) {
    puts("Hello, World!");
}

One could argue that you could then just turn off the optimizer for a debug build. And that's true, in a world where a cow is a sphere. But there is a third reason:

3. Sometimes bugs only show up at higher optimization levels.

Imagine that you have a big code base. When you upgrade the compiler, a bug suddenly emerges, and it seems to vanish when you remove optimization. What's the problem here? Well, it could be a bug in the optimizer. But it could also be a bug in your code that manifested itself with the new version of the optimizer. Very often, code with undefined behavior behaves differently when compiled with optimization.

So what do you do? You could try to figure out whether the bug is in the optimizer or in your code. That can be a VERY time consuming task. Let's assume it's a bug in the optimizer. What to do? You could downgrade your compiler, which is not optimal for several reasons, especially if it's an open source project. Imagine downloading the source, running the build script and scratching your head for hours to figure out what's wrong, and then seeing in some documentation (provided that the author documented it) that you need a specific version of a specific compiler.

Let's instead assume it's a bug in your code. The ideal thing is of course to fix it. But maybe you don't have the resources to do so. In that case, too, you can require anyone who compiles it to use a certain version of a specific compiler.

But if you can just edit a Makefile and replace -O3 with -O2, you can clearly see that this is sometimes a viable option in our non-ideal world where time is not an endless resource. With a bit of bad luck, such a bug can take a week to track down. Or more. That's time you can spend somewhere else.

Here is an example of such a bug:

#include <stdio.h>

int main(void) {
    char str[] = "Hello";   // six bytes: 'H' 'e' 'l' 'l' 'o' '\0'
    str[5] = '!';           // overwrites the terminating '\0'
    puts(str);              // reads past the end of the array: undefined behavior
}

When I compiled this with gcc 10.2 I got different results depending on optimization level.

Without optimization:

Hello!

With optimization:

Hello!`@

Try it out yourself:

https://godbolt.org/z/5dcKKrEW1

https://godbolt.org/z/48bz5ae1d

And here I found a forum thread where the debug build works but not release: https://developer.apple.com/forums/thread/15112

4. Sometimes bugs only show up at LOWER optimization levels.

Yep, that may also happen. In this case, you could just increase the optimization if you don't care that much about correctness. But if you do care, this can be a way to find bugs. If your code runs correctly both with and without optimization, it is less likely to contain bugs that will haunt you in the future than if you have only ever compiled it with optimization.

I did not find an example that actually behaves this way, but in theory something like this might:

#include <stdio.h>

int main(void) {
    if(1/0) // Division by zero
        puts("An error has occurred");
    else
        puts("Everything is fine");
}

If this is compiled without optimization, there is a high probability that it will crash. But the optimizer might assume that undefined behavior (like division by zero) never occurs, so it optimizes the code to just:

int main(void) {
    puts("Everything is fine");
}

Assume that 1/0 is some kind of error check that is very unlikely to evaluate to true, so you would normally assume the program prints "Everything is fine". Here, the optimizer hides a bug.

5. The optimizer might produce a binary that's bigger in size or uses more memory. Or something else that's not desirable.

This sometimes matters. Especially in embedded systems. Usually (always) -O0 produces very big code, but you might want to use -Os (optimize for size instead of speed) instead of -O3 to get a small binary. And sometimes also to get faster code. See below.

6. The optimizer might produce slower code

Yep, really. It's not often, but it may happen. A related but not equivalent example is illustrated in this question, where the compiler generates faster code when optimizing for executable size rather than for speed.

klutt
  • Also, you can expand each optimization level out to a set of `-f` options and bisect them to find a specific optimization that's causing the issue. Though of course you should compile with `-fsanitize=address,undefined` first. – o11c Jul 05 '21 at 19:21
  • 1) `Optimized code is harder to debug` Back to my point :) when you debug you debug the C code not the assembly code. For example using Clion and debugging your project so how optimization here is relevant at all? –  Jul 06 '21 at 09:58
  • 2) `Without optimization:` a little bit unrelated, but why the output is Hello!? when you wrote ! in index 5 you wrote on top of the null char so when printing it no way to know where to stop... –  Jul 06 '21 at 09:59
  • @daniel 1) I think they explained that pretty well in the comment section. It is the assembly you're debugging, but the debugger tries to do a match. With various results. – klutt Jul 06 '21 at 10:05
  • @daniel 2) The point is, that if you have a bug that invokes undefined behavior, it's fairly common that the code behaves differently when enabling optimization, which was the case here. In real software those bugs exists too, but can be tricky to find. And if it works by reducing optimization one step, why bother? – klutt Jul 06 '21 at 10:07
  • If clang or gcc `-Os` ever makes a bigger binary than `-O0`, that's a bug. `-O0` is so bloated with redundant load/store instructions that it's always pretty large. All modern mainstream compilers have an option to optimize for size, `-Os` in ones that follow GCC-style options, or I think `-O1` or `/O1` for MSVC. And this question is about GCC specifically. If someone is compiling anti-optimized `-O0` builds for code-size, they're completely shooting themselves in the foot, so I don't think that counts as a valid reason. (clang even has `-Oz` to opt. for size without caring about speed) – Peter Cordes Jul 06 '21 at 13:04
3

If you never use a source level debugger you probably could. But if you never use a source level debugger, you probably should.

Unoptimized code has a direct one-to-one correspondence to statements, expressions and variables in the source code, so when stepping through the code it all makes sense - all the lines are executed in the order you would expect, and all variables have a valid state when you would expect them to.

Optimised code on the other hand can eliminate code and variables, and reorder execution and generally render source level debugging a nonsense. Sometimes you get a bug that only appears in an optimised build, so you may have to deal with it, but generally such things are a result of undefined behaviour, and it is generally better to avoid that in the first instance.
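
As a small illustration (my own sketch, not part of the original answer): with GCC at -O2 the loop below is typically folded into a constant, so there is no counter to inspect and no iterations to step through, even though the source plainly contains both.

#include <stdio.h>

int main(void) {
    int sum = 0;

    // With optimisation enabled, GCC usually computes this whole loop at
    // compile time and just passes 4950 to printf: no `i`, no updates to
    // `sum`, and nothing for a source-level debugger to step through.
    for (int i = 0; i < 100; i++)
        sum += i;

    printf("%d\n", sum);
}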

One thing to consider is that in development you have performed all your testing and debugging on unoptimized code, so you could debug it. If, on the day you release it, you crank up the optimiser and ship it, you are essentially shipping a whole lot of untested code. Testing is hard, and you really should test what you release, so between building and releasing you may have a lot of work to do to eliminate the risk. Releasing to the same build spec that you have been testing every day throughout development may be lower risk.

For code running on a desktop, responding to and waiting for user input, or which is disk or network I/O bound, making the code faster or smaller often serves little purpose. There may be specific parts of a large application that will benefit, such as sorting or searching algorithms on large data sets, or image or audio processing, and for those you might use targeted rather than whole-application optimisation.
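
One way of doing that with GCC is the per-function `optimize` attribute sketched below (my example, not part of the original answer; the more conventional route is simply giving the hot source file its own optimisation flags in the build system):

// Hypothetical example: only this hot routine gets -O3-style code generation,
// while the rest of the translation unit can be compiled at -O0/-Og for debugging.
__attribute__((optimize("O3")))
void sort_ints(int *a, int n) {
    // plain insertion sort; the point is the attribute, not the algorithm
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
}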

In embedded systems, where you are often using processors much slower than desktop systems with much smaller memory resources, optimisation for both speed and size may be critical. But even there the code normally has to both fit and meet real-time deadlines in its debug build in order to support test and debugging. If it only works optimised, it will be much harder to debug.

Apart from optimising your code, it should perhaps be noted that in order to do that job the optimiser has to perform a much deeper analysis of the code, through techniques such as abstract execution, and in doing so can find bugs and issue warnings that normal compilation will not detect. For example, the optimiser is rather good at detecting variables that may be used before they are initialised. To that end, I would recommend switching on and max'ing the optimiser as a kind of "poor man's" static analysis, even if you use a lower optimisation level for release - for the reasons given earlier.
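
A hedged sketch of what that looks like in practice (my example, not the answer's): with a reasonably recent GCC, `gcc -Wall -O2` tends to report the possibly-uninitialised use below via -Wmaybe-uninitialized, while the same code often compiles silently at -O0, because the data-flow analysis behind the warning only runs as part of optimisation.

#include <stdio.h>

// `x` is only assigned on one branch, so it may be read uninitialised.
// Compiled with `gcc -Wall -O2` this typically produces
//   warning: 'x' may be used uninitialized [-Wmaybe-uninitialized]
// whereas `gcc -Wall -O0` often says nothing about it.
int f(int flag) {
    int x;
    if (flag)
        x = 42;
    return x;
}

int main(int argc, char **argv) {
    (void)argv;
    printf("%d\n", f(argc - 1));   // flag is 0 when run with no arguments
}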

The optimiser is also the most complex part of any compiler; if the compiler is going to have a bug, it is likely to be in the optimiser. That said, I have only ever encountered one such confirmed bug, in Microsoft C v6.0 (1989)! More often, what at first appears to be a compiler bug turns out to be undefined behaviour or latent bugs in the source being compiled that manifest themselves with different code generation options.

Clifford
  • Fortunately modern compilers have things like `gcc -fsanitize=undefined`. Certainly not perfect; there are some kinds of UB it won't catch, though, so it's not a free pass. Re: code-gen differences, there's a canonical Q&A for that: [Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?](https://stackoverflow.com/q/53366394) – Peter Cordes Jul 05 '21 at 21:40
3

Personally I usually have optimisation turned on.

My reasons are:

The shipped code is built with optimisation, as we need the -- especially numerical -- performance. Since you can't ship what you haven't tested, the test version must also be optimised. It would, I suppose, be possible to build without optimisation during development, but I begrudge the extra time to then build with optimisation and test again prior to release. Moreover, performance is sometimes part of the spec, so some development testing has to be done with optimised code.

I don't find using a debugger so very tough with optimised code. Mind you, given the kind of programs I mostly write -- fancy filters without user interfaces and numerical libraries -- printf and valgrind (which works fine with optimised code) are my preferred tools.

In recent versions of gcc, at least, more and better diagnostics are produced with optimisation on rather than off.

This, like so much else in programming, will of course vary with circumstances.

dmuir
  • Also note that optimization isn't a binary choice. GCC and clang have `-Og` and `-O1` for "light" optimization that doesn't take much extra compile time, but will do register allocation instead of always spilling everything (except `register` variables) to their stack slots after every statement. (Which is [somewhat necessary](https://stackoverflow.com/questions/53366394/why-does-clang-produce-inefficient-asm-with-o0-for-this-simple-floating-point) to support `jump` to another source line in GDB, or modification of any non-const variable while stopped at any breakpoint) – Peter Cordes Jul 05 '21 at 21:46
  • `-Og` is explicitly intended for compile/test/edit cycles when you don't want as much slowness as the default `-O0`, and hopefully not hurting debuggability too much. Certainly less than `-O3 -march=native -flto` auto-vectorization with LTO cross-file inlining. – Peter Cordes Jul 05 '21 at 21:47
2

One reason is probably just: tradition. The first C compiler was written for the DEC PDP-11, which had a 64k address space. (That's right, a tenth of that famous but mythical old IBM PC quote about "640k should be enough for anybody".) The first C compiler ran as quite a number of separate programs or passes: there was the preprocessor cpp, the parser c0, the code generator c1, the assembler as, and the linker ld. If you asked for optimization, it ran as a separate pass c2 which was a "peephole optimizer" operating on c1's output, before passing it to as.

Compilation was much slower in those days than it is today (because of course the processors were much slower). People didn't routinely request optimization for everyday work, because it really did cost you something significant in your edit/compile/debug cycle.

And although a whole lot has changed since then, the fact that optimization is something extra, something special, that you have to request explicitly, lives on.

Steve Summit