
I noticed that in Qt Creator the default optimization level for the release version is -O2. I was wondering: why not -O3? I've read here on Stack Overflow that it can be dangerous or "bug exposing", but which of those optimization flags are considered to be more risky than helpful, and why?

Optimization level 3 flags (on GCC), with a small illustration after the list:

  • -fgcse-after-reload
  • -finline-functions
  • -fipa-cp-clone
  • -fpredictive-commoning
  • -ftree-vectorize
  • -funswitch-loops
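
For concreteness, here is a minimal sketch (the file and function names are invented) of the kind of loop that `-ftree-vectorize`, one of the flags above, exists for. You can also dump the full set of flags each level enables with `gcc -O2 -Q --help=optimizers` and `gcc -O3 -Q --help=optimizers` and compare the two listings; the exact difference depends on the GCC version.

```c
/* saxpy.c -- a made-up example of the kind of loop that -ftree-vectorize
 * (one of the -O3 flags above) targets.  Behaviour varies with the GCC
 * version: older releases typically vectorize this only at -O3, while
 * newer ones also perform limited vectorization at -O2.
 *
 *   gcc -O2 -c saxpy.c -fopt-info-vec   # may report nothing
 *   gcc -O3 -c saxpy.c -fopt-info-vec   # typically prints a "loop vectorized" note
 */
void saxpy(float *restrict y, const float *restrict x, float a, int n)
{
    /* No loop-carried dependence, so the compiler is free to process
       several elements per iteration with SIMD instructions. */
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```
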
  • 2
    It's more like every one used O2, so it is better tested. There is no sepcific thing, that causes O3 code compiled to be buggy. It's more like things are most of things are testes on O2, so it's less likely you will get bug, when using O2 optimizations. O3 should not generate bugs (with assumption, that there is no bug in compiler) – DawidPi Mar 07 '16 at 10:07
  • 3
    @DawidPi, If that was the case, why wouldn't O3 not be the level everyone used if it usually delivered better or equal performance? I think we will learn something from this question. If nobody else gets there first, I will do some research into this. – merlin2011 Mar 07 '16 at 10:10
  • 1
    http://stackoverflow.com/questions/5637828/why-would-one-ever-want-to-compile-with-o2-instead-of-o3 there is something about it – DawidPi Mar 07 '16 at 10:14
  • @Claudio, This question seems to be asking about a different type of harm (correctness bugs) than the accepted answer to the duplicate question addresses (possible performance degradation due to poor caching). – merlin2011 Mar 07 '16 at 10:23
  • @merlin2011 O3 is the level I use for production-ready code; there is no reason not to use the options that give you the best performance. – ouah Mar 07 '16 at 10:27
  • @ouah, Would you assert then that `-O2` and `-O3` have equal risk of correctness bugs introduced by the compiler? I'm asking because I actually do not know without further research, but I hope you do. :) – merlin2011 Mar 07 '16 at 10:29
  • If one of the techniques is more aggressive inlining, there could be some degradation due to register availability and consequently inefficient instruction caching. But not necessarily any derived bugs. – Frankie_C Mar 07 '16 at 10:35
  • 1
    @merlin2011 there is no gcc option that (except if there is a compiler bug) that can introduce bugs on a conforming program. Moreover the compiler optimizations that are more prone to give you issues if you program is not conforming (like -fstrict-aliasing for example) are enabled below `-O3`. – ouah Mar 07 '16 at 10:35
  • By the way, in C (gcc 5.3) these are not the only options enabled by -O3 compared to -O2; there are also: -ftree-loop-distribute-patterns, -ftree-loop-vectorize, -ftree-partial-pre, -ftree-slp-vectorize and -fvect-cost-model. (Use gcc's `-Q --help=optimizers` option to find out the differences.) – ouah Mar 07 '16 at 10:40

2 Answers


Aside from compiler bugs, this is probably a myth. It's the -Ofast option that's risky, because that one doesn't even guarantee that standards-compliant programs will not break.
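
As a hedged illustration (the file name is invented, and the exact outcome depends on the compiler version), this is the sort of thing `-Ofast` is permitted to do to a conforming program, because it enables `-ffast-math` and with it `-ffinite-math-only`:

```c
/* fastmath_demo.c -- an invented illustration of why -Ofast is the genuinely
 * risky level: it enables -ffast-math, which includes -ffinite-math-only,
 * letting the compiler assume NaN never occurs.
 *
 *   gcc -O3    fastmath_demo.c && ./a.out   # prints "NaN detected"
 *   gcc -Ofast fastmath_demo.c && ./a.out   # may print "looks fine", because
 *                                           # the x != x test can be folded
 *                                           # away as always false
 */
#include <stdio.h>

int main(void)
{
    volatile double zero = 0.0;   /* volatile: force the division at run time */
    double x = zero / zero;       /* produces a NaN */

    if (x != x)                   /* a perfectly conforming NaN check */
        puts("NaN detected");
    else
        puts("looks fine");
    return 0;
}
```

Nothing enabled by -O2 or -O3 is allowed to perform that kind of transformation on a conforming program.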

Just as a practical example, here's a quick search of libraries, in no particular order, inside the Android Open Source Project (AOSP)—which is probably enough of a "real production codebase"—that use -O3:

cblas, openssh, libmpeg2, libavc, llvm: MCJIT, jpeg, zlib, lz4, regex-re2, libpng, libutf, (and more)

Other code in the AOSP simply tries to optimize for size (still, Android), so it explicitly uses -Os. But much of that code still uses all these libraries—for which performance is a bigger consideration than size. Notice, too, that correctness is probably a big issue especially for the likes of openssh, which is incidentally mentioned above.

Keep in mind that programmers always tend to get more suspicious of optimized code. When you're not set on writing standards-compliant code, and delve into the territory of undefined & unspecified behavior, there's theoretically nothing to prevent the compiler from generating different results between different configurations (such as optimization level).

So when you mostly do your work at one particular level (debug), you tend to take that as your basic point of reference, and then, when you switch, it's easy to simply blame the optimizer, which, for its part, may go to great lengths to remain standards-compliant, unlike our own code.
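
A minimal sketch of how that usually plays out in practice (the program below contains undefined behaviour, signed integer overflow, so any output is allowed; whether the compiled result actually differs between levels depends on your compiler and version):

```c
/* ub_demo.c -- an invented example: the bug is in the code, not in the
 * optimizer.  Overflowing a signed int is undefined behaviour, so the
 * compiler may assume a positive value that keeps doubling stays positive
 * and turn this into an infinite loop, or it may let the value wrap and
 * terminate.  Both results are "correct", and they can legitimately differ
 * between -O0, -O2 and -O3, and between compiler versions.
 *
 *   gcc -O0 ub_demo.c && ./a.out
 *   gcc -O3 ub_demo.c && ./a.out
 *   gcc -O1 -fsanitize=undefined ub_demo.c && ./a.out   # typically reports
 *                                                       # the overflow at run time
 */
#include <stdio.h>

int main(void)
{
    int i = 1;
    int steps = 0;
    while (i > 0) {   /* "obviously" ends once i overflows...        */
        i += i;       /* ...except that signed overflow is undefined */
        steps++;
    }
    printf("%d doublings\n", steps);
    return 0;
}
```

Tools like `-fsanitize=undefined` are usually a faster route to the real culprit than bisecting optimization flags.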

Yam Marcovic

A summary of the excellent links in the comments: -O2 is often preferred over higher optimization levels due to:

  • smaller generated code size (which often means better performance, because more of the code fits in the processor's instruction cache and branch-prediction structures)
  • shorter compilation time
  • bugs (in your code) that are more likely to be exposed at a higher optimization level.

Optimizer bugs are not unheard of, but most times the real cause is undefined behavior in the source code. On the other hand, I remember a 15+ year old (commercial) compiler which was riddled with optimization bugs and in one instance even managed to "fix" a faulty program.

About the speed difference between optimization levels: the documentation for SunCC read something like "-O4 is in general faster than -O3, and -O3 is in general faster than -O2. But sometimes -O2 beats all the others." In my experience, -O2 was often faster, not just sometimes.
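
If you want to know which level wins for your program, the only reliable answer comes from measuring it yourself; here is a minimal, made-up timing skeleton (the file name and workload are placeholders):

```c
/* bench.c -- a made-up skeleton for settling the question empirically:
 * build the same file at different levels and time a workload you care about.
 *
 *   gcc -O2 bench.c -o bench_O2 && ./bench_O2
 *   gcc -O3 bench.c -o bench_O3 && ./bench_O3
 *
 * (On older glibc you may need to add -lrt for clock_gettime.)
 */
#include <stdio.h>
#include <time.h>

static volatile double sink;   /* keeps the result "used" so the work isn't removed */

/* Placeholder workload: replace it with something representative of your
   real program; toy loops rarely predict real-world differences well. */
static double work(void)
{
    double acc = 0.0;
    for (long i = 0; i < 100000000L; i++)
        acc += (double)i * 0.5;
    return acc;
}

int main(void)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    sink = work();
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("workload took %.3f s\n", secs);
    return 0;
}
```
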

Markus Kull
  • But when you're talking about bugs, what's the difference between levels? Does it say anywhere in the GCC docs that "Optimizations that are more likely to be buggy go into higher numeric levels"? – Yam Marcovic Mar 07 '16 at 10:44
  • @YamMarcovic Nope, of course not. Optimizer bugs (which cause behavior other than what was intended) are compiler bugs and should be fixed regardless of level. Source code with undefined behavior or `-ffast-math`... these are not compiler bugs. – Markus Kull Mar 07 '16 at 10:51