
I have heard that one should not compile with the -O3 option in gcc. Is that true? If so, what are the reasons for avoiding -O3?

user877329
  • It's not really *harmful*, but it might not make your code faster. Sometimes it's better to use `-O2`, and possibly add some individual optimizations from `-O3`. – Some programmer dude Aug 08 '14 at 07:23
  • _"I have heard"_. From where? And did that person provide any arguments in favor of his/her advice? – Michael Aug 08 '14 at 07:23
  • @Michael, The person did not provide any arguments. If they had, I wouldn't have had to ask. – user877329 Aug 08 '14 at 07:38
  • possible duplicate of [Is optimisation level -O3 dangerous in g++?](http://stackoverflow.com/questions/11546075/is-optimisation-level-o3-dangerous-in-g) – Michael Aug 08 '14 at 07:44
  • @Michael It is. Why didn't I find this question when searching? – user877329 Aug 08 '14 at 09:00

1 Answer


The answer is: it depends on your code.

The basic rule of thumb is like this:

  • At -O1 the compiler does optimizations that don't take too long to compute.

  • At -O2 the compiler does "expensive" optimizations that may slow down the compile process. They might also make the output program a little larger, but usually not by much.

  • -Os is roughly the same as -O2, but the optimizations are tuned for size rather than speed. For the most part the two goals don't conflict (more optimal code executes fewer instructions and is therefore usually smaller), but there are some tricks that duplicate code to avoid branching penalties, for example.

  • At -O3 the compiler really cranks up the space-hungry optimizations. It will inline functions much more aggressively, and try to use vectorization where possible (a way to inspect the difference is sketched below).
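For instance, here is a minimal sketch (the file name `myprog.c` is a placeholder) of how you can inspect what -O3 actually changes for your own code:

```
# Emit assembly at each level and compare the output.
gcc -O2 -S myprog.c -o myprog-O2.s
gcc -O3 -S myprog.c -o myprog-O3.s
diff myprog-O2.s myprog-O3.s

# -fopt-info-vec (GCC 4.9+) reports which loops the vectorizer
# transformed; auto-vectorization is one of the passes -O3 enables.
gcc -O3 -fopt-info-vec -c myprog.c
```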

You can read more details in the GCC documentation. If you really want to super-optimize your code then you can try enabling options that are not turned on even at -O3; the `-floop-*` options, for instance.
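As a sketch (not a recommendation: these particular flags require a GCC built with Graphite support, and whether they help at all depends entirely on your code):

```
gcc -O3 -floop-interchange -floop-strip-mine -floop-block -o myprog myprog.c
```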

The problem with speed-space optimizations, in particular, is that they can have a negative impact on the effectiveness of your memory caches. The code might be better for the CPU, but if it's not better for your memory, then you lose. For this reason, if your program doesn't have a single hot-spot where it spends all its time, then you might find it is slowed down overall.
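If you suspect that is happening, measure it rather than guessing. For example, on Linux you could compare the two builds with `perf` (binary names here are placeholders):

```
gcc -O2 -o myprog-O2 myprog.c
gcc -O3 -o myprog-O3 myprog.c
perf stat -e instructions,cache-misses ./myprog-O2
perf stat -e instructions,cache-misses ./myprog-O3
# If the -O3 build retires fewer instructions but suffers more cache
# misses and runs longer, the larger code is hurting you.
```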

Real-world optimization is an imprecise science, for three reasons:

  1. Users' hardware varies a lot.

  2. What's good for one code base might not be good for another.

  3. We want the compiler to run quickly, so it has to make educated guesses rather than trying every option and picking the best.

Basically, the answer is always: if performance matters, try all the optimization levels, measure how well your code performs, and choose the best one for you (a minimal version of that loop is sketched below). And do it again every time something big changes.
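Something like this is enough to get started (assuming `myprog.c` runs a representative workload by itself):

```
for opt in -O1 -O2 -Os -O3; do
    gcc $opt -o myprog myprog.c
    echo "== $opt =="
    time ./myprog
done
```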

If performance does not matter, -O2 is the choice for you.

ams
  • It might be useful to also pass `-mtune=native` (with `-O2` or `-O3`) – Basile Starynkevitch Aug 08 '14 at 09:37
  • Good point, although that option is not supported for all architectures. – ams Aug 08 '14 at 09:43
  • @ams (I guess you are not AMS) Optimization _is_ a precise science: The problem is formulated "Given function f(x), find x_0 such that for all x, f(x_0) ≤ f(x)". – user877329 Aug 08 '14 at 09:48
  • @user877329 I added "Real-world" now. Is that better? In general, perfect optimization is not achievable for all but the most basic programs. It's an unsolvable problem, even when the conditions are known and the goals are well defined. I've not proven that mathematically, but I have no wish to wait for the computer to try it. – ams Aug 08 '14 at 09:54
  • @ams Yes; as noted, after phrasing the optimization problem I listed all sorts of things that make life difficult. – user877329 Aug 09 '14 at 13:52