
I have a small program that performs much better when compiled with -O1 than with no optimisation. I would like to know which optimisation(s) performed by the compiler are responsible for this speedup.

My plan was to take the list of optimisation flags that -O1 is equivalent to (taken from both the man page and gcc -Q -v) and then pick away at the list to see how the performance changes.

However, I have found that even passing the whole list of optimisation flags does not give me a program that performs as well as one compiled with -O1.

In other words

gcc -O0 -fcprop-registers -fdefer-pop -fforward-propagate -fguess-branch-probability \
    -fif-conversion -fif-conversion2 -finline -fipa-pure-const -fipa-reference \
    -fmerge-constants -fsplit-wide-types -ftoplevel-reorder -ftree-ccp -ftree-ch \
    -ftree-copy-prop -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse \
    -ftree-fre -ftree-sink -ftree-sra -ftree-ter myprogram.c

is not the same as

gcc -O1 myprogram.c

I am using gcc version 4.5.3

Is there something else that -O1 does that isn't included in the list of optimisation flags associated with -O1 in the manual?

Jonathan Leffler
  • How does your program fare under `-O2` and `-O3`? – Jonathan Leffler Oct 07 '12 at 16:20
  • For the different optimisation levels 0 to 3 the timings are 31,16,14 and 19 seconds. – Mark Wassell Oct 08 '12 at 20:38
  • Interesting. I'm inclined to think that simply going with -O2 is going to be simpler than trying to decompose the tweaks, especially since it seems to be hard to determine which tweaks are really present in each level of optimization specified by -On. It depends on how crucial those 2 seconds are. Clearly, 31 to 14-16 seconds is a 50% decrease in time or 100% increase in speed; well worth having. But how much to fret about the difference between 14 and 16 seconds depends on your larger context. If it will be run once a month, it doesn't matter; if run a few times a minute, it's more crucial. – Jonathan Leffler Oct 08 '12 at 21:37

2 Answers


How about using -S option to check the produced assembler?

From two experiments, also using a file named "my_program.c", it seems that the -O0 option disables all optimizations regardless of the long list of optimization flags passed alongside it.
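One way to run that check is sketched below. It assumes gcc is on the PATH; `compare_asm` is a hypothetical helper name, and the output file names (plain.s, flags.s, o1.s) are arbitrary choices for illustration.

```shell
#!/bin/sh
# Sketch: compare the assembly GCC emits at -O0 with and without extra -f flags,
# and produce the -O1 assembly for reference.
# compare_asm is a hypothetical helper; the .s file names are placeholders.
compare_asm() {
    src=$1; shift
    # -S stops after compilation proper and writes assembly instead of an object file.
    gcc -O0 -S -o plain.s "$src" || return 1
    gcc -O0 "$@" -S -o flags.s "$src" || return 1
    gcc -O1 -S -o o1.s "$src" || return 1

    # cmp -s prints nothing and only sets the exit status.
    if cmp -s plain.s flags.s; then
        echo "-O0 assembly is identical with and without the extra flags"
    else
        echo "-O0 assembly differs when the extra flags are added"
    fi
    # Line counts give a rough impression of how much -O1 changes the output.
    wc -l plain.s flags.s o1.s
}
```

It could be called as `compare_asm myprogram.c -ftree-ccp -ftree-dce -fcprop-registers`, using any subset of the flags from the question. If the first two files turn out identical, the extra flags had no effect at -O0.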

Aki Suihkonen

This is expected, not a bug: https://gcc.gnu.org/wiki/FAQ#optimization-options

Is there something else that -O1 does that isn't included in the list of optimisation flags associated with -O1 in the manual?

Yes, it turns on optimization. Specifying individual -fxxx flags doesn't do that.

Unless you use one of the optimization levels -O1, -O2, -O3, -Ofast, or -Og (as opposed to -O0 or no -O option), no optimization happens at all, so adjusting which optimization passes are active doesn't do anything.

To find which optimization pass makes the difference you can turn on -O1 and then disable individual optimization passes until you find the one that makes a difference.

i.e. instead of:

gcc -fxxx -fyyy -fzzz ...

Use:

gcc -O1 -fno-xxx -fno-yyy -fno-zzz ...
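As a rough sketch of that procedure, the helper below builds one binary per disabled pass so that each can be timed separately. `build_variants` is a hypothetical name, the `prog-fno-...` output names are arbitrary, and the flags passed in are whichever -O1 passes you want to test.

```shell
#!/bin/sh
# Sketch: starting from -O1, build one executable per disabled optimization pass.
# build_variants is a hypothetical helper; output names are placeholders.
build_variants() {
    src=$1; shift
    for flag in "$@"; do
        # Turn e.g. -ftree-ccp into -fno-tree-ccp.
        neg="-fno-${flag#-f}"
        out="prog${neg}"
        if gcc -O1 "$neg" -o "$out" "$src"; then
            echo "built $out (gcc -O1 $neg)"
        else
            echo "failed to build with $neg" >&2
        fi
    done
}
```

For example, `build_variants myprogram.c -ftree-ccp -ftree-dce -ftree-sra` should leave `prog-fno-tree-ccp` and similar binaries in the current directory, each of which can then be run under `time`. The set of passes actually enabled at -O1 for a given GCC version can be listed with `gcc -Q -O1 --help=optimizers`.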
Jonathan Wakely
  • But removing all the O1 optimizations from O1 doesn't nearly get close to O0 level. The code is still fairly optimized (judging by the code size and local variable visibility during debugging). – Radzor Apr 23 '23 at 22:01
  • If you want no optimization, don't use -O1. If you want to find which optimization pass affects the performance (which is what the question is about), use -O1 and disable individual passes. – Jonathan Wakely Apr 25 '23 at 09:46