I'm measuring how the number of optimization flags affects the code coverage of GCC itself when it compiles some test cases. I know GCC has predefined optimization levels, such as -O1 and -O2. I also know how to list the concrete flags that a predefined level enables, using this command:

$ gcc -O2 -Q --help=optimizer

The following options control optimizations:
  -O<number>
  -Ofast
  -Og
  -Os
  -faggressive-loop-optimizations   [enabled]
  -falign-functions                 [enabled]
  -falign-functions=
  -falign-jumps                     [enabled]
  -falign-jumps=
  -falign-labels                    [enabled]
  -falign-labels=
  -falign-loops                     [enabled]
  -falign-loops=
  -fallocation-dce                  [enabled]
  -fallow-store-data-races          [disabled]
  -fassociative-math                [disabled]
  -fassume-phsa                     [available in BRIG]
  -fasynchronous-unwind-tables      [enabled]
  -fauto-inc-dec                    [enabled]
  -fbranch-count-reg                [enabled]
  -fbranch-probabilities            [disabled]
  -fcaller-saves                    [enabled]
  -fcode-hoisting                   [enabled]
  -fcombine-stack-adjustments       [enabled]
  -fcompare-elim                    [enabled]
  -fconserve-stack                  [disabled]
  -fcprop-registers                 [enabled]
  -fcrossjumping                    [enabled]
  -fcse-follow-jumps                [enabled]
  -fcx-fortran-rules                [disabled]
  -fcx-limited-range                [disabled]
  -fdce                             [enabled]
  -fdefer-pop                       [enabled]
  -fdelayed-branch                  [disabled]
  -fdelete-dead-exceptions          [disabled]
  -fdelete-null-pointer-checks      [enabled]
  -fdevirtualize                    [enabled]
  -fdevirtualize-speculatively      [enabled]
  -fdse                             [enabled]
  -fearly-inlining                  [enabled]
  -fexceptions                      [disabled]
  -fexcess-precision=[fast|standard]    [default]
  -fexpensive-optimizations         [enabled]
  -ffast-math
  -ffinite-loops                    [disabled]
  -ffinite-math-only                [disabled]
  -ffloat-store                     [disabled]
  -fforward-propagate               [enabled]
  -ffp-contract=[off|on|fast]       fast
  -ffp-int-builtin-inexact          [enabled]
  -ffunction-cse                    [enabled]
  -fgcse                            [enabled]
  -fgcse-after-reload               [disabled]
  -fgcse-las                        [disabled]
  -fgcse-lm                         [enabled]
  -fgcse-sm                         [disabled]
  -fgraphite                        [disabled]
  -fgraphite-identity               [disabled]
  -fguess-branch-probability        [enabled]
  -fhandle-exceptions               -fexceptions
  -fhoist-adjacent-loads            [enabled]
  -fif-conversion                   [enabled]
  -fif-conversion2                  [enabled]
  -findirect-inlining               [enabled]
  -finline                          [enabled]
  -finline-atomics                  [enabled]
  -finline-functions                [enabled]
  -finline-functions-called-once    [enabled]
  -finline-small-functions          [enabled]
  -fipa-bit-cp                      [enabled]
  -fipa-cp                          [enabled]
  -fipa-cp-clone                    [disabled]
  -fipa-icf                         [enabled]
  -fipa-icf-functions               [enabled]
  -fipa-icf-variables               [enabled]
  -fipa-profile                     [enabled]
  -fipa-pta                         [disabled]
  -fipa-pure-const                  [enabled]
  -fipa-ra                          [enabled]
  -fipa-reference                   [enabled]
  -fipa-reference-addressable       [enabled]
  -fipa-sra                         [enabled]
  -fipa-stack-alignment             [enabled]
  -fipa-vrp                         [enabled]
  -fira-algorithm=[CB|priority]     CB
  -fira-hoist-pressure              [enabled]
  -fira-loop-pressure               [disabled]
  -fira-region=[one|all|mixed]      [default]
  -fira-share-save-slots            [enabled]
  -fira-share-spill-slots           [enabled]
  -fisolate-erroneous-paths-attribute   [disabled]
  -fisolate-erroneous-paths-dereference     [enabled]
  -fivopts                          [enabled]
  -fjump-tables                     [enabled]
  -fkeep-gc-roots-live              [disabled]
  -flifetime-dse                    [enabled]
  -flifetime-dse=<0,2>              2
  -flimit-function-alignment        [disabled]
  -flive-patching                   -flive-
                              patching=inline-clone
  -flive-patching=[inline-only-static|inline-clone]     [default]
  -flive-range-shrinkage            [disabled]
  -floop-interchange                [disabled]
  -floop-nest-optimize              [disabled]
  -floop-parallelize-all            [disabled]
  -floop-unroll-and-jam             [disabled]
  -flra-remat                       [enabled]
  -fmath-errno                      [enabled]
  -fmodulo-sched                    [disabled]
  -fmodulo-sched-allow-regmoves     [disabled]
  -fmove-loop-invariants            [enabled]
  -fnon-call-exceptions             [disabled]
  -fnothrow-opt                     [available in C++,
                              ObjC++]
  -fomit-frame-pointer              [enabled]
  -fopt-info                        [disabled]
  -foptimize-sibling-calls          [enabled]
  -foptimize-strlen                 [enabled]
  -fpack-struct                     [disabled]
  -fpack-struct=<number>
  -fpartial-inlining                [enabled]
  -fpatchable-function-entry=
  -fpeel-loops                      [disabled]
  -fpeephole                        [enabled]
  -fpeephole2                       [enabled]
  -fplt                             [enabled]
  -fpredictive-commoning            [disabled]
  -fprefetch-loop-arrays            [enabled]
  -fprintf-return-value             [enabled]
  -fprofile-partial-training        [disabled]
  -fprofile-reorder-functions       [disabled]
  -freciprocal-math                 [disabled]
  -free                             [enabled]
  -freg-struct-return               [disabled]
  -frename-registers                [enabled]
  -freorder-blocks                  [enabled]
  -freorder-blocks-algorithm=[simple|stc]   stc
  -freorder-blocks-and-partition    [enabled]
  -freorder-functions               [enabled]
  -frerun-cse-after-loop            [enabled]
  -freschedule-modulo-scheduled-loops   [disabled]
  -frounding-math                   [disabled]
  -frtti                            [available in C++,
                              D, ObjC++]
  -fsave-optimization-record        [disabled]
  -fsched-critical-path-heuristic   [enabled]
  -fsched-dep-count-heuristic       [enabled]
  -fsched-group-heuristic           [enabled]
  -fsched-interblock                [enabled]
  -fsched-last-insn-heuristic       [enabled]
  -fsched-pressure                  [disabled]
  -fsched-rank-heuristic            [enabled]
  -fsched-spec                      [enabled]
  -fsched-spec-insn-heuristic       [enabled]
  -fsched-spec-load                 [disabled]
  -fsched-spec-load-dangerous       [disabled]
  -fsched-stalled-insns             [disabled]
  -fsched-stalled-insns-dep         [enabled]
  -fsched-stalled-insns-dep=<number>
  -fsched-stalled-insns=<number>
  -fsched2-use-superblocks          [disabled]
  -fschedule-fusion                 [enabled]
  -fschedule-insns                  [disabled]
  -fschedule-insns2                 [enabled]
  -fsection-anchors                 [disabled]
  -fsel-sched-pipelining            [disabled]
  -fsel-sched-pipelining-outer-loops    [disabled]
  -fsel-sched-reschedule-pipelined  [disabled]
  -fselective-scheduling            [disabled]
  -fselective-scheduling2           [disabled]
  -fshort-enums                     [enabled]
  -fshort-wchar                     [disabled]
  -fshrink-wrap                     [enabled]
  -fshrink-wrap-separate            [enabled]
  -fsignaling-nans                  [disabled]
  -fsigned-zeros                    [enabled]
  -fsimd-cost-model=[unlimited|dynamic|cheap]   unlimited
  -fsingle-precision-constant       [disabled]
  -fsplit-ivs-in-unroller           [enabled]
  -fsplit-loops                     [disabled]
  -fsplit-paths                     [disabled]
  -fsplit-wide-types                [enabled]
  -fsplit-wide-types-early          [disabled]
  -fssa-backprop                    [enabled]
  -fssa-phiopt                      [enabled]
  -fstack-check=[no|generic|specific]
  -fstack-clash-protection          [disabled]
  -fstack-protector                 [disabled]
  -fstack-protector-all             [disabled]
  -fstack-protector-explicit        [disabled]
  -fstack-protector-strong          [disabled]
  -fstack-reuse=[all|named_vars|none]   all
  -fstdarg-opt                      [enabled]
  -fstore-merging                   [enabled]
  -fstrict-aliasing                 [enabled]
  -fstrict-enums                    [available in C++,
                              ObjC++]
  -fstrict-volatile-bitfields       [enabled]
  -fthread-jumps                    [enabled]
  -fno-threadsafe-statics           [available in C++,
                              ObjC++]
  -ftoplevel-reorder                [enabled]
  -ftracer                          [disabled]
  -ftrapping-math                   [enabled]
  -ftrapv                           [disabled]
  -ftree-bit-ccp                    [enabled]
  -ftree-builtin-call-dce           [enabled]
  -ftree-ccp                        [enabled]
  -ftree-ch                         [enabled]
  -ftree-coalesce-vars              [enabled]
  -ftree-copy-prop                  [enabled]
  -ftree-cselim                     [enabled]
  -ftree-dce                        [enabled]
  -ftree-dominator-opts             [enabled]
  -ftree-dse                        [enabled]
  -ftree-forwprop                   [enabled]
  -ftree-fre                        [enabled]
  -ftree-loop-distribute-patterns   [enabled]
  -ftree-loop-distribution          [disabled]
  -ftree-loop-if-convert            [enabled]
  -ftree-loop-im                    [enabled]
  -ftree-loop-ivcanon               [enabled]
  -ftree-loop-optimize              [enabled]
  -ftree-loop-vectorize             [disabled]
  -ftree-lrs                        [disabled]
  -ftree-parallelize-loops=<number>     1
  -ftree-partial-pre                [disabled]
  -ftree-phiprop                    [enabled]
  -ftree-pre                        [enabled]
  -ftree-pta                        [enabled]
  -ftree-reassoc                    [enabled]
  -ftree-scev-cprop                 [enabled]
  -ftree-sink                       [enabled]
  -ftree-slp-vectorize              [disabled]
  -ftree-slsr                       [enabled]
  -ftree-sra                        [enabled]
  -ftree-switch-conversion          [enabled]
  -ftree-tail-merge                 [enabled]
  -ftree-ter                        [enabled]
  -ftree-vectorize
  -ftree-vrp                        [enabled]
  -funconstrained-commons           [disabled]
  -funroll-all-loops                [disabled]
  -funroll-loops                    [disabled]
  -funsafe-math-optimizations       [disabled]
  -funswitch-loops                  [disabled]
  -funwind-tables                   [disabled]
  -fvar-tracking                    [enabled]
  -fvar-tracking-assignments        [enabled]
  -fvar-tracking-assignments-toggle     [disabled]
  -fvar-tracking-uninit             [disabled]
  -fvariable-expansion-in-unroller  [disabled]
  -fvect-cost-model=[unlimited|dynamic|cheap]   cheap
  -fversion-loops-for-strides       [disabled]
  -fvpt                             [disabled]
  -fweb                             [enabled]
  -fwrapv                           [disabled]
  -fwrapv-pointer                   [disabled]

I compiled some test cases with -O2, and 440 files and 167,035 lines of GCC source code were covered. But when I concatenated every enabled flag into a string S and replaced -O2 with S when compiling the same test cases, only 390 files and 90,385 lines were covered.

Can someone tell me why? I suspect there is some mechanism in GCC that makes S and -O2 behave very differently.
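For reference, the string S described above could be assembled along these lines. This is only a sketch of one plausible procedure, not necessarily the one used for the measurements: the helper name `enabled_flags` and the file name `test.c` are made up for illustration. It keeps only the lines marked `[enabled]`, so options that take a value (for example `-fvect-cost-model=`) are not carried over into S.

```shell
# Hypothetical helper: read `gcc -Q --help=optimizers` output on stdin and
# print every flag marked [enabled] on one space-separated line.
enabled_flags() {
    awk '$2 == "[enabled]" { printf "%s ", $1 } END { print "" }'
}

# Only runs when gcc is installed; test.c is a placeholder file name.
if command -v gcc >/dev/null 2>&1; then
    S=$(gcc -O2 -Q --help=optimizers | enabled_flags)
    echo "S contains $(echo "$S" | wc -w) flags"
    # gcc $S -c test.c    # the build that replaces -O2 with S
fi
```

An S built this way contains no `-O` option at all, so a compile that uses S in place of -O2 still runs at the compiler's default optimization level.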

潇洒张
  • 1
    Whether optimization is enabled at all is special, and not something you can control with any combination of `-f` options. Without any `-O` option, the default `-O0` still ensures consistent debugging, so GCC can't / won't optimize across statements: it doesn't keep values in registers or do constant propagation except on actual `const` variables. There is no `-f` option for this. See Marc Glisse's comment on [Find out exact gcc implicit options](https://stackoverflow.com/posts/comments/63283211), which also quotes the intro part of https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html – Peter Cordes Nov 24 '22 at 03:37
  • 1
    See [Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?](https://stackoverflow.com/q/53366394) re: what `-O0` debug-mode code-gen looks like and why. You could use `-O1` or `-Og` and a bunch of `-f` options, but it's still a problem that **Not all optimizations are controlled directly by a flag. Only optimizations that have a flag are listed in this section** [of the manual](https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html). – Peter Cordes Nov 24 '22 at 03:38
  • 1
    [GCC standard optimizations behavior](https://stackoverflow.com/q/33832997) has an answer about the fact that a set of `-f` options doesn't add up as equivalent to the `-O` that enables them. – Peter Cordes Nov 24 '22 at 03:59
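
A quick way to see the effect described in these comments is to look at the predefined macro `__OPTIMIZE__`, which GCC defines only when an optimizing `-O` level is in effect. The sketch below assumes `gcc` is on the PATH; the helper name `optimize_state` is hypothetical, and the three `-f` flags are arbitrary picks from the list above.

```shell
# Hypothetical helper: read `gcc -dM -E` output on stdin and report whether
# the predefined macro __OPTIMIZE__ is present.
optimize_state() {
    if grep -q '^#define __OPTIMIZE__ '; then
        echo optimizing
    else
        echo not-optimizing
    fi
}

# Only runs when gcc is installed.
if command -v gcc >/dev/null 2>&1; then
    # With -O2 the optimizers are switched on.
    gcc -O2 -dM -E -x c /dev/null | optimize_state
    # A few -f flags without any -O level: still the -O0 default.
    gcc -fgcse -ftree-vrp -fipa-cp -dM -E -x c /dev/null | optimize_state
fi
```

On a typical installation the first command should report `optimizing` and the second `not-optimizing`, which is consistent with the manual's statement that most optimizations stay disabled at `-O0` even if individual optimization flags are specified.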

0 Answers