What GCC optimization flags and techniques are safe across CPUs?

Question

When compiling/linking C/C++ libraries or programs that are meant to work on all implementations of an ISA (e.g. x86-64), which optimization flags are safe from both the correctness and the run-time performance perspectives? I want optimizations that yield correct results and won't be detrimental performance-wise on any particular CPU. E.g., I would like to avoid optimization flags that yield run-time performance improvements on an 8th-gen Intel Core i7 but result in performance degradation on an AMD Ryzen.

Are PGO, LTO, and -O3 safe? Is it solely dependent on -march and -mtune (or the absence thereof)?

[Are compiler optimizations safe?](https://stackoverflow.com/q/9059265/995714), [When can I confidently compile program with -O3?](https://stackoverflow.com/q/14850593/995714), [Safety-critical software and optimising compilers](https://softwareengineering.stackexchange.com/q/267277/98103), [Can compiler optimization introduce bugs?](https://stackoverflow.com/q/2722302/995714). All `-Ox` optimizations are safe. If they produce different results then either there are bugs/UB in your code, or it's a compiler bug — phuclv, Sep 21 '18 at 05:20
[The risks of using PGO (profile-guided optimization) with production environment](https://stackoverflow.com/q/12776845/995714), [Is there any reason why not to use link time optimization?](https://stackoverflow.com/q/23736507/995714) — phuclv, Sep 21 '18 at 05:24

ams · Accepted Answer · 2018-09-27T12:27:17.087

They're all supposed to be "safe", assuming that your code is well defined.
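To illustrate what "well defined" means here, a minimal sketch follows. The function names are made up for this example. The first check relies on signed overflow, which is undefined behavior, so an optimizing compiler may legally assume it never happens and fold the comparison to a constant. The second expresses the same intent without overflowing.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical example: relies on signed overflow, which is undefined
 * behavior when x == INT_MAX. With optimization enabled the compiler
 * may reduce this to "return false". */
bool will_overflow_ub(int x)
{
    return x + 1 < x;
}

/* Well-defined version: test the boundary before doing any arithmetic,
 * so the result is the same at every optimization level. */
bool will_overflow_ok(int x)
{
    return x == INT_MAX;
}
```

If the results of a program change between -O0 and -O2, code like the first function is a more likely culprit than the optimizer itself.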

If you don't want to specialize for a particular CPU family then just leave -march and -mtune alone; the default suits a generic x86_64.
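One way to see which baseline a build assumes is to look at the ISA macros GCC predefines. The sketch below uses helper names invented for this example and assumes a stock GCC with no distribution-specific default architecture. On x86-64 without -march, I would expect `__SSE2__` to be defined and `__AVX2__` not to be. Building with something like -march=haswell should flip the second one, and the resulting binary may then fault on CPUs without AVX2.

```c
#include <stdio.h>

/* Hypothetical helper: 1 if the compiler was allowed to emit SSE2. */
int built_with_sse2(void)
{
#ifdef __SSE2__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if the compiler was allowed to emit AVX2. */
int built_with_avx2(void)
{
#ifdef __AVX2__
    return 1;
#else
    return 0;
#endif
}

/* Print the ISA extensions this translation unit was compiled for. */
void report_isa_baseline(void)
{
    printf("SSE2 assumed: %d\n", built_with_sse2());
    printf("AVX2 assumed: %d\n", built_with_avx2());
}
```

Calling `report_isa_baseline()` from `main` in a default build and again in a -march=native build shows the difference on your own machine.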

PGO is almost always a good idea; it's mostly used to optimize branches, for example by laying out the hot path as the fall-through case.

LTO and -O3 can have different effects on different code-bases. For example, if your code benefits from vectorization then -O3 is a big win over -O2, but the extra inlining and unrolling can lead to larger code sizes, and that can be a disadvantage on systems with more limited caches.
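As a sketch of the kind of code that benefits from vectorization, here is a simple loop (the function name `saxpy` is just a conventional label, not something from the question). The `restrict` qualifiers promise the compiler that the arrays don't overlap, which makes auto-vectorization easier. Compiling with -O3 -fopt-info-vec should make GCC report whether the loop was vectorized. I believe newer GCC releases also enable a limited form of vectorization at -O2, so the gap may be smaller than it used to be.

```c
#include <stddef.h>

/* y[i] = a * x[i] + y[i] for i in [0, n).
 * Iterations are independent and memory access is contiguous,
 * so this is a natural candidate for auto-vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Comparing the binary size and run time of such code at -O2 and -O3 gives a concrete picture of the trade-off described above.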

In the end, the only advice that ever really means anything here is: measure it and see what's good for your code.
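A minimal sketch of "measure it", assuming nothing beyond standard C: time a workload several times and keep the best run. The names `best_of` and `sample_work` are placeholders for this example; you would substitute your own hot function. Note that `clock()` reports CPU time rather than wall-clock time, which is usually fine for single-threaded comparisons between flag sets.

```c
#include <time.h>

/* Volatile sink so the compiler cannot discard the sample workload. */
static volatile unsigned long sink;

/* Placeholder workload; replace with the code you actually care about. */
void sample_work(void)
{
    unsigned long acc = 0;
    for (unsigned long i = 0; i < 1000000UL; i++) {
        acc += i * i; /* unsigned arithmetic wraps, so this is well defined */
    }
    sink = acc;
}

/* Run work() `runs` times and return the fastest CPU time in seconds.
 * Returns -1.0 if runs <= 0. */
double best_of(void (*work)(void), int runs)
{
    double best = -1.0;
    for (int i = 0; i < runs; i++) {
        clock_t t0 = clock();
        work();
        clock_t t1 = clock();
        double dt = (double)(t1 - t0) / CLOCKS_PER_SEC;
        if (best < 0.0 || dt < best) {
            best = dt;
        }
    }
    return best;
}
```

Build the same source with each candidate flag set, run it on every CPU you care about, and compare the numbers from `best_of(sample_work, 10)` or similar.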
