Consider the following example:
static int sum(int x)
{
    int s = 0;
    for (int j = 1; j <= x; ++j)
        s += j;
    return s;
}
int main() { return sum(10); }
When you compile this code in clang using -O1, the sum function (if it weren't static) is re-implemented by the compiler using the arithmetic progression formula f(x)=x(x+1)/2 (actually, g(x)=(x-1)(x-2)/2+2x-1, which is mathematically the same but helps with overflow issues), and main is re-implemented to return 55 directly (and then sum is removed).
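To double-check that f and g really are the same function, here is a small sketch that compares both closed forms against the original loop. The helper names (sum_loop, sum_f, sum_g, check_forms) are mine, not anything the compiler emits. I widen everything to long long and add an x < 1 guard so the closed forms return 0 where the loop does not run. This only checks algebraic agreement on a range of inputs; it says nothing about how clang itself handles overflow.

```c
#include <assert.h>

/* Reference: the loop from the question, accumulated in long long. */
static long long sum_loop(int x)
{
    long long s = 0;
    for (int j = 1; j <= x; ++j)
        s += j;
    return s;
}

/* f(x) = x(x+1)/2, guarded so that x < 1 yields 0 like the loop. */
static long long sum_f(int x)
{
    long long n = x;
    return x < 1 ? 0 : n * (n + 1) / 2;
}

/* g(x) = (x-1)(x-2)/2 + 2x - 1, with the same guard. */
static long long sum_g(int x)
{
    long long n = x;
    return x < 1 ? 0 : (n - 1) * (n - 2) / 2 + 2 * n - 1;
}

/* Asserts that the loop and both closed forms agree on [lo, hi]. */
static void check_forms(int lo, int hi)
{
    for (int x = lo; x <= hi; ++x) {
        assert(sum_loop(x) == sum_f(x));
        assert(sum_loop(x) == sum_g(x));
    }
}
```

Calling check_forms(-5, 2000) from a main of your own should pass silently. Both products, x(x+1) and (x-1)(x-2), are products of two consecutive integers, so the division by 2 is always exact.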
NOTE: I'm using godbolt for all of my tests, using x86-64 clang (trunk).
All of this, I understand, happens because of the following set of optimization techniques all working together:
- The compiler performs a scalar evolution analysis of sum. It detects that the value of s in the x-th iteration is g(x), so the computation of s is moved outside the loop.
- Since the loop is now empty, it's removed.
- Since the function has now shrunk in size, it's a candidate for inlining, and main's call to sum is replaced by the body of the function.
- The replaced expression depends solely on a constant, so the expression is evaluated at compile time, giving 55 (and since sum is now a static function that no one calls, it's removed, probably by the linker I guess).
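To make the four steps concrete, this is how I picture them if they were applied by hand at the source level. It is only an approximation under my own assumptions: the compiler works on its intermediate representation, not on C source, and the real intermediate forms will differ. All the stage names below are hypothetical.

```c
#include <assert.h>

/* Stage 0: the original program. */
static int sum_stage0(int x)
{
    int s = 0;
    for (int j = 1; j <= x; ++j)
        s += j;
    return s;
}
static int main_stage0(void) { return sum_stage0(10); }

/* Stages 1 and 2: the loop result is replaced by a closed form and the
   now-empty loop is deleted. The x < 1 guard covers the case where the
   loop body never runs. (Plain int can overflow for large x here; the
   compiler takes care of that, this sketch does not.) */
static int sum_stage2(int x) { return x < 1 ? 0 : x * (x + 1) / 2; }
static int main_stage2(void) { return sum_stage2(10); }

/* Stage 3: the small function is inlined into its only caller. */
static int main_stage3(void) { return 10 < 1 ? 0 : 10 * (10 + 1) / 2; }

/* Stage 4: the constant expression is folded, and sum is now unused. */
static int main_stage4(void) { return 55; }

/* Asserts that every stage returns the same value. */
static void check_stages(void)
{
    assert(main_stage0() == 55);
    assert(main_stage2() == 55);
    assert(main_stage3() == 55);
    assert(main_stage4() == 55);
}
```

Each stage keeps the observable result of main unchanged, which is the property each transformation has to preserve.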
I would like to know which specific compilation flags (besides any -Ox) are related to each one of the four points.
- Scalar evolution, but I haven't found which -fsomething compilation flag enables it.
- I guess it's related to dead code elimination?
- Inline analysis. I have tried to implement the program as:
inline int sum(int x) { return x*(x+1)/2; }
int main() { return sum(10); }
and then activate the flag -finline (instead of -O1), but main keeps calling sum, so no inlining is being performed. I would like to know what I should do (which specific compiler flags to activate) to make the body replacement happen.
- Constant folding? I know that if I implement main as:
int main()
{
    int const x = 10;
    return x * (x + 1) / 2;
}
constant folding applies even with -O0 and no other flag active, but if I remove const from x, then it doesn't (with -O1 it does). Which flag forces constant folding even if x is not a constant?
I know that this feels like several different questions, but it's actually one: what is the minimal set of specific compiler flags required to make the compiler give the same output for the original program as if -O1 were active?