Given the following two functions:
int f() { return 0; }
int g() { return 1; }
And the following code to invoke one of them depending on a boolean b:
int t0(bool b) { return (b ? &f : &g)(); }
int t1(bool b) { return b ? f() : g(); }
int t2(bool b) { return b ? t0(true) : t0(false); }
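As a sanity check that the three variants are interchangeable at the source level, here is a minimal sketch. The helper check_variants is hypothetical (it is not part of the original code) and only asserts that t0, t1 and t2 return the same value for both inputs:

```cpp
#include <cassert>

int f() { return 0; }
int g() { return 1; }

// The three dispatch variants from the question.
int t0(bool b) { return (b ? &f : &g)(); }          // call through a selected function pointer
int t1(bool b) { return b ? f() : g(); }            // select between two direct calls
int t2(bool b) { return b ? t0(true) : t0(false); } // call t0 with a constant argument per branch

// Hypothetical helper (not in the original code): asserts that all three
// variants agree for both possible values of b.
void check_variants() {
    const bool inputs[] = {false, true};
    for (bool b : inputs) {
        assert(t0(b) == t1(b));
        assert(t1(b) == t2(b));
    }
}
```

Calling check_variants() from a test main should pass silently, so any difference between the variants is in the generated code only, not in observable behavior.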
Both g++ (trunk) and clang++ (trunk), with -std=c++2a -Ofast -march=native, fail to optimize the following code:
int main(int ac, char**) { return t0(ac & 1); }
Producing the following assembly:
main:
        and     edi, 1
        mov     eax, OFFSET FLAT:f()
        mov     edx, OFFSET FLAT:g()
        cmove   rax, rdx
        jmp     rax
Invoking t1 or t2 (instead of t0) produces the following optimized assembly:
main:
        mov     eax, edi
        not     eax
        and     eax, 1
        ret
Everything can be reproduced live on gcc.godbolt.org.
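For reference, the optimized assembly for t1 and t2 appears to compute (~ac) & 1: the low bit of ac selects f (returning 0) when set and g (returning 1) when clear. A minimal sketch of that reading follows; the helper names folded and check_folded are hypothetical and only mirror the three instructions in C++:

```cpp
#include <cassert>

int f() { return 0; }
int g() { return 1; }
int t1(bool b) { return b ? f() : g(); }

// Hypothetical helper mirroring the optimized listing:
//   mov eax, edi ; not eax ; and eax, 1
int folded(int ac) { return ~ac & 1; }

// Hypothetical helper: asserts that the folded form matches t1(ac & 1)
// for a handful of small argument counts.
void check_folded() {
    for (int ac = 0; ac < 8; ++ac) {
        assert(folded(ac) == t1(ac & 1));
    }
}
```

This is the result one would expect the compiler to reach for t0 as well, once both indirect call targets are known.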
I find it puzzling that invoking t0 directly in main doesn't get optimized, while invoking it in t2 is OK.
Is there a reason why invoking t0 does not produce the same assembly as t1 or t2? Or is this a missed optimization opportunity by both g++ (trunk) and clang++ (trunk)?