For my compiler ( clang++ v2.9 on OS X ) compiling this similar but not identical code:
void foo();
void bar();
template<int N>
void do_something( int arg ) {
if ( N<0 && arg<0 ) { foo(); }
else { bar(); }
}
// Some functions to instantiate the templates.
void one_fn(int arg) {
do_something<1>(arg);
}
void neg_one_fn(int arg) {
do_something<-1>(arg);
}
This generates the following assembly with clang++ -S -O3
.
one_fn = do_something<1>
The first functions assembly clearly only has the call to bar
.
.globl __Z6one_fni
.align 4, 0x90
__Z6one_fni: ## @_Z6one_fni
Leh_func_begin0:
pushl %ebp
movl %esp, %ebp
popl %ebp
jmp __Z3barv ## TAILCALL
Leh_func_end0:
neg_one_fn = do_something<-1>
The second function has been reduced to a simple if to call either bar
or foo
.
.globl __Z10neg_one_fni
.align 4, 0x90
__Z10neg_one_fni: ## @_Z10neg_one_fni
Leh_func_begin1:
pushl %ebp
movl %esp, %ebp
cmpl $0, 8(%ebp)
jns LBB1_2 ## %if.else.i
popl %ebp
jmp __Z3foov ## TAILCALL
LBB1_2: ## %if.else.i
popl %ebp
jmp __Z3barv ## TAILCALL
Leh_func_end1:
Summary
So you can see that the compiler inlined the template, then optimised away the branch when it could. So the kind of transformation you are hoping for does occur in current compilers. I got similar results (but less clear assembly) from an old g++ 4.0.1 compiler too.
Addendum:
I decided this example wasn't quite similar enough to your initial case (as it didnt' involve the ternary operator) so I changed it to this: (Getting the same kind of results)
template<int X>
void do_something_else( int _ncx ) {
static const int _cx = (X<0) ? (-X) : (X);
if ( (X < 0) ? (_cx > 5) : (_ncx > 5)) {
foo();
} else {
bar();
}
}
void a(int arg) {
do_something_else<1>(arg);
}
void b(int arg) {
do_something_else<-1>(arg);
}
This generates the assembly
a() = do_something_else<1>
This still contains the branch.
__Z1ai: ## @_Z1ai
Leh_func_begin2:
pushl %ebp
movl %esp, %ebp
cmpl $6, 8(%ebp)
jl LBB2_2 ## %if.then.i
popl %ebp
jmp __Z3foov ## TAILCALL
LBB2_2: ## %if.else.i
popl %ebp
jmp __Z3barv ## TAILCALL
Leh_func_end2:
b() = do_something_else<-1>
Branch is optimised away.
__Z1bi: ## @_Z1bi
Leh_func_begin3:
pushl %ebp
movl %esp, %ebp
popl %ebp
jmp __Z3barv ## TAILCALL
Leh_func_end3: