Can c++ compiler optimization generate two versions of a function with respect to a boolean argument?

Question

I have a function f(int x, float y, char* z, .., bool b). The argument b is used only in the fashion of:

if (b) {
 ...
} else {
 ...
}

in various parts of the function body. For efficiency reason, I would like to effectively create two functions f0 and f1, where b is set to false and true respectively, to avoid evaluating the conditional at run time. At the same time since the implementation of f is fairly long, I don't want to explicitly define f0 and f1 separately. Is there any compiler optimization feature that automatically spawns these two branch functions at compile time?

Maybe there are better design patterns that avoid this line of thinking completely? Keep in mind that the conditional b can be evaluated in a massive loop.

Yes, this is possible. [It's called _function cloning_.](https://stackoverflow.com/questions/15439890/influencing-function-cloning-duplication-constant-propagation-in-gcc) — Iwillnotexist Idonotexist, Dec 28 '18 at 06:16
Can you sort you "massive loop" so there is as small number of true/false switches as possible? If not yet - try that and profile before/after - there is a good chance you'll gain your performance by just making CPU prediction happier. — Alexei Levenkov, Dec 28 '18 at 06:17
You should not care about the functions at the machine code level. Think in terms of source code. Trust your compiler for such optimizations (which might not be worthwhile) — Basile Starynkevitch, Dec 28 '18 at 06:51
@BasileStarynkevitch completely agree ... https://stackoverflow.com/questions/11227809/why-is-it-faster-to-process-a-sorted-array-than-an-unsorted-array assuming OP uses decent compiler, if not - worth a try. At very least they may try to profile code :) — Alexei Levenkov, Dec 28 '18 at 06:54
Why don't you want to explicitly define f0 and f1 explicitly? Using length of the function as a justification seems more like laziness (not wanting to tease the two parts of the function apart) than anything else, since both f0 and f1 will be obtained by removing code from the original function, not adding new code. — Peter, Dec 28 '18 at 07:02
@BasileStarynkevitch Yes in many cases I can avoid the pathological request, but it does arise at times when the boolean gets used many times in a loop. I am working on a very basic piece of code that gets used repeatedly, so optimization does seem worthwhile and not premature. — John Jiang, Dec 28 '18 at 07:12
Again, trust your compiler. Perhaps, choose a better compiler, or just increase the optimization levels. Your developer's time costs more than the nanoseconds you are trying to win. And you could be wrong in believing that your transformation is always more efficient. When `f` is inlined, it does not matter at all — Basile Starynkevitch, Dec 28 '18 at 07:13
@IwillnotexistIdonotexist Thanks! That looks exactly what's needed. Would be nice to have compiler flag that indicates whether cloning failed. — John Jiang, Dec 28 '18 at 07:15
@BasileStarynkevitch Advice heeded. It's more of a curiosity thing, as I ran into this several times in the past. — John Jiang, Dec 28 '18 at 07:18

Sid S · Answer 1 · 2018-12-28T06:57:38.557

2

Use a template:

template<bool b>
void f(int x, float y, char *z)
{
    if (b)
    {
        ...
    }
    else
    {
        ...
    }
}

...
if (runtimeCondition)
{
    f<true>(1, 2, "");
}
else
{
    f<false>(1, 2, "");
}

edited Dec 28 '18 at 06:57

answered Dec 28 '18 at 06:18

Sid S

6,037
2
18
24

Templates don't work for parameters known only at runtime. – Mark Ransom Dec 28 '18 at 06:43
@SidS what if the if else gets called a million times in f? Templating would still require evaluating them all, right? – John Jiang Dec 28 '18 at 07:14
@user2930156, If you're using a modern compiler, there is no reason for the test being carried out more than once. The template will generate two functions, one for the case where `b` is `true` and one for the case where `b` is `false`. – Sid S Dec 28 '18 at 07:24
@SidS very interesting. Thanks for the info! – John Jiang Dec 28 '18 at 07:24
It is not guaranty that condition is not evaluated at runtime though, even if correct compilers should avoid the branching. – Jarod42 Dec 28 '18 at 09:20

Jarod42 · Accepted Answer · 2018-12-28T18:54:21.110

Template allows to factorize this kind of code, and C++17 allows it cleanly with if constexpr:

template<bool b>
void f(int x, float y, char *z)
{
    // ...
    if constexpr (b)
    {
        // ...
    }
    else
    {
        // ...
    }
    // ...
}

// if you still need runtime dispatch
void f(int x, float y, char *z, bool b)
{
    return b ? f<true>(x, y, z) : f<false>(x, y, z);
}

Without if constexpr, it is not guaranty that there are no branch at runtime, but compiler might easily do it normally. So if you want that guaranty pre-C++17, you have to specialize divergent part

by specialization:

template <bool b> void f_impl(..);

template <> void f_impl<true>(..)  { /*..*/ }
template <> void f_impl<false>(..) { /*..*/ }

template<bool b>
void f(int x, float y, char *z)
{
    // ...
    f_impl<b>(..);
    // ...
}

or tag dispatching:

void f_impl(std::true_type, ..)  { /*..*/ }
void f_impl(std::false_type, ..) { /*..*/ }

template<bool b>
void f(int x, float y, char *z)
{
    // ...
    f_impl(std::integral_constant<bool, b>{}..); // std::bool_constant<b> in C++17 
    // ...
}

Can c++ compiler optimization generate two versions of a function with respect to a boolean argument?

2 Answers2