79

I'm currently working on a project and I have the following issue.

I have a C++ method that I want to work in two different ways:

void MyFunction()
{
  foo();
  bar();
  foobar();
}

void MyFunctionWithABonus()
{
  foo();
  bar();
  doBonusStuff();
  foobar();
}

I would like to avoid duplicating the code, because the actual function is much longer. The catch is that calling MyFunction instead of MyFunctionWithABonus must not, under any circumstances, add execution time to the program. That is why I cannot simply add a boolean parameter and test it at run time.

My idea was to use C++ templates to virtually duplicate the code, but I can't think of a way to do it that neither adds execution time nor forces me to duplicate the code.

I'm not an expert with templates so I may be missing something.

Does anyone have an idea? Or is this simply impossible in C++11?

davnicwil
plougue
  • 64
    May I ask *why* you cannot simply add a boolean check? If there is a lot of code in there, the overhead of a simple boolean check will be negligible. – Joris Apr 25 '17 at 08:47
  • The function is going to be called many many times in a context in which performance is very important. Basically the doBonusStuff() method will have to be called in a debugging context and the goal is not to add any runtime in a non-debugging context. – plougue Apr 25 '17 at 10:25
  • 39
    @plougue Branch prediction is very good nowadays, to the point that a boolean check often takes 0 processor cycles to execute. – Dan Apr 25 '17 at 11:24
  • 4
    Agree with @Dan . Branch prediction takes *almost* zero overhead these days especially if you are entering a particular branch a large number of times. – Akshay Arora Apr 25 '17 at 14:13
  • 6
    @Dan: A compare-and-branch is still at best one macro-fused uop (on modern Intel and AMD [x86](http://stackoverflow.com/tags/x86/info) CPUs), not zero. Depending on what the bottleneck is in your code, decoding / issuing / executing this uop could steal a cycle from something else, the same way an extra ADD instruction could. Also, just passing the boolean parameter, and having it tie up a register (or have to be spilled/reloaded) is a non-zero number of instructions. Hopefully this function inlines so the call and arg-passing overhead isn't there every time, and maybe cmp+branch, but still – Peter Cordes Apr 25 '17 at 17:13
  • If you're lucky, the compiler clones the loop and makes two versions for you, just like if you had done it in the source. But if you want to help the compiler see what you want it to do, you should do it in the source. Otherwise a future compiler version might not optimize the same way, and suddenly your code is slower. (You can sometimes get significant gains by hand-holding the compiler into generating better asm for basically the same logic, [on a micro-level](http://stackoverflow.com/a/40355466/224132) as well as stuff like cloning functions with template params instead of args.) – Peter Cordes Apr 25 '17 at 17:18
  • 15
    Did you write the code in the easy-to-maintain format first? Then did your profiler say that the branch was the bottleneck? Do you have data to suggest the time you're spending on this minor decision is the best use of your time? – GManNickG Apr 25 '17 at 22:36
  • 3
    I'm very interested in hearing exactly how much that extra branch costs. – Sebastian Wahl Apr 26 '17 at 04:16
  • 1
    These look like they should be classes or structs... – takintoolong Apr 27 '17 at 03:44

6 Answers

129

Something like this will do nicely:

template<bool bonus = false>
void MyFunction()
{
  foo();
  bar();
  if (bonus) { doBonusStuff(); }
  foobar();
}

Call it via:

MyFunction<true>();
MyFunction<false>();
MyFunction(); // Calls MyFunction<false>, since false is the default template argument

The "ugly" template syntax can be avoided entirely by adding some thin wrapper functions:

void MyFunctionAlone() { MyFunction<false>(); }
void MyFunctionBonus() { MyFunction<true>(); }
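
To see the two instantiations side by side, here is a minimal self-contained sketch. The `trace` string, the `record` helper and the bodies of `foo`, `bar`, `doBonusStuff` and `foobar` are placeholders invented for illustration; only the template and the two wrappers come from the code above.

```cpp
#include <string>

// Hypothetical stand-ins: each step appends its name to a trace string.
std::string trace;
void foo()          { trace += "foo "; }
void bar()          { trace += "bar "; }
void doBonusStuff() { trace += "bonus "; }
void foobar()       { trace += "foobar "; }

template <bool bonus = false>
void MyFunction()
{
  foo();
  bar();
  // `bonus` is a compile-time constant, so an optimizing compiler can
  // drop both the test and the call from MyFunction<false>.
  if (bonus) { doBonusStuff(); }
  foobar();
}

void MyFunctionAlone() { MyFunction<false>(); }
void MyFunctionBonus() { MyFunction<true>(); }

// Hypothetical helper: runs `f` on an empty trace and returns what it recorded.
std::string record(void (*f)())
{
  trace.clear();
  f();
  return trace;
}
```

With these placeholders, `record(&MyFunctionAlone)` returns `"foo bar foobar "` and `record(&MyFunctionBonus)` returns `"foo bar bonus foobar "`. Whether the branch is actually removed depends on the compiler and optimization level, so it is worth checking the generated assembly if it matters.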

You can find more information on this technique there. It is an "old" paper, but the technique itself remains perfectly valid.

If you have access to a C++17 compiler, you can push the technique even further by using if constexpr, like this:

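template <int bonus>
auto MyFunction() {
  foo();
  bar();
  if      constexpr (bonus == 0) { doBonusStuff1(); }
  else if constexpr (bonus == 1) { doBonusStuff2(); }
  else if constexpr (bonus == 2) { doBonusStuff3(); }
  else if constexpr (bonus == 3) { doBonusStuff4(); }
  // Guarantee that this function will not compile
  // if a bonus other than 0, 1, 2 or 3 is passed.
  // The condition depends on `bonus`, so it is only checked when this
  // branch is instantiated; many pre-C++23 compilers reject a plain
  // static_assert(false) even in a discarded branch.
  else { static_assert(bonus >= 0 && bonus <= 3, "bonus must be 0, 1, 2 or 3"); }
  foobar();
}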
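
The template parameter does not have to be a `bool` or an `int`; a scoped enum works too and reads better at the call site. Below is a C++11 sketch of that idea. The `Bonus` enum, its enumerators, `logTimings`, `dumpState` and the `trace` string are all made-up names for illustration. Note that with a plain `if` (no `if constexpr`), every branch must still compile for every mode.

```cpp
#include <string>

// Hypothetical stand-ins that record which steps ran.
std::string trace;
void foo()        { trace += "foo "; }
void bar()        { trace += "bar "; }
void foobar()     { trace += "foobar "; }
void logTimings() { trace += "timings "; }
void dumpState()  { trace += "state "; }

// Hypothetical bonus modes; the names are invented for this sketch.
enum class Bonus { None, Timings, State };

template <Bonus mode = Bonus::None>
void MyFunction()
{
  // Reject values that were cast to Bonus from an out-of-range integer.
  static_assert(mode == Bonus::None || mode == Bonus::Timings ||
                mode == Bonus::State, "unknown Bonus mode");
  foo();
  bar();
  // `mode` is a compile-time constant, so these tests can be folded away.
  if (mode == Bonus::Timings)    { logTimings(); }
  else if (mode == Bonus::State) { dumpState(); }
  foobar();
}
```

A call then looks like `MyFunction<Bonus::Timings>();`, and a bare `MyFunction();` selects `Bonus::None`.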
  • 11
    And that check will be nicely optimized away by the compiler – Jonas Apr 25 '17 at 08:40
  • 22
    And [in C++17](https://wandbox.org/permlink/0NW2N0nvuZNKWOeG) `if constexpr (bonus) { doBonusStuff(); }`. – Chris Drew Apr 25 '17 at 08:41
  • @Gibet probably nothing but documentation in this case but I think that is still valuable. – Chris Drew Apr 25 '17 at 08:48
  • 13
    @Gibet: If the call to `doBonusStuff()` cannot even compile for some reason in the non-bonus case, it'll make a huge difference. – Lightness Races in Orbit Apr 25 '17 at 09:44
  • Can you do the same but use enums? They are much more readable most of the time because they convey meaning with their name (instead of 0) – WorldSEnder Apr 25 '17 at 09:51
  • Silly question but: Why do you need the constexpr in the second version? – Michael Apr 25 '17 at 10:10
  • @Gibet: You should look up what `if constexpr` does. – Lightness Races in Orbit Apr 25 '17 at 10:15
  • @Jonas Are you using a modern compiler? icc? gcc? clang? msvc? If yes, it will be optimized away. If no, then check. – Yakk - Adam Nevraumont Apr 25 '17 at 13:54
  • 1
    @Gibet For your C++17 example, it would be better if at the end of the `if constexpr` chain, you add a `else { static_assert(false);}`. This way it cannot be templated with unintended values. – KevinZ Apr 26 '17 at 15:49
  • 1
    I think the first part of the answer might add inline function definitions giving separate names to `MyFunction<true>` and to `MyFunction<false>`, so that the caller is freed from the need to provide explicit template arguments and can indeed forget that she is calling a function template at all. – Marc van Leeuwen Apr 27 '17 at 09:24
55

With a template and a lambda, you can do:

template <typename F>
void common(F f)
{
  foo();
  bar();
  f();
  foobar();
}

void MyFunction()
{
    common([](){});
}

void MyFunctionWithABonus()
{
  common(&doBonusStuff);
}
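
Because `common` accepts any callable, the injected step can also carry state through a lambda capture. The following is only a sketch: `MyFunctionCounted`, the `calls` counter, the `trace` string and the bodies of `foo`, `bar` and `foobar` are hypothetical and exist just to make the example self-contained.

```cpp
#include <string>

// Hypothetical stand-ins that record which steps ran.
std::string trace;
void foo()    { trace += "foo "; }
void bar()    { trace += "bar "; }
void foobar() { trace += "foobar "; }

template <typename F>
void common(F f)
{
  foo();
  bar();
  f();  // the injected step; its type is known at compile time
  foobar();
}

void MyFunction()
{
  common([]() {});  // empty lambda: nothing is left to call once inlined
}

// Hypothetical variant: the bonus step counts how often it ran.
void MyFunctionCounted(int& calls)
{
  common([&calls]() { ++calls; trace += "bonus "; });
}
```

Each lambda has its own type, so `common` is instantiated once per call site, which is what gives the compiler the chance to inline the injected step.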

Or you can simply create prefix and suffix functions:

void prefix()
{
  foo();
  bar();
}

void suffix()
{
    foobar();
}

void MyFunction()
{
    prefix();
    suffix();
}

void MyFunctionWithABonus()
{
    prefix();
    doBonusStuff();
    suffix();
}
Jarod42
  • 13
    I actually prefer these two solutions over a boolean parameter (template or otherwise), regardless of any execution time advantages. I dislike boolean parameters. – Chris Drew Apr 25 '17 at 08:56
  • 2
    From my understanding the second solution will have additional runtime due to additional function call. Is this the case for the first one ? I'm not sure how lambdas work in that case – plougue Apr 25 '17 at 09:54
  • 10
    If definitions are visible, compiler would probably inline code and generate same code as the one generated for your original code. – Jarod42 Apr 25 '17 at 10:17
  • @ChrisDrew The problem is that the code no longer exists where it will run. If you have ever done per-pixel operations this way, the advantages are quite large to have code where it runs. – Yakk - Adam Nevraumont Apr 25 '17 at 13:55
  • 1
    @Yakk I think it will depend on the particular use case and whose responsibility the "bonus stuff" is. Often I find having bool parameters, ifs and bonus stuff in amongst the main algorithm makes it harder to read and would prefer it to "no longer exist" and be encapsulated and injected in from elsewhere. But I think the question of when it is appropriate to use the Strategy Pattern is probably beyond the scope of this question. – Chris Drew Apr 25 '17 at 14:26
  • @plougue [With a bit of work](http://coliru.stacked-crooked.com/a/f922b8e895409be1), you can rewrite the second solution to [use tail call optimisation](https://godbolt.org/g/fe4OZp), which would speed it up a little. [Note that this can also be done without using a `bool` template parameter, I used one for convenience. I also don't believe `return` chaining like that is strictly necessary, but it helps communicate intent, and prevents you from accidentally changing one part's return type without changing the others to match.] – Justin Time - Reinstate Monica Apr 25 '17 at 21:02
  • 2
    Tail call optimization usually matters when you want to optimize recursive cases. In this case, plain inlining... does everything you need. – Yakk - Adam Nevraumont Apr 26 '17 at 02:15
27

Given some of the comments the OP has made regarding debugging, here's a version that calls doBonusStuff() for debug builds, but not release builds (that define NDEBUG):

#if defined(NDEBUG)
#define DEBUG(x)
#else
#define DEBUG(x) x
#endif

void MyFunctionWithABonus()
{
  foo();
  bar();
  DEBUG(doBonusStuff());
  foobar();
}

You can also use the assert macro if you wish to check a condition and fail if it is false (but only for debug builds; release builds will not perform the check).

Be careful if doBonusStuff() has side effects, as these side effects will not be present in release builds and may invalidate assumptions made in the code.
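
To make the side-effect caveat concrete, here is a small sketch. The names `checked`, `validate`, `process` and `expectedChecks` are hypothetical; the `DEBUG` macro is the one defined above, and `assert` is the standard macro from `<cassert>`.

```cpp
#include <cassert>

#if defined(NDEBUG)
#define DEBUG(x)
#else
#define DEBUG(x) x
#endif

int checked = 0;  // hypothetical counter, incremented as a side effect

// Hypothetical check that also has a side effect.
bool validate(int value) { ++checked; return value >= 0; }

int process(int value)
{
  // Hazard: validate() runs only in debug builds, so `checked`
  // is incremented only there.
  DEBUG(validate(value));
  // The same applies to assert: its argument is not evaluated
  // when NDEBUG is defined.
  assert(value >= 0);
  return value * 2;
}

// Number of validate() calls a single process() call makes in this build.
int expectedChecks()
{
#if defined(NDEBUG)
  return 0;
#else
  return 1;
#endif
}
```

Any code that later relies on `checked` having been updated would behave differently between debug and release builds, which is exactly the kind of assumption the warning above is about.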

Cornstalks
  • The warning about side effects is good, but it is also true no matter what construct is used, be it templates, if(){...}, constexpr, etc. – pipe Apr 27 '17 at 00:43
  • Given the OP comments, I upvoted this myself because it's exactly the best solution for them. That said, just a curiosity: why all the complications with the new defines and everything, when you can just put the doBonusStuff() call inside a #if defined(NDEBUG) ?? – motoDrizzt Apr 27 '17 at 08:52
  • @motoDrizzt: If the OP wants to do this same thing in other functions, I find introducing a new macro like this cleaner/easier to read (and write). If it's just a one-time thing, then I agree just using `#if defined(NDEBUG)` directly is probably easier. – Cornstalks Apr 27 '17 at 11:45
  • @Cornstalks yep, it totally makes sense, I didn't think about it that way. And I still think this should be the accepted answer :-) – motoDrizzt Apr 27 '17 at 11:50
18

Here is a slight variation on Jarod42's answer using variadic templates so the caller can provide zero or one bonus functions:

#include <utility>  // std::forward

void callBonus() {}

template<typename F>
void callBonus(F&& f) { f(); }

template <typename ...F>
void MyFunction(F&&... f)
{
  foo();
  bar();
  callBonus(std::forward<F>(f)...);
  foobar();
}

Calling code:

MyFunction();
MyFunction(&doBonusStuff);
Chris Drew
11

Another version, using only templates and no redirecting functions, since you said you didn't want any runtime overhead. As far as I can tell, this only increases compile time:

#include <iostream>

using namespace std;

void foo() { cout << "foo\n"; }
void bar() { cout << "bar\n"; }
void bak() { cout << "bak\n"; }

template <bool = false>
void bonus() {}

template <>
void bonus<true>()
{
    cout << "Doing bonus\n";
}

template <bool withBonus = false>
void MyFunc()
{
    foo();
    bar();
    bonus<withBonus>();
    bak();
}

int main(int argc, const char* argv[])
{
    MyFunc();
    cout << "\n";
    MyFunc<true>();
}

output:
foo
bar
bak

foo
bar
Doing bonus
bak

There's now only one version of MyFunc() with the bool parameter as a template argument.

Sebastian Stern
  • Doesn't it add compile time by calling bonus() ? Or does the compiler detect that bonus is empty and doesn't run the function call ? – plougue Apr 25 '17 at 09:50
  • 1
    `bonus<false>()` invokes the default version of the `bonus` template (lines 9 and 10 of the example), so there is no function call. To put it another way, `MyFunc()` compiles to one block of code (with no conditionals in it) and `MyFunc<true>()` compiles to a different block of code (with no conditionals in it). – David K Apr 25 '17 at 13:01
  • 6
    @plougue templates are implicitly inline, and inlined empty functions don't do anything and can be eliminated by the compiler. – Yakk - Adam Nevraumont Apr 25 '17 at 13:56
8

You can use tag dispatching and simple function overload:

struct Tag_EnableBonus {};
struct Tag_DisableBonus {};

void doBonusStuff(Tag_DisableBonus) {}

void doBonusStuff(Tag_EnableBonus)
{
    //Do bonus stuff here
}

template<class Tag> void MyFunction(Tag bonus_tag)
{
   foo();
   bar();
   doBonusStuff(bonus_tag);
   foobar();
}

This is easy to read/understand, can be expanded with no sweat (and no boilerplate if clauses - by adding more tags), and of course will leave no runtime footprint.

The calling syntax is quite friendly as it is, but of course it can be wrapped into vanilla calls:

void MyFunctionAlone() { MyFunction(Tag_DisableBonus{}); }
void MyFunctionBonus() { MyFunction(Tag_EnableBonus{}); }

Tag dispatching is a widely used generic programming technique, here is a nice post about the basics.
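
If you would rather not define your own tag types, the standard library's `std::true_type` and `std::false_type` can play the same role. The sketch below swaps them in for `Tag_EnableBonus` and `Tag_DisableBonus` and selects the tag from a `bool` template parameter; the `trace` string and the function bodies are placeholders invented for illustration.

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-ins that record which steps ran.
std::string trace;
void foo()    { trace += "foo "; }
void bar()    { trace += "bar "; }
void foobar() { trace += "foobar "; }

// The overload is chosen at compile time from the tag's type.
void doBonusStuff(std::false_type) {}
void doBonusStuff(std::true_type) { trace += "bonus "; }

template <bool enable>
void MyFunction()
{
  foo();
  bar();
  // std::integral_constant<bool, true> is std::true_type, and
  // std::integral_constant<bool, false> is std::false_type.
  doBonusStuff(std::integral_constant<bool, enable>{});
  foobar();
}
```

Custom tag structs remain the better choice when there are more than two variants or when a descriptive tag name helps readability.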

Ap31