Overhead of calling std::function::operator()

Question

I have a use case where, based on the value of a const member variable, I switch between two different code paths:

void foo(){
  if(member_var_){ // const, initialized at object creation
     a();
  } else {
     b();
  }
}

This foo() is called many, many times in my code, so I was wondering whether, instead of branching on member_var_ at runtime, I could eliminate this step by choosing the action once at object construction:

std::function<void()> action_; // private member; foo() just calls action_()
// Constructor
if(member_var_){
   action_ = a;
} else {
   action_ = b;
}

However, I see a 7x performance drop in my benchmark. I was expecting some performance loss due to the indirection (extra function call overhead and some code cache misses), but 7x is a little surprising. I couldn't find anything alarming in the source code; maybe someone has insight on this? Also, is there no better solution than dynamically checking the bool variable over and over?
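
For reference, the two variants described above can be written out as a self-contained sketch. It assumes `a()` and `b()` are non-static member functions, which is why they are wrapped in lambdas capturing `this` before being stored in the `std::function`. The class name `Widget`, the call counters, and the helper `count_a_calls` are hypothetical and exist only to make the behavior checkable; this is not the original benchmark code.

```cpp
#include <cassert>
#include <functional>

// Hypothetical class: the counters stand in for the real work of a() and b().
class Widget {
public:
    explicit Widget(bool flag) : member_var_(flag) {
        // Variant 2 setup: pick the target once, at construction.
        // Assumption: a() and b() are member functions, so each is
        // wrapped in a lambda that captures `this`.
        if (member_var_) {
            action_ = [this] { a(); };
        } else {
            action_ = [this] { b(); };
        }
    }

    // Variant 1: branch on the const member on every call.
    void foo_branch() {
        if (member_var_) {
            a();
        } else {
            b();
        }
    }

    // Variant 2: indirect call through std::function.
    void foo_function() { action_(); }

    int a_calls = 0;
    int b_calls = 0;

private:
    void a() { ++a_calls; }
    void b() { ++b_calls; }

    const bool member_var_;
    std::function<void()> action_;
};

// Calls each variant n times and returns how often a() ran.
int count_a_calls(bool flag, int n) {
    Widget w(flag);
    for (int i = 0; i < n; ++i) {
        w.foo_branch();
        w.foo_function();
    }
    assert(w.a_calls + w.b_calls == 2 * n);  // every call reached a() or b()
    return w.a_calls;
}
```

Both variants reach the same function for a given flag; they differ only in how the call is dispatched. Note that the lambdas capture `this`, so a copied `Widget` would still call into the original object unless the copy operations are handled explicitly.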

Your image shows "optim = None". That means you're benchmarking `gcc -O0` anti-optimized debug mode code, which is useless and tells you nothing about the cost in a real program. C++ library template functions need optimization to expand and inline. See [Idiomatic way of performance evaluation?](https://stackoverflow.com/q/60291987) for basics and [C loop optimization help for final assignment (with compiler optimization disabled)](https://stackoverflow.com/a/32001196) and [this](https://stackoverflow.com/q/53366394) for details on why `-O0` hugely distorts things — Peter Cordes, Aug 27 '20 at 01:04
If `a()` and `b()` are declared with the same signature, then a simple member pointer will have less overhead than a `std::function`, e.g.: `void (MyClass::*action_)();` (as a private member) ... `MyClass() { ... action_ = (member_var_) ? &MyClass::a : &MyClass::b; ... }` ... `void foo(){ (this->*action_)(); }` — Remy Lebeau, Aug 27 '20 at 01:39
Also, what are `a()` and `b()`? If they're both simple, they could inline into the `if` and if-conversion to branchless could simplify the whole thing to remove a hard-to-predict conditional branch. Or optimize away if you don't use the result. (Of course this will only happen with optimization enabled, but it could make it cheaper than predicting an actual indirect branch through a function pointer, which probably won't optimize away because it's set separately from the function using it. Plus it would defeat inlining in general) — Peter Cordes, Aug 27 '20 at 02:00
@PeterCordes, I've updated the question with the optimization flag, but the truth is that whenever I turn optimization on, at whatever level, I only see a greater performance gap — watashiSHUN, Aug 27 '20 at 17:30
`-O1` is only partial optimization; it doesn't include function inlining, IIRC. Use at least `-O2`, preferably `-O3`. If the results are still interesting with normal optimization levels, I'd be happy to reverse my downvote. — Peter Cordes, Aug 27 '20 at 20:27
In my dummy test, `a()` and `b()` are both no-ops; turning on `-O3` yields [this](https://quick-bench.com/q/iC4XJ4bj8WCQYPmUxrfIGI28BKc) — watashiSHUN, Aug 27 '20 at 20:44
Well there you go, you've proved that `std::function` is defeating full optimization and removal of a do-nothing loop for that trivial case. But that the other ways do still allow optimization when the compiler can prove it does nothing. So from this we can conclude that `std::function` has potential real overhead, and the tradeoff will depend on the details of what you're branching between. A compare/branch is generally as cheap as a memory-indirect branch, if not cheaper, if they both compile to asm that mostly works as written. (plus extra overhead on std::function for func signature) — Peter Cordes, Aug 28 '20 at 01:37
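
The member-pointer alternative suggested in the comments can be sketched as a compilable class. This assumes `a()` and `b()` are non-static member functions with the signature `void()`. The call counters and the helper `calls_to_a` are hypothetical additions for checking behavior, and no performance claim is implied by this example.

```cpp
#include <cassert>

class MyClass {
public:
    // member_var_ is declared before action_, so it is already
    // initialized when action_ is selected here.
    explicit MyClass(bool flag)
        : member_var_(flag),
          action_(member_var_ ? &MyClass::a : &MyClass::b) {}

    // Indirect call through the pointer to member function.
    void foo() { (this->*action_)(); }

    int a_calls = 0;  // hypothetical counters standing in for real work
    int b_calls = 0;

private:
    void a() { ++a_calls; }
    void b() { ++b_calls; }

    const bool member_var_;
    void (MyClass::*action_)();  // pointer to a member function taking no arguments
};

// Calls foo() n times and returns how often a() ran.
int calls_to_a(bool flag, int n) {
    MyClass obj(flag);
    for (int i = 0; i < n; ++i) {
        obj.foo();
    }
    assert(obj.a_calls + obj.b_calls == n);  // every call reached a() or b()
    return obj.a_calls;
}
```

Unlike a lambda that captures `this`, the member pointer is not bound to a particular object, so a copied `MyClass` dispatches on itself. It is still an indirect call, so whether it beats the plain `if` depends on what `a()` and `b()` do and on whether the compiler can inline them.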

0 Answers