Avoiding conditionals and function invocations inside a loop

Question

I have a code that looks like this:

void function(int parameter)
{
  for( ... ) // a big loop
  {
    double a = ...;
    for( ... ) // a big loop
    {
      double b = ...;

      double value;
      if(parameter == 1)
        value = some_math_expression_1(a, b);
      else if(parameter == 2)
        value = some_math_expression_2(a, b);
      ...
    }
  }
}

The idea is that depending on the parameter I want to apply some math expression to a and b. This function is executed many times and must be fast, and I wonder if those conditional branches at each iteration can introduce an overhead that I could save.

Right now, I have written the code like this:

void function(int parameter)
{
  if(parameter == 1)
    function1();
  else if(parameter == 2)
    function2();
  else
    ...
}

So that I can apply the math expression directly if I repeat the code in every functionX(). The obvious problem is that when I want to change some piece of code I have to do it several times (I have around 10 math expressions now).

What approach could I use to avoid any overhead in function?

What if I pass a pointer to functions some_math_expression_X to function (I would change the conditionals for function invocations)?

What if I code the whole function as a macro (uf) and set the math expression as a parameter?

What if I use a template and pass the math expression as a pointer to an inline function (is this even possible)?

EDIT: Thank you for your answers. I know I can use the methods you are proposing (pointers to / array of functions, or relying on the branch predictor). However, do you have some insight about what would be better in terms of avoiding overhead? The math expressions are quite simple (something like a*b), and in addition to the loops that are long, function is also called many times (do branch predictions "survive" between calls?).

score 5 · Accepted Answer · answered Jul 12 '13 at 10:33

You can convert the function into a template:

void functionT<int PARAMETER>()
{
  for( ... ) // a big loop
  {
    double a = ...;
    for( ... ) // a big loop
    {
      double b = ...;

      double value;
      if(PARAMETER == 1) //Constant condition!!!
        value = some_math_expression_1(a, b);
      else if(PARAMETER == 2)  //Constant condition!!!
        value = some_math_expression_2(a, b);
      ...
    }
  }
}

Since the conditions are always true or always false, the compiler will optimize the condition tree away and leave only the real math expression. No branches and no function calls!

Now, you can use it only with constant parameters:

functionT<1>();

But not with variables:

int x = 1;
functionT<x>(); //Error

If you need that, you can make a wrapper:

void function(int parameter)
{
    switch (parameter)
    {
        case 1: functionT<1>(); break;
        case 2: functionT<2>(); break;
    }
}

Great answer, @rodrigo. I was reading a similar solution in this other thread, that may be useful for someone else: http://stackoverflow.com/questions/16871471/avoiding-if-statement-inside-a-for-loop — ChronoTrigger, Jul 12 '13 at 10:44

score 3 · Answer 2 · answered Jul 12 '13 at 10:11

3

Don't worry. Modern CPU's have branch predictors, and they will correctly predict the branch taken.

answered Jul 12 '13 at 10:11

MSalters

173,980
10
155
350

And it evidently seems to make a huge difference when they try to, but fail often :) – chris Jul 12 '13 at 10:22
2

`Modern CPU's have branch predictors` is true. `and they will correctly predicty` is almost certainly not true, especially for a random, 50:50 probability branch. – Salgar Jul 12 '13 at 10:25
1

@Salgar: But that is entirely not the case here. The branch is fully predictable by `parameter`, which doesn't change throughout the nested loops. If it's (only) a thousand iterations, you'll already hit 99%+ correct predictions. – MSalters Jul 12 '13 at 10:30
That is true apologies, I was looking at `function()` which moved the loop out. – Salgar Jul 12 '13 at 10:32
If I call `function()` several times, with the same parameter, does the branch predictor have to start from scratch to make predictions? – ChronoTrigger Jul 12 '13 at 10:38
1

@ChronoTrigger, generally speaking no. The branch predictor only works at the instruction level. If there is an entry in the history for a particular branch instruction that entry will be used. If there's no entry, the predictor will guess based on certain heuristics: Forward conditional branches are NOT taken (step into IF/FOR statements), Backward conditional branches are taken (always loop for WHILE statements), take unconditional branches (always call functions, follow goto statements). – Jul 12 '13 at 11:04

score 1 · Answer 3 · answered Jul 12 '13 at 10:11

You could set up a constant array of function pointers, and call the function associated with parameter.

But if the math expressions are rather small, a switch() statement could be even faster.

switch (parameter) {
    case 1:
        value = math expression 1;
        break;
    case 2:
        ...
}

score 1 · Answer 4 · answered Jul 12 '13 at 10:12

Firstly, I would as always say that you should Benchmark/Measure how long this process takes right now, because as always, this could be premature optimization, and you might find out this isn't the part of your code that is taking a long time.

But assuming you have measured and found that this is the bottleneck in your code, there are a few things that I would do.

Firstly, the thing that is going to kill you most here (provided your math functions are sufficiently simple) is the branch prediction, as you have said. So to get rid of the branching I would create an array of function pointers, and instead of doing

if(parameter == 1)
    function1();
if...

You can just do:

array_of_functions[parameter]();

This will get rid of all branch predicition and will increase the throughput greatly because your pipeline will not have to be flushed. The compiler also should be able to inline the functions.

score 0 · Answer 5 · edited May 23 '17 at 12:05

It depends on many things but in general you may want to make it so that most of the time either the first or the second function is called consecutively. This would make modern CPU execute this quite faster (see Why is it faster to process a sorted array than an unsorted array?).

You could use array and function pointers but that may not speed up speed, need to try. You can use http://www.boost.org/doc/libs/1_54_0/doc/html/function/tutorial.html#idp59212272 to help but you don't need it for static functions.

score 0 · Answer 6 · answered Jul 12 '13 at 10:26

I think one of the most efficient ways to do this is to create an array of function pointers, and then you can directly pass the function pointer instead of just the parameter. This would save any kind of overhead that you would incur using if/switch statement in a nested loop.

As an example:

double expression_0(double a, double b) {...};
double expression_1(double a, double b) {...};

void function(double (*expression)(double, double)) {
    for (...) {
        ...
        double a = ...;
        for (...) {
            double b = ...;
            double result = (*expression)(a, b);
        }
    }
}

int main() {
    double (*fpointers[2]) (double, double);
    fpointers[0] = expression_0;
    fpointers[1] = expression_1;

    int parameter = ...;
    function(fpointers[parameter]);
}

Should the function invocation be cheaper than the conditionals? — ChronoTrigger, Jul 12 '13 at 10:39
I'm certain that at the assembly level, this should be at least one instruction more efficient than making correct branch prediction every time since there's no conditional to be evaluated, while the function invocation should be the same as calling a normal function. — cheeyos, Jul 12 '13 at 18:23

score 0 · Answer 7 · answered Jul 12 '13 at 10:29

0

If all your functions have the same signature, then the simplest thing to do would be this:

void function(int parameter)
{
  double ( *fn )( double, double );

  switch( parameter )
  {
    case 1:  fn = &some_math_expression_1;  break;
    case 2:  fn = &some_math_expression_2;  break;
    ...
  }

  for( ... ) // a big loop
  {
    double a = ...;
    for( ... ) // a big loop
    {
      double b = ...;
      double value = fn( a, b );
      ...
    }
  }
}

answered Jul 12 '13 at 10:29

As I asked @cheeyos, do you think that the function invocation is cheaper than the conditionals? – ChronoTrigger Jul 12 '13 at 10:40
You're calling the function regardless. Moving the conditions outside the loop will benefit you - exactly how much will depend on too many external factors. Notably, how large the branch prediction cache is on your target processor, and how much it's used inside your math functions. Obviously reducing the amount of processing you have to do will more than likely have a much larger impact on performance than something as trivial as this. – Jul 12 '13 at 10:56
Obviously a better solution would be to get rid of your `parameter` argument and only deal with function pointers everywhere you can. Moving the loops inside each function would also give the optimiser a better chance at vectorising and/or parallelising the actual processing code - but that might not be as easy as it sounds depending on your exact situation. – Jul 12 '13 at 11:10

Avoiding conditionals and function invocations inside a loop

7 Answers7

Linked