3

Consider a typical finite difference application:

// assuming T_size > 2
void process_T(double *T0, double *T, const int &T_size, bool periodic) {
  for (int i = 0; i < T_size; ++i) {
    double sum = 0;
    double base = T0[i];
    if (i > 0) sum += (T0[i-1]-base);
    if (i < 0) sum += (T0[i+1]-base);
    if (periodic) { 
       if (i == 0) sum += (T0[T_size-1]-base);
       if (i == T_size-1) sum += (T0[0]-base);
    } else {
      if (i == 1 || i == T_size-1) sum += 0.5*(T0[i-1]-base);
      if (i == 0 || i == T_size-2) sum += 0.5*(T0[i+1]-base);
    }
    T[i] = T0[i] + sum * 0.08; // where 0.08 is some magic number
  }
}

The check for periodic is loop-invariant, but since is only known at run-time, the conditional check cost is incurred everytime. I could create a specialized function which assumes one of the cases, but it would be cumbersome to maintain the common base, especially in case of three-dimensional problem where it would grow to 8 functions (periodicity: none, x, y, z, xy, xz, yz, xyz) to consider all combinations.

Is it possible to solve this problem via metaprogramming?

P/S: can the branch predictor optimize this accordingly?

syockit
  • 5,747
  • 1
  • 24
  • 33
  • Related question: http://stackoverflow.com/questions/17612337/avoiding-conditionals-and-function-invocations-inside-a-loop – ChronoTrigger Oct 29 '13 at 11:07

2 Answers2

2

Templates may have non-type parameters:

template <bool periodic> void process_T(double *T0, double *T, const int &T_size)

Of course this implies a cost of writing something like this at the call site:

bool periodicFunction = {whatever};
if (periodicFunction)
    process_T<true>(...);
else
    process_T<false>(...);
Anton
  • 978
  • 5
  • 11
  • Wouldn't that still mean that I'd be maintaining two specialized functions? Or does the compiler optimize the unrelated cases out upon instantiation of the functions? – syockit Oct 29 '13 at 08:42
  • Any decently sophisticated compiler will optimize these cases. At least MSVS2013 does completly eliminate code from `if (periodic &&` or `if (!periodic)` branches. – Anton Oct 29 '13 at 08:52
  • Your answer works magic! When I tested on a 2d finite difference code, for longer iterations, templated code is significantly faster than the one relying on parameters (although for some reason, there was no speed up in case of `periodic` being false). I hope there's a way to deal with the ugliness of the call site though. Isn't there a way to metaprogram around that? – syockit Oct 29 '13 at 11:08
  • 1
    Only ugly ones. Look at these questions: http://stackoverflow.com/questions/11023384/transpose-template-function-boolean-arguments-to-runtime-function-arguments-with and http://stackoverflow.com/questions/2873802/specify-template-parameters-at-runtime – Anton Oct 29 '13 at 11:13
2

Yes, you could have

enum Periodicity
{
    PERIODICITY_NONE,
    PERIODICITY_X,
    PERIODICITY_Y
    // etc
};

and then

template <Periodicity P>
void process_T(double* T0, double* T, const int& T_size)
{
    if (P == PERIODICITY_NONE) // ... do something
    if (P == PERIODICITY_X) // ... do something else

    // Common code
}

Any decent optimising compiler would be able to perform the check at compile time, and would eliminate any dead code (g++ appears to do this even at -O0).

Tristan Brindle
  • 16,281
  • 4
  • 39
  • 82
  • 1
    If you want to be dead sure about the optimization taking place you can move the "do something code" to a functor and use specialization to select the right piece of code. – pmr Oct 29 '13 at 09:36
  • Sure, but then you're relying on the compiler inlining the functor's `operator()`... easy enough to check whether it's happening though I guess. Plus it's more typing! :-) – Tristan Brindle Oct 29 '13 at 09:49