I'm using some template meta-programming to solve a small problem, but the syntax is a little annoying -- so I was wondering, in the example below, will overloading operators on the meta-class that has an empty constructor cause a (run-time) performance penalty? Will all the temporaries actually be constructed or can it be assumed that they will be optimized out?

template<int value_>
struct Int {
    static const int value = value_;

    template<typename B>
    struct Add : public Int<value + B::value> {  };

    template<typename B>
    Int<value + B::value> operator+(B const&) { return Int<value + B::value>(); }
};

int main()
{
    // Is doing this:
    int sum = Int<1>::Add<Int<2> >().value;

    // any more efficient (at runtime) than this:
    sum = (Int<1>() + Int<2>()).value;  // reassigned: redeclaring sum would not compile

    return sum;
}
Cameron
  • `Int<1>::Add<Int<2> >().value` can be re-written as `Int<1>::Add<Int<2> >::value`. – Nawaz Feb 25 '13 at 07:33
  • Oh. I seem to be creating [expression templates](http://stackoverflow.com/a/2598596/21475) without even knowing it. Seems the optimizers are pretty good at handling this sort of thing! – Cameron Feb 25 '13 at 07:34
  • @Nawaz: You're right of course, that's an artifact of the design of my real code (which eventually needs an instance at run-time). – Cameron Feb 25 '13 at 07:35
  • I believe `Add<>` approach is likely to be faster, or at least not slower than `+` approach. – Nawaz Feb 25 '13 at 07:37
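The `::value` spelling from the first comment can be sketched as follows. This is only an illustration, assuming a C++11 compiler (for `static_assert` and `decltype`); the `Int` template is copied from the question, and the helper names `sum_from_type` and `sum_from_decltype` are hypothetical. The second form keeps the `+` syntax but wraps it in `decltype`, which does not evaluate its operand, so no temporaries exist even in principle.

```cpp
// Int is copied from the question.
template<int value_>
struct Int {
    static const int value = value_;

    template<typename B>
    struct Add : public Int<value + B::value> {  };

    template<typename B>
    Int<value + B::value> operator+(B const&) { return Int<value + B::value>(); }
};

// No object is created here: value is read straight from the type.
static_assert(Int<1>::Add<Int<2> >::value == 3,
              "Add is resolved at compile time");

// decltype only inspects the type of the expression; operator+ is never called.
static_assert(decltype(Int<1>() + Int<2>())::value == 3,
              "operator+ is only used for its return type");

// Hypothetical helpers that return the constants at run time.
int sum_from_type()
{
    return Int<1>::Add<Int<2> >::value;
}

int sum_from_decltype()
{
    return decltype(Int<1>() + Int<2>())::value;
}
```

Both helpers return a compile-time constant, so this form should not depend on the optimizer to remove temporaries. It does not help if an instance is really needed at run time, as in the asker's real code.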

2 Answers

Alright, I tried my example under GCC.

For the `Add` version with no optimization (`-O0`), the resulting assembly just loads a constant into `sum`, then returns it.

For the `operator+` version with no optimization (`-O0`), the resulting assembly does a bit more (it appears to call `operator+`).

However, with `-O3`, both versions generate the same assembly, which simply loads 3 directly into the return register; the temporaries, the function calls, and `sum` are optimized out entirely in both cases.

So, they're equally fast with a decent compiler (as long as optimizations are turned on).
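One way to repeat this comparison is to put each spelling in its own function and inspect the generated assembly. The sketch below assumes the `Int` template from the question; the function names `sum_via_add` and `sum_via_operator` are hypothetical and exist only so the two versions show up as separate symbols.

```cpp
// Int is copied from the question.
template<int value_>
struct Int {
    static const int value = value_;

    template<typename B>
    struct Add : public Int<value + B::value> {  };

    template<typename B>
    Int<value + B::value> operator+(B const&) { return Int<value + B::value>(); }
};

// Hypothetical helper: the nested-template spelling.
int sum_via_add()
{
    return Int<1>::Add<Int<2> >().value;
}

// Hypothetical helper: the overloaded-operator spelling.
int sum_via_operator()
{
    return (Int<1>() + Int<2>()).value;
}
```

Compiling this file with `g++ -O0 -S` and again with `g++ -O3 -S` and comparing the bodies of the two functions should show the difference described above, though the exact output will vary with the compiler version and target.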

Cameron

Compare the assembly generated by `g++ -O3 -S` for both solutions: it is the same for both. The compiler optimizes the code down to simply returning 3.