1

I have some templated code that compilers can tail call optimize for most data types but not others. The code implements pow()

template<typename T, typename U>
void powRecurse(T& x, U& y, T& acc)
{
   if(y == 0) {
      acc = Identity<T>;
      return;
   }
   if(y == 1) {
      acc = acc * x;
      return;
   }
   if(y % 2 == 1) {
      acc = acc * x;
      y = y - 1;
   }
   x = x*x; 
   y = y/2;
   powRecurse<T, U>(x, y, acc);
}

template<typename T, typename U>
T tailPow(T x, U y)
{
   T rv = Identity<T>;
   powRecurse<T, U>(x, y, rv);
   return rv;
}

The type of parameter T seems to have no affect on tail call optimization, any types I've tried can be tail called optimized with the right types for the parameter U. If parameter U is a uint64_t the compiler can tail call optimize. If it's boost::multiprecision::cpp_int then the compiler doesn't tail call optimize.

I've also tried wrapping a uint64_t in a class and a template wrapper on an int type which both tail call optimize.

Is there any reason this shouldn't tail call optimize? Obviously I could loop this but I'm really just trying to understand the language or compiler issues here.

Michael Conlen
  • 1,959
  • 2
  • 17
  • 19
  • 1
    Obvious, but probably useless comment: compiler has no obligations in optimizations and they are, afaik, a bunch of heuristics, so if `cpp_int` is "big enough" then compiler may think that tail recursion does not worth it. It's quite interesting why, but I doubt it can be answered without good knowledge in gcc's (or other compiler's) internals. – yeputons Feb 07 '17 at 01:49
  • @yeputons thanks, mostly I'm not sure that there isn't something in cpp_int that makes it impossible. On the other hand I can't imagine a case where tail call optimization would be worse, though I can imagine it being an issue with length of the function after inlining and an inability of the analyzer to identify the possibility of optimization. – Michael Conlen Feb 07 '17 at 13:54

1 Answers1

1

Lazy evaluation implies that all the "recursive" function calls are actually different function template instantiations.

See e.g. a similar discussion here:

So, you can have all the tail-coll implementation details as you expect it by opting out of the lazy-evaluation template expressions (boost::multiprecision::et_off as described in the links), but be sure to check that the reduced code-size and perceived TCO optimization actually leads to increased performance.

In reality, certain algorithms benefit from expression templates that can skip common sub expressions, reorder lossless operations etc.

sehe
  • 374,641
  • 47
  • 450
  • 633
  • I see! I think this may have also been why originally I had to specify template parameters to for function calls to avoid mismatched types with some of the arguments I passed. – Michael Conlen Feb 08 '17 at 01:42
  • Yes. Alternatively you can explicitly cast the arguments before passing. It's a common surprise with users using Boost Multiprecision. I wish they'd make it a bit more obvious in their documentation that `T a,b; auto c = a+b;` doesn't imply the type of `c` is `T` like with your grandfathers C++ types :) – sehe Feb 08 '17 at 01:45
  • It appears this did not address the issue though I'm not sure yet that it isn't a factor. – Michael Conlen Feb 08 '17 at 01:56