Why does using boost::multiprecision::cpp_int affect tail call optimization here

Question

I have some templated code that compilers can tail call optimize for most data types but not others. The code implements pow()

template<typename T, typename U>
void powRecurse(T& x, U& y, T& acc)
{
   if(y == 0) {
      acc = Identity<T>;
      return;
   }
   if(y == 1) {
      acc = acc * x;
      return;
   }
   if(y % 2 == 1) {
      acc = acc * x;
      y = y - 1;
   }
   x = x*x; 
   y = y/2;
   powRecurse<T, U>(x, y, acc);
}

template<typename T, typename U>
T tailPow(T x, U y)
{
   T rv = Identity<T>;
   powRecurse<T, U>(x, y, rv);
   return rv;
}

The type of parameter T seems to have no affect on tail call optimization, any types I've tried can be tail called optimized with the right types for the parameter U. If parameter U is a uint64_t the compiler can tail call optimize. If it's boost::multiprecision::cpp_int then the compiler doesn't tail call optimize.

I've also tried wrapping a uint64_t in a class and a template wrapper on an int type which both tail call optimize.

Is there any reason this shouldn't tail call optimize? Obviously I could loop this but I'm really just trying to understand the language or compiler issues here.

Obvious, but probably useless comment: compiler has no obligations in optimizations and they are, afaik, a bunch of heuristics, so if `cpp_int` is "big enough" then compiler may think that tail recursion does not worth it. It's quite interesting why, but I doubt it can be answered without good knowledge in gcc's (or other compiler's) internals. — yeputons, Feb 07 '17 at 01:49
@yeputons thanks, mostly I'm not sure that there isn't something in cpp_int that makes it impossible. On the other hand I can't imagine a case where tail call optimization would be worse, though I can imagine it being an issue with length of the function after inlining and an inability of the analyzer to identify the possibility of optimization. — Michael Conlen, Feb 07 '17 at 13:54

score 1 · Answer 1 · answered Feb 07 '17 at 15:17

1

Lazy evaluation implies that all the "recursive" function calls are actually different function template instantiations.

See e.g. a similar discussion here:

So, you can have all the tail-coll implementation details as you expect it by opting out of the lazy-evaluation template expressions (boost::multiprecision::et_off as described in the links), but be sure to check that the reduced code-size and perceived TCO optimization actually leads to increased performance.

In reality, certain algorithms benefit from expression templates that can skip common sub expressions, reorder lossless operations etc.

answered Feb 07 '17 at 15:17

sehe

374,641
47
450
633

I see! I think this may have also been why originally I had to specify template parameters to for function calls to avoid mismatched types with some of the arguments I passed. – Michael Conlen Feb 08 '17 at 01:42
Yes. Alternatively you can explicitly cast the arguments before passing. It's a common surprise with users using Boost Multiprecision. I wish they'd make it a bit more obvious in their documentation that `T a,b; auto c = a+b;` doesn't imply the type of `c` is `T` like with your grandfathers C++ types :) – sehe Feb 08 '17 at 01:45
It appears this did not address the issue though I'm not sure yet that it isn't a factor. – Michael Conlen Feb 08 '17 at 01:56

Why does using boost::multiprecision::cpp_int affect tail call optimization here

1 Answers1

Linked