Tail-recursion not happening

Question

I'm using g++ (Ubuntu 4.8.2-19ubuntu1) 4.8.2 in a C++ project. I wrote a function that kind of does this:

template<typename T, T (*funct)(int) >
multiset<T> Foo(const multiset<T>& bar, int iterations) {
    if (iterations == 0) return bar; 
    multiset<T> copy_bar(bar); 

    T baz = funct(copy_bar.size());

    if (baz.attr > 0)
        return Foo<T,funct>(copy_bar, 100);
    else 
        return Foo<T,funct>(bar, iterations - 1);    
}

I was getting bad_alloc() exception so I tested the function with gdb and turns out that there's no tail-recursion happening, which I was expecting since there are no statements after the returns.

NOTE: I tried with -O2 compilation flag but it didn't work

What is the input of funct? Without this function, hard to tell what is wrong — Matt, May 06 '15 at 21:22
@Matt I think `funct` is not relevant in this case. When I run `gdb` the stack has only calls to Foo because `funct` was called and then it returned. — Jcao02, May 06 '15 at 21:31
What happens if you assign variables in your `if/else` statement and only have one call to `Foo` at the end? — Red Alert, May 06 '15 at 21:42

score 3 · Answer 1 · answered May 06 '15 at 21:50

3

Your function is not tail-recursive, since there's still work to do after the recursive call: Destruction of copy_bar (which has a non-trivial destructor) and possibly also of baz (if the type T also has a non-trivial destructor).

answered May 06 '15 at 21:50

celtschk

19,311
3
39
64

score 2 · Answer 2 · edited May 23 '17 at 12:13

As indicated by @celtschk's answer, non-trivial destructors are preventing the compiler from treating the calls as truly tail-recursive. Even when your function is reduced to this:

template<typename T, T (*funct)(int) >
multiset<T> Foo(const multiset<T>& bar, int iterations) {
    if (iterations == 0) return bar;
    return Foo<T,funct>(bar, iterations - 1);
}

It is still not tail-recursive, because of the implicit calls to the non-trivial constructors and destructors of the result of the recursive function call.

Note, however, that the above function does become tail-recursive with a relatively minor change:

template<typename T, T (*funct)(int) >
const multiset<T>& Foo(const multiset<T>& bar, int iterations) {
    if (iterations == 0) return bar;
    return Foo<T,funct>(bar, iterations - 1);
}

Voila! The function now compiles down into a loop. How can we achieve this with your original code?

Unfortunately, it is a little tricky, because your code will sometimes return the copy, and sometimes the original argument. We have to correctly deal with both cases. The solution presented below is a variation of the idea David mentioned in his comments to his answer.

Assuming you want to keep your original Foo signature the same (and, since it sometimes returns the copy, there is no reason to believe you wouldn't want to leave the signature the same), we create a secondary version of Foo called Foo2 that returns a reference to a result object that is either the original bar parameter, or one local in your Foo. In addition, Foo creates place holder objects for the copy, a secondary copy (for switching), and one to hold the funct() call result:

template<typename T, T (*funct)(int) >
multiset<T> Foo(const multiset<T>& bar, int iterations) {
    multiset<T> copy_bar;
    multiset<T> copy_alt;
    T baz;
    return Foo2<T, funct>(bar, iterations, copy_bar, copy_alt, baz);
}

Since Foo2 will always end up returning a reference, this removes any issues with the recursive function result causing implicit construction and destruction.

Upon each iteration, if the copy is to be used as the next bar, we pass the copy in, but switch the order of the copy and alternate placeholders for the recursive call, so that the alternate is actually used to hold the copy on the next iteration. If the next iteration reuses bar as is, the order of the arguments don't change, the iterations counter is merely decremented.

template<typename T, T (*funct)(int) >
const multiset<T>& Foo2(
        const multiset<T>& bar, int iterations,
        multiset<T>& copy_bar, multiset<T>& copy_alt, T& baz) {
    if (iterations == 0) return bar;
    copy_bar = bar;
    baz = funct(copy_bar.size());
    if (baz.attr > 0) {
        return Foo2<T, funct>(copy_bar, 100, copy_alt, copy_bar, baz);
    } else {
        return Foo2<T, funct>(bar, --iterations, copy_bar, copy_alt, baz);
    }
}

Notice that only the original call to Foo pays the penalty of construction and destruction of the local objects, and Foo2 is now entirely tail-recursive.

David Rodríguez - dribeas · Answer 3 · 2015-05-06T22:53:42.433

I don't think @celschk is right, but rather @jxh in an answer that he wiped out, so lets revisit it.

The fact that there are local variables that need to be destroyed does not affect tail recursion in general. The optimization is turning recursion into a loop, and those variables can be internal to the loop and be destroyed in each pass.

The problem, I believe, stems from the fact that the argument is a reference and that depending on some condition each pass over the loop would have to refer to either an object outside of the function or the local copy inside this function. If you try to manually unroll the recursion into a loop you will find it quite hard to figure out what should the loop look like.

To transform the recursion into a loop you would have to create an additional variable outside of the loop to hold the value of the multimap, turn the reference into a pointer and update the pointer to one or the other object depending on the condition before jumping back to the beginning of the loop. The baz variable cannot be used for this (i.e. it cannot be pulled outside of the loop) as each pass makes a copy and I imagine some other transformations that you did not show in the code above, so you really need to create an additional variable. The compiler cannot create new variables for you.

At this point, I have to admit that yes, the issue here is that for one of the branches copy_var needs to be destroyed after the recursion completes (as a reference to it is passed down), so @celtschk is not 100% wrong... but he is when he points at baz as another potential reason for breaking the tail-recursion.

Note: You can do this transformation manually: create a variable, empty, outside of the loop; whenever the condition is true you can swap the contents of the variable inside the loop with the variable outside of the loop (which is quite efficient) and adjust the pointer. If you know that you will do multiple passes and thus will need multiple copies, you can also take the value by non-const ref and provide a façade that will copy the user's argument before forwarding to this one, then internally you can blast (swap) the value of the argument before recursing with the same argument you got. — David Rodríguez - dribeas, May 06 '15 at 22:45
Note2: If the code is really as you showed it above... well, you don't need a copy of the argument to call `size()`, you can remove the object altogether and make your life (and the compiler's) easier. — David Rodríguez - dribeas, May 06 '15 at 22:49
The issue with my answer is that when I tested it with always using the false case (in which case, the same reference is being passed in on every recursive call), tail recursion was still not occurring. The only thing left is the destructor code for `copy_bar`. — jxh, May 06 '15 at 23:02
... and the constructors/destructors associated with the recursive function return value. — jxh, May 06 '15 at 23:07

Tail-recursion not happening

3 Answers3

Linked