
I've recently been learning about functional languages and how many don't include for loops. While I don't personally view recursion as more difficult than a for loop (it's often easier to reason about), I realized that many examples of recursion aren't tail recursive and therefore cannot use simple tail call optimization to avoid stack overflows. According to this question, all iterative loops can be translated into recursion, and those iterative loops can be transformed into tail recursion, so it confuses me when the answers to a question like this suggest that you have to explicitly manage the translation of your recursion into tail recursion yourself if you want to avoid stack overflows. It seems like it should be possible for a compiler to do all the translation from either recursion to tail recursion, or from recursion straight to an iterative loop, without stack overflows.

Are functional compilers able to avoid stack overflows in more general recursive cases? Are you really forced to transform your recursive code in order to avoid stack overflows yourself? If they aren't able to perform general recursive stack-safe compilation, why aren't they?

Krupip
  • This question and its answer are definitely an interesting read: http://stackoverflow.com/questions/931762/can-every-recursion-be-converted-into-iteration – G_H May 05 '17 at 09:46

3 Answers


Tail Call Optimization:

The natural way to implement function calls is to push a new frame for each call and clean it up when the function returns.

For tail calls to work, the compiler alters this so that the tail call inherits the current frame: instead of pushing a new frame, it reuses the current one, so that the callee returns directly to the current function's caller. This is safe because, after a tail call, the only thing left for the caller to do would be to clean up and return.

Thus TCO is all about cleaning up before the last call.
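As an illustration (a hand-written sketch, not the output of any particular compiler), here is a tail-recursive gcd in Python next to the loop that a TCO-performing compiler would effectively turn it into:

```python
def gcd(a, b):
    # Tail-recursive form: the recursive call is the very last action,
    # so nothing in this frame is needed after the call.
    if b == 0:
        return a
    return gcd(b, a % b)

def gcd_tco(a, b):
    # What TCO effectively produces: the tail call becomes a
    # rebinding of the parameters plus a jump back to the top.
    while b != 0:
        a, b = b, a % b
    return a
```

The loop version uses constant stack space, which is exactly the payoff of reusing the frame.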

Continuation Passing Style - make tail calls out of everything

A compiler can transform the code so that it performs only primitive operations and passes each intermediate result to a continuation. Stack usage thereby gets moved onto the heap, since the computation still to be done is packaged up as a function.

An example is:

function hypotenuse(k1, k2) {
  return sqrt(add(square(k1), square(k2)))
}

becomes

function hypotenuse(k, k1, k2) {
  (function (sk1) {
    (function (sk2) {
      (function (ar) {
        k(sqrt(ar));
      })(add(sk1, sk2));
    })(square(k2));
  })(square(k1));
}

Notice that every function now makes exactly one call, in tail position, and that the order of evaluation is fixed.

Sylwester

Any recursive function can be converted into a tail-recursive one. For instance, consider the transition function of a Turing machine, that is, the mapping from one configuration to the next. To simulate the Turing machine you just need to iterate the transition function until you reach a final state, which is easily expressed in tail-recursive form. Similarly, a compiler typically translates a recursive program into an iterative one simply by adding a stack of activation records.
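As a sketch of that idea in Python (the `step` and `is_final` functions here are made up for illustration; a real simulation would operate on tape-and-state configurations):

```python
def run(config, step, is_final):
    # Tail-recursive simulation: `step` maps one machine configuration
    # to the next, and `is_final` tests for a halting configuration.
    if is_final(config):
        return config
    return run(step(config), step, is_final)

# Toy "machine" whose configuration is just an integer counting down.
final = run(5, step=lambda n: n - 1, is_final=lambda n: n == 0)
```

The single recursive call is in tail position, so a TCO-performing compiler runs this in constant stack space no matter how long the machine runs.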

You can also give a translation into tail-recursive form using continuation passing style (CPS). As a classic example, consider the Fibonacci function. It can be expressed in CPS in the following way, where the second parameter is the continuation (essentially, a callback function):

def fibc(n, cont):
    if n <= 1:
        return cont(n)
    return fibc(n - 1, lambda a: fibc(n - 2, lambda b: cont(a + b)))

Again, you are simulating the recursion stack using a dynamic data structure: in this case, lambda abstractions.
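For instance, passing the identity function as the initial continuation recovers the ordinary Fibonacci value (the definition is repeated here so the snippet stands alone):

```python
def fibc(n, cont):
    if n <= 1:
        return cont(n)
    return fibc(n - 1, lambda a: fibc(n - 2, lambda b: cont(a + b)))

# The identity continuation just returns the final result unchanged.
result = fibc(10, lambda x: x)  # 55
```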

The use of dynamic structures (lists, stacks, functions, etc.) in all the previous examples is essential. That is to say, in order to simulate a generic recursive function iteratively, you cannot avoid dynamic memory allocation, and hence you cannot, in general, avoid running out of memory.

So, memory consumption is not only related to the iterative/recursive nature of the program. On the other hand, if you forbid dynamic memory allocation, your programs are essentially finite state machines, with limited computational capabilities (more interesting would be to parametrise memory according to the size of the inputs).

In general, in the same way that you cannot predict termination, you cannot predict unbounded memory consumption in your program: working with a Turing-complete language, you cannot rule out divergence at compile time, and you cannot rule out stack overflow.

Andrea Asperti

According to this question, all iterative loops can be translated into recursion

"Translated" might be a bit of a stretch. The proof that for every iterative loop there is an equivalent recursive program is trivial if you understand Turing completeness: since a Turing machine can be implemented using strictly iterative structures and strictly recursive structures, every program that can be expressed in an iterative language can be expressed in a recursive language, and vice-versa. This means that for every iterative loop there is an equivalent recursive construct (and the other way around). However, that doesn't mean we have some automated way of transforming one into the other.

and those iterative loops can be transformed into tail recursion

Tail recursion can perhaps be easily transformed into an iterative loop, and the other way around. But not all recursion is tail recursion. Here's an example. Suppose we have some binary tree. It consists of nodes. Each node can have a left and a right child and a value. If a node has no children, then isLeaf returns true for it. We'll assume there's some function max that returns the maximum of two values, and if one of the values is null it returns the other one. Now we want to define a function that finds the maximum value among all the leaf nodes. Here it is in some pseudo-code I cooked up.

findmax(node) {
    if (node == null) {
        return null
    }
    if (node.isLeaf) {
        return node.value
    } else {
        return max(findmax(node.left), findmax(node.right))
    }
}

There are two recursive calls in the findmax function, so we can't apply tail call optimization. We need the results of both calls before we can supply them to max and determine the result for the current node.
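What a compiler could do instead is manage the stack explicitly. Here is a sketch in Python (the Node class is made up to match the pseudo-code above); the recursion is gone, but the memory cost hasn't disappeared, it has only moved from the call stack to a heap-allocated list:

```python
class Node:
    def __init__(self, value=None, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

    @property
    def isLeaf(self):
        return self.left is None and self.right is None

def maxv(a, b):
    # max that treats None as "no value yet".
    if a is None:
        return b
    if b is None:
        return a
    return a if a > b else b

def findmax_iter(root):
    best = None
    stack = [root]  # explicit stack replacing the call stack
    while stack:
        node = stack.pop()
        if node is None:
            continue
        if node.isLeaf:
            best = maxv(best, node.value)
        else:
            stack.append(node.left)
            stack.append(node.right)
    return best
```

With a deep enough tree, the list grows just as the call stack would, so this transformation trades a stack overflow for the possibility of ordinary memory exhaustion.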

Now, there may be a way of getting the same result using recursion with only a single tail-recursive call. It would be functionally equivalent, but it would be a different algorithm. Compilers can perform a lot of transformations to create a functionally equivalent program with lots of optimizations, but they're not quite clever enough to create functionally equivalent algorithms.

Even the transformation of a function that only calls itself recursively once into a tail-recursive version would be far from trivial. Such an adaptation usually employs some argument passed into the recursive invocation that is used as an "accumulator" for the current results.

Look at this naive implementation for calculating the factorial of a number (e.g. fact(5) = 5*4*3*2*1):

fact(number) {
    if (number == 1) {
        return 1
    } else {
        return number * fact(number - 1)
    }
}

It's not tail-recursive. But it can be made so in this way:

fact(number, acc) {
    if (number == 1) {
        return acc
    } else {
        return fact(number - 1, number * acc)
    }
}
// Helper function
fact(number) {
    return fact(number, 1)
}
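Written out in Python (which doesn't perform TCO itself), the accumulator version then maps mechanically onto a loop:

```python
def fact_acc(number, acc=1):
    # Tail-recursive version: the accumulator carries the running product,
    # so nothing is left to do after the recursive call returns.
    if number == 1:
        return acc
    return fact_acc(number - 1, number * acc)

def fact_loop(number):
    # What a TCO-performing compiler would effectively emit.
    acc = 1
    while number != 1:
        acc *= number
        number -= 1
    return acc
```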

This requires an interpretation of what is being done. Recognizing cases like this one is easy enough, but what if a function is called instead of a multiplication? How will the compiler know that for the initial call the accumulator must be 1 and not, say, 0? How would you translate this program?

recsub(number) {
    if (number == 1) {
        return 1
    } else {
        return number - recsub(number - 1)
    }
}

This is as yet outside the scope of the sort of compilers we have now, and may in fact always be.

Maybe it would be interesting to ask this on the computer science Stack Exchange to see if they know of some papers or proofs that investigate this more in-depth.

G_H
  • Your second example is trivially transformed into a similar tail-recursive version as your first, and I suspect that all recursive functions in the form: `function(args) { ... .. // final return always recursive return (inline operation and/or function) recsub(number - 1) }` can be. Also, your notion that "compilers can't do this, they just aren't smart enough!" seems to be coming from a direct machine-code perspective; you can transform your code more easily if you use intermediate representations (i.e. Rust, which has a lot of what you would consider "impossible" optimizations) – Krupip May 04 '17 at 15:40
  • Also, my post isn't about transforming to tail recursion, it's about eliminating the stack overflow. If your recursion can be transformed into an iterative version, then it doesn't matter whether it can be transformed into tail recursion or not; the whole point of tail recursion is that it can be trivially transformed into yet another form where no SOs take place. It's going to be the examples where recursion doesn't happen on the return line that are difficult. – Krupip May 04 '17 at 15:42
  • 2
    I'd go so far as to say that writing such transpilers is possible, but not desired. Programmers don't want the compiler to mess with their algorithm. Being "functionally equivalent" (i.e. giving the same results for the same inputs) doesn't cut it. We also expect the compiled program to be equivalent to the code in its performance characteristics, its debuggability, its memory usage, its code size etc. The more clever the optimisations become, the more detached the output will get from the input and the *predictability* becomes lost. – Bergi May 04 '17 at 15:43
  • @Bergi, That is a good point, however I'm not entirely convinced that the transformations here wouldn't have the same characteristics. – Krupip May 04 '17 at 16:41
  • @snb I'm not sure I see how the subtraction example is trivially transformed into a tail-recursive version. You need to know the result of the recursive call before it can be subtracted, so the current call must remain on the stack. It doesn't matter that the operation/function can be inlined, because the length of the expansion isn't known up-front; it depends on the given argument(s). – G_H May 05 '17 at 08:18
  • Also, keep this in mind: lots of examples are a simple calculation that can be implemented using some sort of accumulator to pass some intermediate result from the current call to the next one. However, in some algorithms you won't be able to get the result until you've recursed to the base case, where recursion stops. Modeling these using iterative loops requires the implementation of a call stack analogue in the code itself. This too can run out of memory. A call stack is just another form of memory use. – G_H May 05 '17 at 08:35