While recursing in functional programming (or otherwise), what are the benefits/limitations of working with an explicit state variable?

Question

I was playing around with functional programming in Python, and realized there are two ways to replace loops with recursion. The canonical recursion technique does not seem to need any state, e.g., "factorial_canon" below. An alternative is to use a state variable to store intermediate results, e.g., "factorial_alter" below.

def factorial_canon(value):
    if value == 1:
        return 1
    else:
        return value*factorial_canon(value-1)

def factorial_alter(value, current = 1, state = 1):
    if current <= value:
        return factorial_alter(value, current + 1, state*current)
    else:
        return state

print 'factorial_canon: ', factorial_canon(5)
print 'factorial_alter: ', factorial_alter(5)

Result:

> factorial_canon: 120
> factorial_alter: 120

The canonical approach works well for small problems, but quickly gets complicated in real projects. Are there any drawbacks of the alternative (state-based) approach? Does it violate any conditions of a pure functional implementation, e.g., regarding mutable variables?

I realize that the state-based approach cannot benefit from tail call optimization, but I believe that is irrelevant since Python does not support such optimization anyway.

What you've defined is the factorial function, not the Fibonacci function. — Tagc, Jan 25 '17 at 09:59
Actually you `_alter` could use *tail* recursion whereas you `_canon` cannot. — Willem Van Onsem, Jan 25 '17 at 10:01
Furthermore I'm fairly sure both approaches are equally pure; you're dealing solely with immutable objects and both are technically pure functions in that their outputs are solely dependent on their inputs without (observable) side effects. — Tagc, Jan 25 '17 at 10:02
There's a simpler form of your tail-recursive `_alter` function (which I'll call `fact`, since it computes a factorial, rather than a fibonacci number): `def fact(value, state=1): return state if value <= 0 else fact(value-1, state*value)`. There's no need to pass two state variables. — Blckknght, Jan 25 '17 at 10:10

Will Ness · Answer 1 · 2017-01-25T14:40:53.980

No, it's the other way around: the state-passing paradigm is what makes TCO possible; the simply-recursive "canonical" version is not tail recursive so TCO doesn't apply to it at all.

The essence of tail call optimization is the reuse of stack frame variables (i.e. arguments):

    fact 5:
    fact_state 5 1 1:
    fact_state 5 2 2:
    fact_state 5 3 6:
    fact_state 5 4 24:
    fact_state 5 5 120:
    return 120

The simply recursive version must perform multiplication after the recursive call returns, so must push the current n to the stack, to multiply by it the return value, when the recursive call returns.

The tail recursive version multiplies before the recursive call, thus has no need to do anything after the return, no need to push anything to the stack, and thus can just return the final value to the system (there is only one final return).

This means the recursive tail call replaces the current call; its arguments, stored in the function call frame, get reused -- updated in place (mutated by the compiler); turning the whole calculation, functionally, into a loop. Which is the goal of TCO, after all.

Of course the way to use recursion in Python is to not use recursion in Python. As you say, Python does not provide TCO. Our programs are still run on computers which have program counter, jumps and updatable registers in their instruction set, IOW loops. Maybe when smart dust arrives this will change. Recursion with TCO is transformed into loops, it's the whole reason to have the TCO in the first place.

(see also)

score 2 · Answer 2 · answered Jan 25 '17 at 16:25

2

Are there any drawbacks of the alternative (state-based) approach?

No, but there are drawbacks with the classical recursive version. Imagine you are summing an array of odd numbers and that in the event there are even ones the result shuld be zero. You pass it [1,3,5,6]. In the classical sense you have 1 + 3 + 5 + ? and you don't know the seen values. How can you make the result zero?

There are two ways. One is by accumulator, which you do in factorial_alter. In the base case you return the accumulator, in he event you find an even number you stop recursion and return zero.

A second way is by call-with-current-continuation. You catch the current continuation, which is a function taking the current result as argument and continues the rest of the program. Thus you may at any point in time cancel and choose what the sum is, in my examples case 0. Python does not provide call/cc.

answered Jan 25 '17 at 16:25

Sylwester

47,942
4
47
79

It'd be cool to see implementation of the call/cc example you describe – Mulan Jan 26 '17 at 12:15
@naomik full call/cc is not needed. a bail-out-early-capable left fold is entirely possible with simple "contingency" functions (exit continuations, usually just `(lambda (x) x)`, not full continuations) which would be called in tail position if need to exit early. an example in Haskell can be `foldlWhile` at the bottom of [this page](https://wiki.haskell.org/Foldl_as_foldr_alternative). – Will Ness Jan 26 '17 at 13:17
@WillNess Wouldn't that be the accumulator version? If you were to recurse a tree using `call/cc` as a exit function in the event a subtree is found invalid. How I see it you need either do it in two phases or you need to do it with continuation passing style which defeat the need for `call/cc` but is harder to read. – Sylwester Jan 26 '17 at 13:28
@Sylwester yes, functionally it is the equivalent of the accumulator version. was just thinking about it right now, re-reading your answer. It _is_ similar to CPS but we would just pass the exit contingency-function alone as a parameter and build up the result as a simple value in the accumulator. – Will Ness Jan 26 '17 at 13:29

Willem Van Onsem · Accepted Answer · 2017-01-26T12:33:23.480

Are there any drawbacks of the alternative (state-based) approach?

Well given you feed the function state variables, you interpreter/program will have to push these on the call stack, so as a result the time it takes to make a call will be larger and the call stack will grow faster (given no tail-recursion is used). Even if tail recursion is used, the stack frame will be larger and thus require more time (in case of tail recursion, you overwrite the stackframe; but it will of course take time to overwrite it).

Usually when you call, the call stack looks like:

+----------------+
| return address |
+----------------+
|        a       |
|        b       |
+----------------+

With a and b parameters, now given you make a call without tail recursion if you make a call, it will push, on the stack, so the stack will look like:

                          +----------------+
                          | return address'|
                          +----------------+
                          |        a'      |
                          |        b'      |
+----------------+        +----------------+
| return address |  -->   | return address |
+----------------+        +----------------+
|        a       |        |        a       |
|        b       |        |        b       |
+----------------+        +----------------+

Now if you use tail recrusion, the call stack is not pushed, but its parameters are overridden:

+----------------+        +----------------+
| return address |  -->   | return address | (no new return address)
+----------------+        +----------------+
|        a       |        |        a'      |
|        b       |        |        b'      |
+----------------+        +----------------+

Note however that a and b on the call stack are modified before the jump is made. If you have more parameters (because of state parameters), these will of course all have to update and thus cost time.

Does it violate any conditions of a pure functional implementation, e.g., regarding mutable variables?

No you do not mutate any variables here: you pass a state through parameters, so every variable is set only once since the variables in the call are different from the one you receive.

I realize that the state-based approach cannot benefit from tail call optimization, but I believe that is irrelevant since Python does not support such optimization anyway.

Python indeed does not support tail recursion. It will even be hard to support this since Python is a dynamic programming language and you could thus change the function pointer while you run the function resulting in a different call.

Two final important notes:

your _canon version actually cannot use tail-recursion since the multiplication of the result needs to be done, so the call stack has to unwind iteratively. Your _alter version however could, by some programming languages be implemented with tail-recursion; and
your program does not calculate the Fibonacci sequence, but the factorial.

Comments are not for extended discussion; this conversation has been [moved to chat](http://chat.stackoverflow.com/rooms/134093/discussion-on-answer-by-willem-van-onsem-while-recursing-in-functional-programmi). — Bhargav Rao, Jan 26 '17 at 15:42

While recursing in functional programming (or otherwise), what are the benefits/limitations of working with an explicit state variable?

3 Answers3