Recursion function stops working before recursion limit. Memoized, works up to a point (2400), but then finishes with no result

Question

My function is just a simple recursive add, with a cache instruction to prevent it from going through the recursion more than it needs to. So input 4, we get 1+2+3+4 = 10. It is also cached, so that it doesn't go through the function more than it needs to and overload my low powered laptop.

If I input 2200, I get the correct result in about .2 secs. But any time I put in a number above 2417 or so (sometimes this number is higher or lower), it gives me no result, and "finished in 2.3s." There is no error, because I have the recursion limit set higher than that. I just get... no answer.

Please note that this is not a question about the value of recursion and where it is or isn't appropriate in certain situations. I posted a similar question and the only answer I got was "why are you even using recursion?" It is because I am new and just trying to learn ;)! But in playing with the limits of what my code could do, I found it odd that I couldn't run it with a higher integer than 2400, although I got no error.

Here is my code:

recur_cache= {}
def recur_add(n): 
    if n in recur_cache: 
        value = recur_cache[n] 
    if n == 1: 
        value = 1
    if n == 0: 
        value = 0
    else: 
        value = n + recur_add(n-1)


    recur_cache[n] = value
    return value

Thanks!

This is a great example of why not to use recursion. Memoization can't help here because you need to expand the stack to `n` frames before anything is even in the lookup dict. If you add a print to the top branch where something was found in the memo, you'll see it's dead code unless you call the function multiple times. — ggorlen, Dec 22 '20 at 23:09
Also, I think you're misunderstanding the recursion limit knob in Python. You're effectively messing with an internal, environment-specific knob, so pushing it up just risks crashing the interpreter, it's not free lunch. The docs say: "The highest possible limit is platform-dependent. A user may need to set the limit higher when they have a program that requires deep recursion and a platform that supports a higher limit. This should be done with care, because a too-high limit can lead to a crash." The "crash" here refers to the CPython _interpreter_ crashing, hence "I get... no answer". — ggorlen, Dec 22 '20 at 23:17
You asked this same-exact question before: https://stackoverflow.com/questions/65414323/ive-made-a-very-simple-recursion-funciton-with-a-cached-dictionary-recursion-l and didn't even respond to any of the comments/questions. You never answered how you are running this. Because as people noted, you are almost certainly going to run into a segfault or stack overflow. Which could look like "although I got no error", because the entire process crashes — juanpa.arrivillaga, Dec 22 '20 at 23:26

ggorlen · Answer 1 · 2020-12-23T04:07:04.750

Memory limits and recursion

Your similar question on this topic says

There is no error, because I have the recursion limit set to 100000000, just... no result.

It doesn't seem like it would be outside of the computer's processing power to do this, so why am I getting this limit?

First of all, the CPython interpreter is a process running on an operating system. The operating system determines how much memory the process gets to work with and if it exceeds this limit, the OS terminates the application.

It probably goes without saying, but the OS is running on real hardware with physical limits, and the OS's job is to control how processes interact with physical resources. A process can't say "I need [insert arbitrarily huge number here] pages of memory" and expect that request to be granted by the OS or fit within the physical limits imposed by the hardware.

When you type python3 my_program.py and hit Enter the python3 process is loaded and gets memory from the OS. This process parses your source code, compiles your program into bytecode if it's syntactically correct and interprets the bytecode. As your program executes, the process has some finite amount of space to work with, both to store its own data and data associated with your program (objects you allocate, resources you open, functions you call, etc).

When your application makes a function call, the interpreter needs to have space for the stack frame. This space isn't imaginary or theoretical. It's real memory that the interpreter has to get from the OS. If the call spawns other calls, the first call's memory remains tied up because it will need it to finish its instructions when it regains control after child calls resolve. As such, the interpreter is using more of its memory per call and there's no way to free this memory other than to stop the recursion at some point and begin deallocating stack frames.

CPython has a sensible recursion limit built in to create a safety net. This is exposed to the programmer as sys.getrecursionlimit and sys.setrecursionlimit. If you have deep recursion, two scenarios are possible. The typical case is your stack exceeds the artificial memory limit imposed by the CPython interpreter and the interpreter gently raises a RecursionError and lets you handle it (typically by re-writing the program to fix infinite recursion bugs).

The second possibility is that you tell the CPython interpreter "I know what I'm doing and I want you to ignore the sensible limit you've imposed on my program". By doing sys.setrecursionlimit(100000000), you're essentially tossing out the safety net and risking crashing the entire interpreter process which is a serious security risk. As the docs state:

The highest possible limit is platform-dependent. A user may need to set the limit higher when they have a program that requires deep recursion and a platform that supports a higher limit. This should be done with care, because a too-high limit can lead to a crash.

The crash here is not just some CPython exception event the interpreter creates as a normal part of interpreting bytecode that spawned too many calls; it's the interpreter process running out of stack memory and the OS terminating the entire process that's running your program.

I just get... no answer.

Yep, that's what happens when the OS kills a process, you're back at the console with an exit status code. The OS doesn't give the interpreter the luxury to dump a stack trace or message.

You certainly can mess with the recursion limit and interpreter stack size but doing so is basically a hack--you can come up with a large enough number that exceeds the stack size and you're back to where you started holding a fundamentally flawed approach in your hands. For some applications, you may find that increasing the stack buys you enough space to achieve what you need for getting a quick and dirty non-production task done, but beware that whatever knobs you messed with might not be portable to other environments.

Memoization

Memoization is a caching technique that avoids doing repeated work. If your computationally-intensive function is ever asked to produce the result of a calculation it's previously done, it can look it up and return it.

But memoization isn't anything beyond this and it won't stop a stack overflow as you may think. For your first call, the table is empty and so the entire n stack frames need to be allocated. If you add a print in your first branch that fires when the memo table contains a previous computation, you'll see it never fires. The dict lookups can only help with subsequent calls to the top-level function in this case.

Use loops instead of recursion

Recursion should not be applied to most problems, and summing a series of numbers is a prime example of a problem that is best solved iteratively.

Some programming languages support tail call optimization that convert recursion into loops behind the scenes, enabling recursive idioms to be applied without ill-effects, but Python is not one of these languages and probably never will be.

Drawbacks associated with recursion include:

You can blow the call stack.
Calling functions is expensive in space and time.
Recursion is often used as a crutch for problems it's poorly-equipped to solve, like this.
Usually the iterative version is just as readable.

You can replace recursion with a loop and a stack quite easily, particularly for linear problems such as the one posed here.

Recursion can occasionally be useful for sorting and tree/graph algorithms, but the typical pattern for these is that the input is split by some constant factor per call, leading to logarithmic growth in the call stack. For example, the repeated division by two in merge sort's recursive step means your list would need to be around 10³⁰⁰ elements just to generate a maximum stack size of 1000.

If your goal is to learn recursion, bear in mind that recursion is not really used much in typical application development and most algorithms that can be expressed recursively are better expressed iteratively. By "better" I mean faster, safer, more readable and maintainable, etc. Nonetheless, recursive thinking is a valuable skill, so the purpose of an algorithms class that teaches recursion isn't as much about practicality as it is about patterns.

Recursion function stops working before recursion limit. Memoized, works up to a point (2400), but then finishes with no result

1 Answers1

Memory limits and recursion

Memoization

Use loops instead of recursion