How does using stack instead of recursion give better performance in C when recursion uses stack as well?

Question

This question was prompted by studying the C language.

I have seen in my data structure course that in many cases where recursion proves to be a quick and easy solution (e.g. quicksort, traversal of binary search tree, etc.) it has been explicitly mentioned that using a self-created stack is a better idea.

The reason given is that recursion requires many function calls, and function calls are 'slower'.

But how does using a self-created stack prove any better as any function call makes use of the stack?

Good question, but why do you not ask your tutor? If he cannot provide clear resons, you can get an impression about his knowledge. — too honest for this site, Oct 06 '15 at 13:14
@Olaf Maybe he did? By asking here, though, he can get a broader range of answers and be able to contextualize a response his tutor (if he has one?) gives him much better. Generally, I would encourage you to not use comments to second-guess the existence questions, but rather to initiate improvement and clarification of the question. — Nathaniel Ford, Oct 06 '15 at 18:24
@NathanielFord: Yes, I should have added "too" or so, because by comparing answers here with what the tutor says (combined with some reason) will allow to weight how far one can trust that tutor and how broad his knowledge actually is or how quickly he repeats just what he learned without reflexion. Is requesting a second guess not also some kind of clarification? We are no clairvoyants (well, at least I'm not), so we cannot know **why** the tutor stated that. — too honest for this site, Oct 06 '15 at 18:41
@Olaf I found this in a book that I used. I don't remember if my tutor ever told me about this. In my experience, asking someone about things they never said is generally a bad idea, as it can put you in a situation where you are first required to explain them the concept ;) — Registered User, Oct 06 '15 at 18:57
@RegisteredUser: The text just invoked the impression that. Any, way, you should ask him and compare with the answers. This is a good way to get an impression. And it might be a good idea for lessons. As I wrote in the first comment: Good question! — too honest for this site, Oct 06 '15 at 19:03

score 2 · Answer 1 · answered Oct 06 '15 at 13:18

The self-created stack means that you push just a few of the important variables to the self-created stack.
When you use recursion, the function header and state of all variables will be pushed to the stack.

If the depth of the recursion is high, using case 2 means that the memory will be exhausted pretty quickly.

score 2 · Answer 2 · answered Oct 06 '15 at 13:43

Generally a function call has some overhead before anything inside the function is done. The code generated for a function call basically ensures that you'll find everything like you left it when you, well, return; while it gives you at the same time a clean empty environment inside the called function. In fact this convenience is one of the most crucial services C provides, next to the standard library. (In many other respects C is a mere macro assembler -- did you ever look at a C source and the generated assembler side by side?).

In particular usually a few registers must be saved, and possibly parameters must be copied on the call stack. The effort required depends on the processor, compiler and calling convention. For example, parameters and return values may be in registers, not on the stack (but then the parameters must be saved anyway for each recursive call, don't they?).

The overhead is relatively large if the function is small; that's why inlining can be powerful. Inlining recursive function calls is similar to loop unrolling. I don't know whether current compilers do that on a regular basis (they might). But it's risky to rely on the compiler, so I would avoid recursive implementations of trivial functions, like computing the factorial, if speed is important.

score 2 · Accepted Answer · answered Oct 06 '15 at 18:17

There are two real reasons that self-created stacks can be more efficient than the execution stack:

The execution stack is meant to handle a generalized case of calling new functions. That means it has a lot of overhead: it has to contain pointers to the preceding function, it has to contain pointers to values on the heap, and a number of other bookkeeping items. This may be more than you need for your specific calculation if your calculation is, indeed, specific. All the additional management decreases efficiency. In situations where the function is very heavy and there are relatively few calls, this is fine. In a situation where the function itself is simpler, but there are many function calls, the cost of overhead increases disproportionately.
A generalized stack hides a lot of details from you, preventing you from taking advantage of directly referencing a different part of the stack. For instance, the root of the stack is hidden from you. Lets say you're searching for a particular value in a large tree, using recursion. At some point you're a thousand nodes deep in the tree and you find the value. Success! But then you have to climb out of the tree one function at a time: meaning at least a thousand calls just to return the value. (*) Instead, if you've written your own stack you can return immediately. Or, suppose you have an algorithm that, at certain nodes in the tree, requires you to back up n stack frames before continuing execution. Using the generalized stack frame you are required to back out of those frames until you find the one you're looking for. If you designed the stack specifically for your algorithm, you can provide a mechanism to immediately jump to the point of execution in one instruction rather than n.

Thus, it can behoove you to write your own stack when you can either take advantage of throwing out parts of the generalized stack frame mechanism that you don't need but cost time, or the algorithm being written can take advantage of moving rapidly through the stack if it knows what it is doing (where a generalized stack 'protects' you from doing this by hiding it's abstraction). Remember that function calls are just a particular generalized abstraction for handling code: if for some reason they are adding a constraint that makes your code awkward, you can probably create a stripped down version that more directly addresses your need.

You might also create your own stack if the memory allotted to your stack is small compared to the number of times you must recurse, such as if you have a very large input domain or if you're running on specialized small-footprint hardware or a similar situation. Again, though, it depends on the algorithm you're running and how the generalized stack solution helps or hinders it.

(*) Tail recursion can often help, but because tail recursion is by definition only entering a stack frame one level deeper, I'm assuming you're talking about a situation where that is not strictly possible.

score 0 · Answer 4 · answered Oct 06 '15 at 13:15

Maybe using a self created stack was not necessarily recommended for performance reasons. One good reason I can think of is that the "regular" stack may be of fixed size (often 1MB), so for example sorting large amounts of data would cause a stack overflow.

How does using stack instead of recursion give better performance in C when recursion uses stack as well?

4 Answers4

Linked