7

I have a recursive function which can be written like:

void func(TypeName *dataStructure, LL_Node **accumulator) {
    func(datastructure->left, accumulator);
    func(datastructure->right, accumulator);

    {
        char buffer[1000];
        // do some stuff
    }

    return;
}        

I know that in reality the buffer is being allocated at the beginning of the function and putting the statement in a nested scope block doesn't actually use a new stack frame. But I don't want the compiler to allocate an exponential number of 1000-byte buffers at once, when they can be allocated and thrown away one at a time as each level returns.

Should I use outside global variables? A call to a helper function to force the buffer to be allocated after the recursive call? What I'm really fishing for here is advice on the cleanest, most C-idiomatic way of forcing this behavior.

Edit: One add-on question. If the exact same accumulator will be passed to every call of func, is it unheard of to leave the accumulator pointer in a global variable rather than pushing it onto the stack with every call?

Community
  • 1
  • 1
Rag
  • 6,405
  • 2
  • 31
  • 38

5 Answers5

4

Since it's only used by one call at a time, you can just preallocate it and pass it into all the calls via an operand:

void func(TypeName *dataStructure, LL_Node **accumulator, char *buffer) {
    func(datastructure->left, accumulator, buffer);
    func(datastructure->right, accumulator, buffer);

    {
        // do some stuff
    }

    return;
}  
Mysticial
  • 464,885
  • 45
  • 335
  • 332
  • OK that works, and leads perfectly into the edit I just added to my original question. Is it bad practice to pass the exact same buffer pointer through every level of recursion when you can leave the pointer in a global variable? – Rag Sep 27 '11 at 00:41
  • Actually, using globals is not a good idea. (especially if you have multiple threads) So passing in the buffer is the preferred method. – Mysticial Sep 27 '11 at 00:43
  • 4
    To add to Mystical's solution, if your `func` is exposed as part of the API of your module/application, then it would probably be preferable to maintain the original signature `void func(TypeName *dataStructure, LL_Node **accumulator)` and within that function declare a local `char buffer[10000]` and delegate to an internal `func_impl(dataStructure, accumulator, buffer)` to hide the implementation detail that is the temporary buffer space. Client code then has a simpler, cleaner API to deal with. – russw_uk Sep 27 '11 at 00:55
3

One option is to break the function into a non-recursive "public" function that sets up the buffer and calls a private recursive function that requires a buffer be passed in:

struct func_buffer {
    char buffer[1000];
};

static 
void func_private(TypeName *dataStructure, LL_Node **accumulator, struct func_buffer* buf)
{
    func_private(datastructure->left, accumulator, buf);
    func_private(datastructure->right, accumulator, buf);

    // do some stuff with *buf

    return;
}


void func(TypeName *dataStructure, LL_Node **accumulator) {
    struct func_buffer buffer;

    func_private( dataStructure, accumulator, &buffer);

    return;
} 

That way the users of the function don't need to be concerned with the details of how the memory used by the recursive part of the function is managed. So you could pretty easily change it to use a global or a dynamically allocated buffer if it became evident that such a change was necessary or would make sense.

Michael Burr
  • 333,147
  • 50
  • 533
  • 760
2

You can either pass the reference to the buffer or use global variable.

When you use the reference as in

void func(TypeName *dataStructure, LL_Node **accumulator, char buffer[]) {
    func(datastructure->left, accumulator, buffer);
    func(datastructure->right, accumulator, buffer);

    {
        char buffer[1000];
        // do some stuff
    }

    return;
}   

void main()
{
   char buffer[1000];

   func (structure, accum, buffer);
}

you are passing the reference, just the pointer to the beginning of the the array, so you have to remember its length.

If you choose to use a global variable, you are actually not using stack, but allocating program memory, a shared space where code and data coexists (code is data). Therefore, you are never using a single byte of extra ram in your calls if you do it like this:

char buffer[1000];

void func(TypeName *dataStructure, LL_Node **accumulator) {
        func(datastructure->left, accumulator);
        func(datastructure->right, accumulator);

    {

        // do some stuff
    }

    return;
}   

void main()
{

   func (structure, accum);
}

It is up to you to choose one. The second one pushes less parameters onto the stack on each recursive call, but enlarges the program size. The first one is more elegant to some, but is a bit slower, maybe not even noticeable.

Rag
  • 6,405
  • 2
  • 31
  • 38
Manuel Ferreria
  • 1,216
  • 1
  • 13
  • 23
  • That's the exact same code twice :) And are you saying that if you allocate a 10,000 byte global variable then the program executable itself will be 10k larger in size? A blank space is actually left on disk for it? What about if you put it in main()? – Rag Sep 27 '11 at 00:47
  • 1
    Woops, thanks a lot, got the wrong copy paste. Exactly, when this is translated into ASM, the global variable would be put in the `section .data`, which is a space reserved for variables. If you put in the main, a 10000 bytes buffer would be pushed onto it, increasing your stack size and reducing the max amount of recursion possible for your function. – Manuel Ferreria Sep 27 '11 at 01:07
  • If you think you would need a much larger buffer, the solution would be to allocate it in the heap, using a malloc with the necessary size, and passing the pointer, just like in my first version of the code. – Manuel Ferreria Sep 27 '11 at 01:08
2

I would personally allocate the buffer on the heap in this scenario somewhat like this:

void func(TypeName *dataStructure, LL_Node **accumulator, char *buffer ) {
    bool alloced = false;
    if( buffer == 0 ){
        buffer = (char*) malloc( 1000 );
        alloced = true;
    }
    func(datastructure->left, accumulator, buffer);
    func(datastructure->right, accumulator, buffer);
    {

        // do some stuff
    }
    if( alloced )
        free( buffer );
    return;
}  

If you don't mind C++ syntax, you could have buffer default to equal zero, or you could mangle the function name and add

#define func(a,b) __mangledfunc__(a,b,0)

That seems like it would be the easiest for your application.

Kaslai
  • 2,445
  • 17
  • 17
  • I like the idea of putting the buffer on the heap, but I think you've messed up the implementation. You're mallocing a 1000-byte block of memory for every recursive call when only a single one is needed, exactly what I was trying to avoid. – Rag Sep 27 '11 at 01:52
  • Well, I fixed my small oversight of omitting the additional buffer argument in the recursive calls, but the idea was there originally. If you expose only the macro call so that buffer is initialized with the value of 0 implicitly, then the function checks for that and if it is indeed zero then it mallocs a new buffer and passes it on to the future calls. The future calls get a nonzero value for buffer, so they don't malloc over the buffer and use the original one instead. – Kaslai Sep 27 '11 at 02:30
0

I believe your average compiler can optimize what are known as "tail recursive functions", where basically the last instruction in your function is a recursive call to that function. In that special case, the function will simply reuse the stack frame with every recursion (so all of your stack allocated variables don't get un/re-allocated!) If you can push all of your instructions before the recursive calls, and you have a decent compiler, it should work-- otherwise, I'd just pass it around as a reference variable.

Philip Guin
  • 1,452
  • 15
  • 21