
C is pass-by-value, which means that every argument is copied (into a register or onto the stack, depending on the calling convention) each time you call a function. C also does not support inner functions that can access or modify local variables of the lexically enclosing function, although GNU C does as an extension. The only way to 'pass by reference' in C is to pass a pointer. While C++ has reference types, these are typically implemented as pointers behind the scenes.

I thought about the following scenario in a hypothetical low-level C-like language. There could be a function declared inside another function, visible only to the outer function and called only by the outer function. If my understanding is correct, the stack pointer is adjusted by a fixed amount when the inner function is called, assuming no funny business such as variable-sized objects. Therefore as the thought process goes any local variables declared in the outer function would be at fixed offsets relative to the stack pointer even inside the inner function. They could in theory be accessed in the same way as other local variables without needing to pass a pointer. And of course, calling the inner function from outside the outer function would access the outer function's local variables after their lifetime has ended, resulting in undefined behavior.

Here is an illustration (again, this is not C but an imaginary C-like language):

void (*address_to_inner)(void);
void outer(void) {
    int a = 10;
    void inner(void) {
        printf("%d\n", a);
    }
    inner(); // Prints 10
    address_to_inner = inner; // Not undefined behavior yet but not good
}
void another(void) {
    outer(); // Prints 10
    address_to_inner(); // Undefined behavior because inner() tries to access `a` after it was deallocated
}

Does my thought process apply on typical architectures such as x86 or ARM? Some features, such as automatic memory management, are not supported in low-level languages such as C because architectures do not support them natively and they would require too much complex behind-the-scenes work to be feasible for a low-level language. But is my scenario 'feasible' for a low-level language, or are there architectural or technical limitations I did not consider that would make implementing it non-trivial, given how the stack, local variables, and function calls work? If it is not feasible, why not?

Peter Cordes
user16217248
  • Hm... in the case of JS closures at least, inner variables are no longer accessible when returned. Have you seen [this question](https://stackoverflow.com/q/18051012/) by any chance? – General Grievance Apr 18 '23 at 18:25
  • @GeneralGrievance Interesting. The languages with which I am familiar, such as Objective-C or Swift, seem to handle closures the way I described in the comment. In JS, what happens if I return a closure that accesses a local variable and call it? Edit: It seems to work as intended in JS. – user16217248 Apr 19 '23 at 00:03
  • @GeneralGrievance I mean this is *kind of* like a closure, but not really. – user16217248 Apr 19 '23 at 03:25
  • GNU C nested functions get interesting when you take the address of the function and pass that function pointer to something else. Then it has to make a trampoline of machine code (on the stack) and pass a pointer to that, which will set up a static chain pointer so the nested function can find its parent's stack frame even though it's *not* directly a child of it. See [Implementation of nested functions](https://stackoverflow.com/q/8179521). When it is called from its parent, IIRC it just inlines, even with optimization disabled. – Peter Cordes Apr 19 '23 at 04:18
  • @PeterCordes So GNU C goes as far as to make it work, instead of being undefined behavior. – user16217248 Apr 19 '23 at 15:00
  • The parent function local vars must still be alive during the call to the nested function via a pointer. The function pointer ceases to be valid when the parent function returns. So the caller still has to be a child of the parent in the call-tree, it just doesn't have to be a *direct* child. Your argument about var lifetimes is correct, and GCC doesn't do anything to work around it. – Peter Cordes Apr 19 '23 at 15:21
  • @PeterCordes So that aspect is still undefined behavior. But it works around the limitations of passing it as a pointer to another function which would scramble the offsets relative to the local vars in the outer function. – user16217248 Apr 19 '23 at 15:24

1 Answer

> Therefore as the thought process goes any local variables declared in the outer function would be at fixed offsets relative to the stack pointer even inside the inner function.

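This would work, but only if the inner function is always called directly from the outer function. You would have to preclude calls among nested functions, which also precludes recursion among them, because every additional frame changes the distance from the stack pointer to the outer function's locals. Let's note that Pascal nested functions can call each other, and recursively as well.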
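
To make the recursion problem concrete, here is a minimal sketch in standard C. The names `outer_probe`, `inner_probe`, `offset_at_depth`, and `MAX_DEPTH` are made up for this illustration. It records how far a local of each recursive activation sits from the outer function's `a`. On typical implementations each activation gets its own frame, so the recorded values differ and a single compile-time offset cannot work for every depth.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DEPTH 4

/* offset_at_depth[d]: address of a local in the activation at recursion
   depth d, minus the address of outer_probe()'s `a` (modular arithmetic on
   uintptr_t; the exact values are implementation-specific). */
static uintptr_t offset_at_depth[MAX_DEPTH];

static int inner_probe(const int *a, int depth) {
    volatile int marker = depth;  /* a local that lives in this activation */
    assert(depth >= 0 && depth < MAX_DEPTH);
    offset_at_depth[depth] = (uintptr_t)&marker - (uintptr_t)a;
    /* Recurse while there is room; `marker` is read afterwards, so it
       stays live across the recursive call. */
    int below = (depth + 1 < MAX_DEPTH) ? inner_probe(a, depth + 1) : 0;
    return marker + below;
}

int outer_probe(void) {
    int a = 10;                   /* the outer local the inner one refers to */
    return inner_probe(&a, 0);
}
```

After calling `outer_probe()`, the entries of `offset_at_depth` are all different, because the activations are alive at the same time and their locals must occupy distinct storage.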

Pascal's stack-based static link mechanism was appropriate for a time when machines had few CPU registers, so nearly all of a Pascal program's variables lived in memory.
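
A static link can be imitated by hand in standard C by passing a pointer to the parent's locals as a hidden first argument. The following is only a rough sketch of the idea, not how any particular Pascal compiler lays out its frames; `outer_frame`, `inner_linked`, and `outer_linked` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the part of the outer function's frame that
   the nested function needs to reach. */
struct outer_frame {
    int a;
};

/* The "nested" function: `link` plays the role of the static link. */
static void inner_linked(struct outer_frame *link, int depth) {
    assert(link != NULL);
    link->a += 1;                        /* reach the parent's local via the link */
    if (depth > 0)
        inner_linked(link, depth - 1);   /* recursion passes the same link along */
}

int outer_linked(int depth) {
    struct outer_frame frame = { 10 };   /* the outer function's local `a` */
    inner_linked(&frame, depth);
    return frame.a;                      /* 10 plus one per activation of inner_linked */
}
```

Because every activation receives the link explicitly, `outer_linked(3)` returns 14 no matter how deep the stack is when `a` is touched. The cost is an extra argument and a memory access through it.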

Another possibility, suitable for a machine with many CPU registers (x64, RISC-V, ARM), might be to do register allocation of local variables both within and across functions and their nested functions.  Thus, just as local variables may live in registers, non-local variables might as well.  This would also preclude recursion, but could improve performance.

Lastly, let's mention inlining, as it accomplishes much of the efficiency we expect from machines with large numbers of CPU registers by avoiding the use of memory for local variables. Once two functions are merged by inlining, they share the same local variable storage. Inlining can even remove the pairing of an address taken by the caller with a dereference in the callee, allowing such a local variable to be mapped to a register instead of memory, as would be needed if the address actually had to be materialized.  Because inlining is done by the compiler analyzing the functions, it is flexible in allowing one function to call another, including recursion, though certain constructs may disable some optimizations.
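
As a small illustration of the address-taken case, consider the following standard C sketch (`add_ten` and `caller` are names invented here). Whether the call is actually inlined depends on the compiler and optimization level, so treat the comments as describing what an optimizer is permitted to do, not what it is guaranteed to do.

```c
#include <assert.h>
#include <stddef.h>

/* Callee modifies the caller's local through a pointer. */
static inline void add_ten(int *p) {
    assert(p != NULL);
    *p += 10;
}

int caller(int x) {
    int local = x;     /* the address of `local` is taken on the next line */
    add_ten(&local);   /* if inlined, this becomes `local += 10;` and the
                          address never has to exist at run time */
    return local;      /* so `local` may live entirely in a register */
}
```

With optimization enabled, compilers such as GCC and Clang commonly reduce `caller` to a single add on the argument register, which is the effect described above.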

Erik Eidt