5

In C, as many of you know, the stack is where all local variables reside. Because the stack is a first-in, last-out data structure, you can only access what has most recently been pushed onto it. So given the following code:

int k = 5;
int j = 3;
short int i;

if (k > j) i = 1;

Obviously this is useless code with no real meaning, but I'm trying to wrap my head around something.

For the short int i declaration I'm assuming 2 bytes get allocated on the stack. For int k and int j, 4 bytes each get allocated, holding the values 5 and 3. So the stack would look as follows:

----------   <- stack pointer
int i
----------
int k = 5
----------
int j = 3
----------

So for the if statement you would have to pop the int i to get to the conditions k and j, and if so, where does int i go? This all seems very time-consuming and tedious if this is the way that C does local variables.

So is this actually how C does it or am I mucking it all up?

Anthony
  • Actually, your layout of the items on the stack is wrong, as i is a short, not an int. Also, on most platforms the stack grows downward in memory, not upward, and the order in which locals are placed is up to the compiler. Regardless, a value on the stack is NOT popped; rather, an offset from the stack pointer is used, so referring to 'k' results in "read word (some register) from sp[offset to k]", and similarly for 'j'. And 'i' is set as "write halfword sp[offset to i] from (lower half of a register holding 1)". In general you can think of the stack as a long array with partitions. – user3629249 Nov 13 '14 at 05:34

4 Answers

8

The stack is not a stack. It's still random access memory, meaning you can access any location in constant time. The only purpose of the stack discipline is to give every function call its own, private memory area that the function can be sure is not used by anyone else.

Kerrek SB
  • Indeed, if you look at the disassembly of a C function, you won't find push and pop for local variables, you'll find a constant offset from the base stack pointer - the variables are at a very predictable place in addition to being in random access memory. The instruction for loading `k` in the OP's example, for example, would look like `mov eax, [ebp-4]` - the stack pointer itself, being variable within a function, is not even actually used for locals at all. – Adam D. Ruppe Nov 12 '14 at 22:46
  • It appears multiple sources disagree with you; here is one of them: http://gribblelab.org/CBootcamp/7_Memory_Stack_vs_Heap.html Note "the stack is a "FILO" (first in, last out) data structure, that is managed and optimized by the CPU quite closely." I should point out I am not saying it uses the hardware stack pointer, but it is in fact a FILO stack. – Anthony Nov 12 '14 at 22:47
  • It might help to look at the generated code: run `gcc -c foo.c` then `objdump -d foo.o` (optionally adding `-M intel` to objdump if you're like me and hate the AT&T syntax) and you can see what's actually generated. Though I guess that doesn't help if you can't read assembly language... but the short of it is that the call instruction pushes a return address to the stack, then the function uses the memory next to that for locals, then returning pops all that stuff off again. So it is kinda like a stack of arrays rather than a pure stack. – Adam D. Ruppe Nov 12 '14 at 22:53
  • Re: looking at compiler output: see [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) and Matt Godbolt's CppCon2017 talk [“What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid”](https://youtu.be/bSkpMdDe4g4) which has a basic intro to reading x86 asm. – Peter Cordes Aug 11 '22 at 17:40
6

You're ever-so-slightly mucking it up.

Yes, local (auto) variables are typically stored on a stack. However, they're not popped off the stack when read; they're referenced by an offset from the stack pointer.

Take the following code:

x = y + z;

where each of x, y, and z are allocated on the stack. When the compiler generates the equivalent machine code, it will refer to each variable by an offset from a given register, sort of like:

mov -8(%ebp), %eax   
add -12(%ebp), %eax
mov %eax, -4(%ebp)

On x86 architectures, %ebp is the frame pointer; the stack is broken up into frames, where each frame contains function parameters (if any), the return address (that is, the address of the instruction following the function call), and local variables (if any). On the systems I'm familiar with, the stack grows "downwards" towards 0, and local variables are stored "below" the frame pointer (lower addresses), hence the negative offset. The code above assumes that x is at -4(%ebp), y is at -8(%ebp), and z is at -12(%ebp).

Everything will be popped off the stack[1] when the function returns, but not before.

EDIT

Please note that none of this is mandated by the C language definition. The language does not require the use of a runtime stack at all (although a compiler would be a bitch to implement without one). It merely defines the lifetime of auto variables as extending from entry into their enclosing block until execution of that block ends (for variable-length arrays, from their declaration). A stack makes that easy, but it's not required.


1. Well, the stack and frame pointers will be set to new values; the data will remain where it was, but that memory is now available for something else to use.
John Bode
  • Almost all compilers use a stack to allocate locals. Historically, some compilers targeting 8-bit micros without very usable stack random access had the option to make non-reentrant functions using fixed absolute addresses for locals (static storage but without the only-init-once semantic). And IBM's compiler for their mainframes did or does allocate from the heap (mentioned in [does the automatic local variable are stored in the stack in C?](https://stackoverflow.com/a/11513968), more dedicated googling would probably find more details). – Peter Cordes Aug 11 '22 at 17:44
  • Modern compilers for x86 *do* use the stack, but often they don't use EBP / RBP as a frame pointer. With optimization enabled, the default is not to do that. Having a fixed reference point makes things easy for humans, but unless you're doing variable-sized allocations (VLA / alloca), a compiler always knows the distance from ESP to any local on the stack. And `12(%esp)` is a valid addressing mode, unlike in 16-bit. [x86\_64 : is stack frame pointer almost useless?](https://stackoverflow.com/q/31417784) - yes – Peter Cordes Aug 11 '22 at 17:47
5

The call stack is a stack, but it's not used as you're imagining. Each time a call is made from a function, the return address (the program counter) is pushed onto the stack, along with the local variables. When each function returns, the stack is popped by what's called one "stack frame", which includes the variables. Within each function, the memory is treated as random access. The compiler, having generated the code that ordered the local variables on the stack, knows exactly what distance they are from the stack frame pointer, and so doesn't have to push and pop the individual local variables.

brycem
0

The top of the stack, on an Intel processor and many others, is referenced by an address stored in a CPU register, call it SP, which is copied to the base pointer, call it BP; many machine instructions allow an address expression composed of the current BP combined with a byte offset. So, in your example, i would be at offset 0, j at offset -2 and k at offset -6.

The if would simply resolve to a compare of the contents of addresses -6(BP) and -2(BP). The actual offset values may differ from implementation to implementation, but that's the general idea.

IRocks