5

I can't think of how to title this effectively, which might be why I cannot find a good search for this. In understanding the heap vs. stack during the Rust programming book I read:

The types covered in the “Data Types” section are all stored on the stack and popped off the stack when their scope is over.

I read the more in-depth distinction, but I'm still curious how a stack is used when variables can be defined and used hundreds of lines later. Take for example:

let x = 5;
let y = 6;
println!("x = {}", x); // cannot just pop x off here as y was pushed last?

To clarify, is a stack being used for scope, but inside each scope there also has to be a heap to know where to look during runtime?

Does this mean that the stack is allocated on the heap as well? Or does the compiler keep these two completely separated?

Sorry if this is morphing into a question about compilers and memory management in general.

fn main() {                  // if this is all a stack
    let one = 1;             // stack.push(one)
    let two = 2;             // stack.push(two) 

    let three = one + two;   // stack.push(stack.pop() + stack.pop())???
}

Does this make sense? I'm coming from Python so bear with me.

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
Tony
  • 1,318
  • 1
  • 14
  • 36
  • There's zero heap allocation in your last code block. Both values are declared on the stack. – Shepmaster Mar 28 '18 at 00:15
  • Im super confused then how the compiler knows what the value of y or x is because it cannot just pop off the last variable confidently? Wouldn't it still have to use a lookup table then at some point? – Tony Mar 28 '18 at 00:20
  • In my answer, I state that *all* local variables inside of a function are popped all at once at exactly one time: the end of the function. Individual variables are *never* popped onto or off of the stack. Could you suggest a way of wording that more clearly? – Shepmaster Mar 28 '18 at 00:29
  • Im lacking any skills in this area to even articulate my question, hopefully that example helps. Ultimately if everything is a stack, then how do you know what a variable's value is that is deep in the stack. – Tony Mar 28 '18 at 00:38

1 Answers1

9

That quote is playing a little fast and loose with the truth because it's only intended as a first estimation. A different quote from the same page is more accurate (emphasis mine):

When our code calls a function, the values passed into the function (including, potentially, pointers to data on the heap) and the function’s local variables get pushed onto the stack. When the function is over, those values get popped off the stack.

The "stack" (data structure) you may be familiar with is related to but quite different from the "stack" (function-calling) that is referred to here.

In a stack (data structure) you only access the value at the top of the stack. With the stack (function-calling), you reserve/return space from the top but you can access any memory in the stack. This could be any of the variables in the same function or even the variables from the calling function. The hardware doesn't care about these details; it is up to the language to ensure that items in the stack are only accessed when it is valid to do so.

It's important to recognize that the values of the variables are not being pushed onto the stack; the stack only corresponds to memory to store the variables.

The modification to the size of the stack only happens at function entry and exit. It's also done as one giant block; the stack pointer is incremented and decremented by the total size of all the function's local variables, regardless of their scope. Individual variables are never popped onto or off of the stack.

Rust's scoping rules prevent you from using the values that have been moved, but in actuality they are still sitting there on the stack. You can access them using unsafe code (but you shouldn't):

struct NonCopyNumber(i32);

fn example() {
    // We allocate space on the stack for all local variable
    // when we enter the function. There's 4 in this example.

    let value1 = NonCopyNumber(1); 

    let raw_ref1 = &value1.0 as *const i32;
    let raw_ref2;

    {
        let value2 = NonCopyNumber(2); 
        raw_ref2 = &value2.0 as *const i32;
    }

    drop(value1);

    // println!("{}", value1.0); // use of moved value: `value1.0`
    // println!("{}", value2.0); // Can't find value

    // Not really safe; assumes details about stack management.
    unsafe {
        println!("{}", *raw_ref1); // 1
        println!("{}", *raw_ref2); // 2
    }

    // Stack is popped
}

do inner set of braces create a separate stack frame

No, stack frames are only created at function entry

stack frame (aka scope)

Stack frames are not the same as scopes. Entering a function creates a new scope and introduces a stack frame. A pair of curly braces creates a new scope but does not introduce a stack frame .

does value2 gets popped upon reaching the closing inner brace?

No, stack frames are only popped at function exit.


Let's use this concrete example:

fn main() {
    let one: i32 = 1;
    let two: i32 = 2; 

    let three: i32 = one + two;
}

I'll annotate the stack with the stack pointer variable %sp. The stack pointer represents the current top of the stack.

fn main() {
    // This function has 3 `i32` variables. 
    // Each is 4 bytes so the function requires 12 
    // bytes of stack space total.

    let one;   // Stored in memory at %sp + 0 bytes
    let two;   // Stored in memory at %sp + 4 bytes
    let three; // Stored in memory at %sp + 8 bytes

    %sp += 12; // Increment / push the stack

    one = 1;
    two = 2; 

    three = one + two;

    // Done with all our variables.

    %sp -= 12; // decrement / pop the stack 
}

Space for all variables is reserved at the beginning of the function, then all the space is returned to the stack at the end of the function, all at once in both cases.

It may be worth noting that we could have added lots of extra braces and still arrived at the same stack-annotated result:

fn main() {
    let one: i32 = 1;
    {
        let two: i32 = 2;
        {
            let three: i32 = one + two;
        }
    }
}

It is only the semantics of the language that prevent us from using a variable after it goes out of scope or before it has been initialized.


Does this mean that the stack is allocated on the heap as well?

This is a very difficult question to answer succinctly. In the computer 1, you only have one chunk of memory: those RAM chips plugged into your motherboard. All of your program's data is ultimately stored here, whether it's "the stack" or "the heap". The big difference between the two is their access patterns.

As shown above, the stack is very lightweight and performant but inflexible. You can reserve more space in the stack (calling a function) and return that space to the stack (leave the function) and that's about it.

The heap is more flexible, but is slower, more complicated and requires additional bookkeeping.

Generally, the stack and heap grow toward each other in memory. One starts at zero and grows upward, the other starts at MAX and grows downward.

Then you get into details like threads. These carve off a piece of memory, perhaps from the heap, and then use that memory for their own stack. So, technically, yes: sometimes a stack is inside a heap.

Making it more complicated, there can actually be multiple heaps, each managing memory according to their own set of rules.

1 Not all computers, but the vast majority that people program for day-to-day. There's also different levels of memory for caches and whatnot, but those don't come into play here.


println!("x = {}", x); // cannot just pop x off here as y was pushed last?

Using a variable in println doesn't move it, so even if your original premise were true, it wouldn't come into play here. See Does println! borrow or own the variable?

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
  • Thanks, thats a helpful example for understanding the stack. I think I was confusing how the compiler knows where to find the variables versus the stack as just a tool for avoiding scope errors. – Tony Mar 28 '18 at 00:05
  • If you have time, can you see my edit for understanding how the compiler knows what `ref1` and `ref2` are pointing to as they go in and out of scope? – Tony Mar 28 '18 at 00:13
  • 1
    I think it would be worth stating upfront that "stack" in this context is quite different from the *stack data-structure* (related, but different). This seems to be the root of the OP's confusion. – Matthieu M. Mar 28 '18 at 07:23
  • 1
    The documentation says that the stack is popped when the scope is over, but in your answer you imply that happens only when the _function_ scope is over. In the first example, do inner set of braces create a separate stack frame (aka scope) and does `value2` gets popped upon reaching the closing _inner_ brace? – Logan Reed Mar 28 '18 at 13:53
  • @LoganReed could you help me reword the first part of my answer to be more clear, as I thought that already answered your point: "That quote is playing a little fast and loose as a first estimation. A different quote from the same page is more precise". A less-charitable way of saying it is "the quote in the OP is a lie". I don't *imply* that it's when the function is over, I state it, as fact, multiple times. – Shepmaster Mar 28 '18 at 13:56
  • @Shepmaster - No, your answer is perfect. It is even emphasized in bold. I just got lost in the dark forest of thoughts in my head :-) (and I read your answer twice! before asking the question). P.S.: Unsafe code can probably reach values in the popped stack as well -- so I kept thinking about that too, which confused me a bit more. – Logan Reed Mar 28 '18 at 13:59
  • "The modification to the size of the stack only happens at function entry and exit." I've been wondering: has it ever been considered to arrange variables on the stack according the ends of their "actual" scopes and to pop each one once it is no longer needed? This might save some memory in some deeply recursive calls. – Alexey Apr 15 '20 at 08:52
  • @Alexey not exactly, as far as I know. I think the keywords that you are interested in will be "guaranteed tail recursion" / "tail call optimization". See [When is tail recursion guaranteed in Rust?](https://stackoverflow.com/q/59257543/155423) for a start. I'm pretty sure that modifying the stack size is a part of the ABI of a function, so it might not be easily doable. You can do it *yourself* by declaring some mutable values in an outer function and passing references to them to an inner recursive function. This is a common pattern anyway. – Shepmaster Apr 15 '20 at 13:40