0

I am new to C and C++. As I understand it, whenever a function is called, its local variables get memory allocated on the stack. That includes the case where the variable is a pointer to data allocated on the heap via malloc or new (although I have heard that storage allocated by malloc is not guaranteed to be on the heap; please correct me if I am wrong). For example:

void fn() {
    Member *p = new Member();
}
 

Or

void fn() {
    int *p = (int*) malloc( sizeof(int) * 10 );
}

Please correct me if I am wrong: in both cases, the variable p (which holds the address of the object allocated on the heap) is on the stack, and it points to the object on the heap. So is it correct to say that all the variables we declare are on the stack, even though they might point to something on the heap? Let's say the local pointer variable p is stored at memory address 001 and holds the address of the Member object located on the heap, which is 002. We can draw a diagram like this: [image]

If that is correct, my next question is: can we have a pointer that is actually located on the heap and points to a variable located on the stack? If that is not possible, can that pointer point to a variable located on the heap? Maybe another way to phrase this question is: in order to access something on the heap, can we only access it via pointers on the stack? A possible diagram could look like this:

If that is possible, can I have an example?

[image]

Joji
  • C++ is specified to not need stacks or heaps. On systems using stacks and heaps an automatic variable will be in automatic storage and the automatic storage will be a stack unless the implementer is insane or solving a very interesting problem. If it is in memory at all. – user4581301 Jan 16 '22 at 07:32
  • @user4581301 And the same is true for C as well. (Both are tagged...) – user17732522 Jan 16 '22 at 07:33
  • Hi can I ask what do you mean by C++ is specified to not need stacks or heaps? So the memory space is not even divided into the stack and the heap? @user4581301 – Joji Jan 16 '22 at 07:33
  • @Joji -- There is no mention in the C++ standard of where variables are placed. They could be in registers. – PaulMcKenzie Jan 16 '22 at 07:36
  • Some good reading: https://en.cppreference.com/w/cpp/language/storage_duration ([C version very similar](https://en.cppreference.com/w/c/language/storage_duration)) and [Why are the terms "automatic" and "dynamic" preferred over the terms "stack" and "heap" in C++ memory management?](https://stackoverflow.com/questions/9181782/why-are-the-terms-automatic-and-dynamic-preferred-over-the-terms-stack-and) – user4581301 Jan 16 '22 at 07:36
  • @Joji - The standard calls it "automatic storage" and "dynamic storage", but doesn't specify how that is to be implemented. Using a stack and a heap is one obvious choice, but the language also allows for non-obvious implementations on odd hardware. Other than that, you seem to have understood how it works. – BoP Jan 16 '22 at 07:36
  • In general C and C++ are specified to be as hardware agnostic as reasonably possible. Think of code as a description of behaviour, not operations to be executed. The compiler will take the coded description and produce the appropriate instructions for the given hardware, and the more time you let it spend optimizing the less the output will likely look like the code as the compiler juggles stuff around for optimal size, speed, resource consumption, or whatever optimization parameters you specify. – user4581301 Jan 16 '22 at 07:46

4 Answers

4

Yes, you can put your pointer on the free store (heap) and have it point to a variable on the stack. The trick is to create a pointer to a pointer (int**):

int main()
{
    int i = 0; // int on the stack

    int** ip = new int*; // create an int* (int pointer) on the free store (heap)

    // ip (the int**) is still on the stack

    *ip = &i;
    // Now your free store (heap) located pointer points
    // to your stack based variable i

    delete ip; // clean up
}

NOTE: The terms "heap" and "stack" are general, well-understood computing terms. The C++ Standard refers to the heap as the "free store". A "stack" is not directly named, but it is clearly implied (e.g. through references to "stack unwinding") and therefore required.

Galik
  • Can I ask whether my understanding that "local variables, including pointers, have to be on the stack first" and my diagram A are correct? – Joji Jan 16 '22 at 07:49
  • @Joji Your diagram `A` is correct for your code, but your code is not the only way to create pointers. They do not have to have a corresponding variable on the stack - it could be on the heap, or in the static memory allocation area. There is no restriction as to where the variable holding the pointer must be stored. – Galik Jan 16 '22 at 07:52
  • "In C++ they are referred to in the Standard as the "free store"": The standard doesn't really use that term. It seems to appear only once in reference to the C standard and once as a section title, which is actually about `operator new`/`operator delete`. – user17732522 Jan 16 '22 at 07:55
  • @Galik Can I ask if it is correct and fair to say that in order to access something in heap, we can only access it via pointers on the **stack**? – Joji Jan 16 '22 at 07:57
  • @Joji No. You must access it from a pointer (or potentially a reference!), but that pointer can be anywhere. – Galik Jan 16 '22 at 08:00
  • @user The standard explains the Free Store in terms of new and delete. They didn't have to give it a name but they did. Personally I am happier using the more universal term "heap" which describes exactly the same thing. – Galik Jan 16 '22 at 08:08
2

Stack and heap are not specifically defined by the standard. Those are implementation details.

Heap refers to a data structure that many operating systems use to help them safely manage the allocated space for different programs running at the same time.

Here is a diagram for a simple heap so that you can have a mental model of it: [image]

Keep in mind that this is not exactly what operating systems use. In fact, operating systems use a far more advanced form of the heap data structure that allows them to perform many sorts of complex memory-related tasks. Also, not every OS implements the free store using the heap data structure. Some may use different techniques.

Whereas a stack is much simpler:

[image]

can we have a pointer that is actually located on the heap, and it points to a variable located on Stack?

Yes, it's possible but rarely needed:

#include <iostream>

int main( )
{
    int a_variable_on_stack { 5 };
    int** ptr_on_stack { new int*( &a_variable_on_stack ) };

    std::cout << "address of `a_variable_on_stack`: " << &a_variable_on_stack << '\n'
              << "address of ptr on the heap: " << ptr_on_stack << '\n'
              << "value of ptr on the heap: " << *ptr_on_stack << '\n';

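    std::cin.get( ); // wait for Enter so the output stays visible

    delete ptr_on_stack; // release the heap-allocated int*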
}

Possible output:

address of `a_variable_on_stack`: 0x47eb5ffd2c
address of ptr on the heap: 0x1de33cc3810
value of ptr on the heap: 0x47eb5ffd2c

Notice how the address of a_variable_on_stack and value of ptr stored on heap are both 0x47eb5ffd2c. In other words, a pointer on the heap is holding the address of a variable that is on the stack.

digito_evo
  • Can I ask if the Diagram A is correct? and is it correct and fair to say that in order to access something in heap, we can only access it via pointers on the **stack**? – Joji Jan 16 '22 at 07:58
  • @Joji Your diagram is correct. And your guess is right. – digito_evo Jan 16 '22 at 07:58
  • @Joji wrote, *"is it correct and fair to say that in order to access something in heap, we can only access it via pointers on the stack?"* No, that is not correct. We often have pointers in the heap that point to other objects in the heap. A couple examples are linked lists, and binary search trees. – user3386109 Jan 16 '22 at 08:30
  • @user3386109 So how do you get access to that first pointer that points to all other pointers stored on the heap?? You can't have access to them if that single pointer is lost. – digito_evo Jan 16 '22 at 08:32
  • You are confusing the memory allocation term "heap" with the data structure called "heap". They are two completely unrelated terms. There is no reason to expect the "heap" (random memory store) to be implemented as a "heap" (data structure). – Galik Jan 16 '22 at 08:34
  • @Galik Yes right. I added a disclaimer for that. – digito_evo Jan 16 '22 at 08:44
  • Also worth noting that modern memory management is complex. Too complex for the simple Stack-then-Heap layout in the diagrams. A quick example is a multithreaded program where there are multiple stacks. – user4581301 Jan 16 '22 at 22:32
1

I am new to C and C++.

Your question is not C or C++ specific, but it is about programming languages in general.

... whenever a function is called, its variables get memory allocated on the stack ...

This is correct: Nearly all compilers do it this way.

However, there are exceptions - for example on SPARC or TriCore CPUs, which have a special feature...

... allocated on the heap via malloc ...

malloc never allocates memory on the stack but on the heap.

... is not guaranteed that the storage allocated by malloc is 100% on the heap ...

Unlike the word "stack", the meaning of the word "heap" differs a bit from situation to situation.

In some cases, the word "heap" is used to specify a certain memory area that is used by malloc and new.

If there is not enough memory in that memory area, malloc (or new) asks the operating system for memory in a different memory area.

However, other people would also call that memory area "heap".

... in both cases, variable p is on the stack, and it points to the object on the heap.

This is correct.

... can we have a pointer that is actually located on the heap, and it points to a variable located on Stack?

Sure:

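#include <stdlib.h> /* malloc */

int ** allocatedMemory; /* global variable: neither on the stack nor on the heap */

void myFunction()
{
    int variableOnStack;
    /* The int* that allocatedMemory points to lives on the heap ... */
    allocatedMemory = (int **)malloc(sizeof(int *));
    /* ... and it holds the address of a variable on the stack. */
    *allocatedMemory = &variableOnStack;
    /* ... */
}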

The variable allocatedMemory points to some data on the heap and that data is a pointer to a variable (variableOnStack) on the stack.

However, when the function myFunction() returns, the variable variableOnStack no longer exists. Let's say the function otherFunction() is called after myFunction():

void otherFunction()
{
    int a;
    int b;
    /* ... */
}

Now we don't know if *allocatedMemory points to a, to b or even the "return address" because we don't know which of the two variables is stored at the same address as variableOnStack.

Bad things may happen if we write to **allocatedMemory now...

In order to access something in heap, we can only access it via pointers on the stack??

... diagram "B" ...

To access some data on the heap, you definitely need some pointer that is not stored on the heap.

This pointer can be:

  • A global or static variable
    In my example above, allocatedMemory is a global variable.
    Global and static variables are stored in a completely different memory area (neither heap nor stack).
  • A local variable on the stack
  • A local variable in a CPU register
    (I already wrote that local variables are not always stored on the stack)

Theoretically, the situation in diagram "B" is possible: simply overwrite the variable allocatedMemory with NULL (or another pointer).

However, a program cannot directly access data on the heap.

This means that p* (which is some data on the heap) can no longer be accessed once there is no pointer "outside" the heap that points to p*.

Martin Rosenau
  • "new never allocate memory on the stack": Compilers are explicitly permitted to provide storage for a `new` by other means than dynamic allocation (e.g. on the stack) as long as the storage duration will not be reduced (e.g. because it can deduce the corresponding `delete`). And then there is the as-if rule as well. – user17732522 Jan 16 '22 at 11:40
  • @user17732522 is there a concrete example where we have a `new` or malloc to allocate the storage but it turns out the storage is not on heap via some optimization of the compiler? – Joji Jan 17 '22 at 00:50
  • @Joji https://godbolt.org/z/9b4E8cW5v: Without optimizations enabled the `int` for `new int` is dynamically allocated with the `operator new`/`operator delete` replacement I specified, which prints some debugging information. But with optimizations enabled, there is no allocation and no output. You can also look at the assembly to see that there is no call to `malloc` or such. – user17732522 Jan 17 '22 at 04:30
  • In this example the allocation is completely elided, not just replaced with a stack allocation. Getting an example for the latter is more difficult. I know that it is done for co-routine frame allocations, but that is not easy to explain. I am not sure whether compilers currently otherwise make use of this aside from eliding allocations completely. – user17732522 Jan 17 '22 at 04:31
  • @user17732522 I just tried with GCC 9.3.0 under Ubuntu. I tried the optimization settings `-O3`, `-O3 -Ofast`, `-O3 -Os`, `-O9`, `-O9 -Ofast`: The sequence `int * a = new int[10]; a[5] = 6; delete[] a;` always resulted in a call to functions in `libstdc++` that call `malloc()` and `free()`. – Martin Rosenau Jan 17 '22 at 07:09
  • @MartinRosenau GCC seems to implement this optimization only since GCC 11: https://godbolt.org/z/93zzvh6Tx – user17732522 Jan 17 '22 at 07:13
  • @user17732522 What example are you talking about? The example after "Sure:"? Depending on what is written at `...` (in the line after `*allocatedMemory = &variableOnStack;`), it is not possible to optimize that example at all. Let's say the next two lines are `mySecondFunction(); printf("%d\n", variableOnStack);`. If `mySecondFunction()` is in another C/C++ file, the compiler must assume that that function contains the line `**allocatedMemory = value;`. And the memory allocated cannot be `free()`d inside the function because it may be used after the function call! – Martin Rosenau Jan 17 '22 at 07:15
  • @MartinRosenau I think there is a misunderstanding. I was just giving an example unrelated to your answer to Joji's question in the comment. See the compiler explorer link at the beginning of my comment. The second comment was supposed to be a continuation of the previous one. I could have formatted that better. – user17732522 Jan 17 '22 at 07:17
1

In short:

Variables declared within a function are allocated on the stack and can point to whatever you want (to the addresses of other variables on the stack or of variables on the heap).

The same goes for variables allocated on the heap. They can point to the addresses of other variables on the heap or of variables on the stack. There is no limitation here.

However, variables declared on the stack are by nature temporary: when the function returns, their memory is reclaimed. Therefore it is not good practice to keep pointers to variables on the stack unless you know the function has not finished yet (i.e. the address of a local variable is used within the same function or by functions called from it). A common mistake of novice C/C++ developers is to return the address of a variable declared on the stack. When the function returns, this memory is reclaimed and will soon be reused for other function calls, so accessing this address is undefined behavior.

Eliyahu Machluf