memory layout of multithreaded process in C++

Question

I am a bit confused about how the stack and heap are arranged in multithreaded processes:

Each thread has its own private stack.
All threads share the heap
When the program dynamically creates a thread (e.g. new Thread() in Java), the object is allocated on the heap.

So does the heap contain the memory for the thread object, and does that mean the heap contains the stacks belonging to the threads?

I might have to add that there is not "the heap". Depending on the operating system, a process might as well use several heaps. Windows for example offers the HeapCreate API. — Jim Brissom, Sep 05 '10 at 22:44

score 4 · Accepted Answer · edited May 23 '17 at 10:29

It's deliberately vague, as we don't want to constrain the implementers of the threading software.

Each thread has its own private stack.

As each thread executes a set of functions independently of the others, it needs to store return addresses and so on; thus each thread needs its own stack.
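One way to observe this is to compare the address of a local variable in two threads that are alive at the same time. This is only a minimal sketch: the function name `threads_use_distinct_stacks` is made up for illustration, and it shows that the two locals live in different places, not how the runtime laid the stacks out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Returns true if two simultaneously live threads keep their locals at
// different addresses, i.e. each one is working in its own stack.
bool threads_use_distinct_stacks() {
    std::atomic<int> arrived{0};
    std::uintptr_t addr[2] = {0, 0};

    auto body = [&](int id) {
        int local = id;  // lives in this thread's own stack frame
        addr[id] = reinterpret_cast<std::uintptr_t>(&local);
        arrived.fetch_add(1);
        // Stay alive until the other thread has recorded its address too,
        // so both stacks exist at the same moment.
        while (arrived.load() < 2) std::this_thread::yield();
    };

    std::thread t0(body, 0);
    std::thread t1(body, 1);
    t0.join();
    t1.join();

    assert(addr[0] != 0 && addr[1] != 0);
    return addr[0] != addr[1];
}
```

Calling `threads_use_distinct_stacks()` from `main` should return true on any conforming implementation, since two live objects cannot share an address.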

All threads share the heap

That's the easiest way to implement it. This also means that all the threads share a common chunk of memory so that each thread can communicate with other threads simply by modifying memory.
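As a rough illustration of threads communicating simply by modifying shared memory, the sketch below lets several threads increment one heap-allocated counter. The function name `count_on_shared_heap` and the use of a mutex are my own choices for the example, not something the answer prescribes.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Every worker increments the same heap object; the mutex keeps the
// concurrent updates from racing with each other.
int count_on_shared_heap(int num_threads, int increments_per_thread) {
    assert(num_threads >= 0 && increments_per_thread >= 0);

    auto counter = std::make_unique<int>(0);  // a single object on the heap
    int* shared = counter.get();              // the same pointer is handed to every thread
    std::mutex guard;

    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back([shared, &guard, increments_per_thread] {
            for (int j = 0; j < increments_per_thread; ++j) {
                std::lock_guard<std::mutex> lock(guard);
                ++*shared;
            }
        });
    }
    for (auto& w : workers) w.join();
    return *counter;
}
```

With 4 threads and 1000 increments each, the function returns 4000, because all threads were writing to the same heap location.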

When the program dynamically creates a thread (e.g. new Thread() in Java), the object is allocated on the heap.

This is about the stack you mention in point 1. We need to reserve memory for it, so we allocate a chunk of the heap, give it to the thread, and say: use this chunk of memory to implement your stack. (I am not saying that it is done this way, but that is a simple technique for doing it.)

So does the heap contain the memory for the thread object, and does that mean the heap contains the stacks belonging to the threads?

In a single-threaded program there is room to implement the stack as chunks of the heap. The concept of stack and heap being separate and growing towards each other is just that: a concept. It is undefined how either is implemented, and there is no reason that we cannot implement the stack inside the heap. See this question for more information: stack growth direction

score 1 · Answer 2 · edited Jun 20 '20 at 09:12

Think of 'the stack' as a data structure like any other. It could be implemented in any number of ways.

Here is a description of the typical implementation of the stack in C and C++ programs before 2000 or so. Most still do it this way:

There is a contiguous range of memory addresses which is referred to as 'the stack'. Frequently, on systems that have a memory management unit with paging (for Intel this means the 80386 and anything newer), the pages of this range of memory addresses are not assigned to physical memory until they are used. Typically this contiguous range of addresses sits at the end of the address space.

There is a stack pointer that usually starts at the end of the memory region. When a new stack frame is created the stack pointer is decreased by the size of the frame. The CPU has instructions specifically designed for this operation. If a region of memory is accessed that has no physical memory of any kind assigned to it, the OS handles the page fault and finds some memory to assign to the now used page.

All local variables and function parameters that are not passed in registers find their way into a stack frame.

For multithreaded programs, this scheme doesn't work as is, since there is only one such region and every thread needs its own stack. So you typically allocate a region of memory using malloc or new and start a new thread with a call that takes a pointer to that region of memory and its size. If the new thread needs more stack space than you've allocated, all kinds of horrible things can occur, including the thread just stomping over some random memory that includes other variables allocated 'on the heap'.
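With POSIX threads, the call that takes a pointer and a size is `pthread_attr_setstack`. Below is a minimal sketch, assuming a POSIX system; the helper names and the 1 MiB size are arbitrary picks for illustration, and `posix_memalign` stands in for malloc only because the stack address has to be suitably aligned.

```cpp
#include <pthread.h>
#include <stdlib.h>   // posix_memalign, free
#include <unistd.h>   // sysconf
#include <cassert>
#include <cstddef>
#include <cstdint>

// Thread entry point: stores the address of one of its own locals.
static void* record_local_address(void* out) {
    int local = 0;
    *static_cast<std::uintptr_t*>(out) = reinterpret_cast<std::uintptr_t>(&local);
    return nullptr;
}

// Runs a thread on a heap-allocated stack and reports whether the thread's
// local variable really lived inside that block.
bool run_thread_on_heap_stack() {
    const long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    const std::size_t stack_size = 1u << 20;  // 1 MiB, an arbitrary size for this sketch

    void* stack = nullptr;  // page-aligned block obtained from the heap allocator
    if (posix_memalign(&stack, static_cast<std::size_t>(page), stack_size) != 0) return false;

    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) { free(stack); return false; }

    std::uintptr_t local_address = 0;
    pthread_t tid;
    const bool ran = pthread_attr_setstack(&attr, stack, stack_size) == 0 &&
                     pthread_create(&tid, &attr, record_local_address, &local_address) == 0 &&
                     pthread_join(tid, nullptr) == 0;
    pthread_attr_destroy(&attr);

    const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(stack);
    const bool inside = ran && local_address >= base && local_address < base + stack_size;
    free(stack);  // safe only after the thread has been joined
    return inside;
}
```

Note that nothing in this sketch protects the memory next to the block, which is exactly the stomping hazard described above if the thread outgrows its stack.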

But that is by no means the only way to implement a stack. You could, for example, implement a stack as a linked list with each node of the list being a stack frame. Languages that support a construct called 'continuations' frequently do this. In fact, they usually use a DAG, as a single stack frame may spawn multiple other stack frames that are all valid simultaneously.
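To make the linked-list idea concrete, here is a toy sketch in which every "call" is a heap-allocated frame that points at its caller. The `Frame` struct and `factorial_with_linked_frames` are invented for illustration; a real language runtime would store far more per frame, and this only shows the simple list case, not the DAG.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// One activation record; 'caller' links to the frame that "called" it.
struct Frame {
    std::uint64_t n;
    std::unique_ptr<Frame> caller;
};

std::uint64_t factorial_with_linked_frames(std::uint64_t n) {
    assert(n <= 20);  // 20! is the largest factorial that fits in 64 bits

    // "Call" phase: push one heap-allocated frame per nested call.
    std::unique_ptr<Frame> top;
    for (std::uint64_t k = n; k > 1; --k) {
        top = std::unique_ptr<Frame>(new Frame{k, std::move(top)});
    }

    // "Return" phase: pop frames one by one, combining their results.
    std::uint64_t result = 1;
    while (top) {
        result *= top->n;
        top = std::move(top->caller);  // frees the popped frame
    }
    return result;
}
```

For example, `factorial_with_linked_frames(5)` yields 120 without ever recursing on the machine stack.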

Another thing that could be done is something halfway between, in which your nodes are simply large regions of memory that each contain several stack frames. When a new frame would overrun the current node, another node is allocated under the covers.

Or, all local variables could be allocated with new or something like that and just destroyed when they went out of scope. The compiler could make this happen behind the scenes.

So, worrying about exactly where your stack is or how the memory is allocated underneath the hood, especially in a language like Java that doesn't even have pointers in the C or C++ sense, is kind of silly. It might even vary between different fully compliant JVMs.

I will say that generally pthreads in C++ implements the stack in the manner I describe for multithreaded programs in the last paragraph of the section in which I describe how C and C++ have historically worked. They usually also have a 'guard page', which is a purposely unmapped page at the beginning of the region allocated for the stack, so that programs that run out of stack space will usually SEGV. (Actually, this apparently is an oversimplification to the point of being wrong; see Ben Voigt's comment for the real use of the guard page.)
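If you want to see what guard size your pthreads implementation uses by default, `pthread_attr_getguardsize` reports it. This is a small sketch assuming a POSIX system; the wrapper name `default_guard_size` is hypothetical, and the value it returns is implementation-defined.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>

// Returns the guard size, in bytes, that a thread created with default
// attributes would be given, or 0 if the query fails.
std::size_t default_guard_size() {
    pthread_attr_t attr;
    const int rc = pthread_attr_init(&attr);
    assert(rc == 0);
    (void)rc;  // keeps builds with NDEBUG free of unused-variable warnings

    std::size_t guard = 0;
    if (pthread_attr_getguardsize(&attr, &guard) != 0) guard = 0;
    pthread_attr_destroy(&attr);
    return guard;
}
```

On many systems the result is one page, but that is something to check rather than rely on.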

The guard page is a little more complicated than that. Initially there is a large area of address space allocated, but just two pages of memory actually committed. The first is where the program begins executing, the second is flagged with the guard attribute. When the guard page is accessed, the guard signal/exception handler runs and commits the next page (and flags it guard) and the process repeats until the reserved memory range is exhausted, at which point a stack overflow handler is executed. — Ben Voigt, Sep 06 '10 at 00:39
@Ben Voigt: Then I misunderstood. Oops. Interesting, so a way to implement the behavior of the main thread stack for sub-threads so you can allocate stacks big enough for the worst case, but still have them small for the common case. — Omnifarious, Sep 06 '10 at 01:08

score 0 · Answer 3 · answered Sep 05 '10 at 22:48

Each stack is created on the heap; only a small amount of kernel code runs from the one "true" stack.

Luka Rahne

Likely not true. The thread stack is usually obtained using `VirtualAlloc` on Windows and `mmap` on Unix and Unix-like systems. – Ben Voigt Sep 06 '10 at 00:35
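For the Unix side of that comment, a stack can be mapped directly with `mmap` and given a guard page by hand with `mprotect`. The sketch below assumes a POSIX system where stacks grow downward; the function names and the 256-page size are illustrative choices, and it does not claim to mirror what any particular threading library does internally.

```cpp
#include <pthread.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>

// Thread entry point: just records that it ran.
static void* mark_ran(void* flag) {
    *static_cast<int*>(flag) = 1;
    return nullptr;
}

// Maps a private anonymous region, turns its lowest page into an
// inaccessible guard page, and runs a thread on the remainder.
bool run_thread_on_mmap_stack() {
    const long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    const std::size_t page = static_cast<std::size_t>(page_size);
    const std::size_t usable = 256 * page;    // stack proper (arbitrary size)
    const std::size_t total = usable + page;  // plus one guard page below it

    void* region = mmap(nullptr, total, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) return false;

    // Any access to the lowest page now faults instead of silently
    // overwriting whatever happens to be mapped below the stack.
    if (mprotect(region, page, PROT_NONE) != 0) { munmap(region, total); return false; }

    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) { munmap(region, total); return false; }

    char* stack_base = static_cast<char*>(region) + page;
    int ran = 0;
    pthread_t tid;
    const bool ok = pthread_attr_setstack(&attr, stack_base, usable) == 0 &&
                    pthread_create(&tid, &attr, mark_ran, &ran) == 0 &&
                    pthread_join(tid, nullptr) == 0;
    pthread_attr_destroy(&attr);
    munmap(region, total);  // safe only after the thread has been joined
    return ok && ran == 1;
}
```

Because the region comes straight from `mmap`, it never passes through malloc or new at all, which is the point of the comment.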
