-1

As far as I know, when a C program runs, a virtual memory space is created with a stack segment for local variables, a heap for dynamic allocation, a text segment for code, and a data segment for static and global variables. I do not understand why we have to separate memory into stack, heap, data segment and text segment. What creates the virtual memory, the operating system or the compiler? And as I understand it, a bare-metal embedded program runs in physical memory, so there will be no stack, heap or data segment there. Is this right?

trincot
  • This separation is orthogonal to virtual vs physical separation. I.e. these segments might exist in either physical or virtual memory. – Eugene Sh. Jun 24 '20 at 13:55
  • C11 draft standard n1570: *6.2.4 Storage durations of objects* can help answer your question. – EOF Jun 24 '20 at 13:58
  • Many of the previous similar questions seem to direct to [What and where are the stack and heap?](https://stackoverflow.com/questions/79923/what-and-where-are-the-stack-and-heap) but also [Where in memory are my variables stored in C?](https://stackoverflow.com/questions/14588767/where-in-memory-are-my-variables-stored-in-c) is perhaps easier to digest. – Weather Vane Jun 24 '20 at 14:03
  • *`I am starting with C and being confused with memory layout in C`* Leave it unanswered for now; it is too early. Learn the language first. As a starting point: C does not know anything about memory segments, nor about a stack or a heap. Those are purely implementation-related things and completely unnecessary for beginners. – 0___________ Jun 24 '20 at 14:04
  • From an embedded point of view, this might be more helpful: [What resides in the different memory types of a microcontroller?](https://electronics.stackexchange.com/questions/237740/what-resides-in-the-different-memory-types-of-a-microcontroller) – Lundin Jun 24 '20 at 14:28
  • I know that the stack segment holds a function's variables and that the heap holds the memory returned by malloc(), calloc(), ... – Tiến Thành Nguyễn Jun 24 '20 at 14:55
  • If you want to learn about **memory in C**, follow this [link](https://craftofcoding.wordpress.com/2015/12/07/memory-in-c-the-stack-the-heap-and-static/) or do what @EOF suggested. – Shubham Jun 24 '20 at 14:59
  • @Weather Vane, I have read the links you gave, but they address something different from my question. My question is about why we need to create stack, heap, data, ... segments. As far as I know, material for learning C rarely mentions them, so I think the C language itself has no concept of memory segments (heap, stack, ...) and that this may be related to the OS or the compiler. I am not sure, so it would be kind if someone could explain this. – Tiến Thành Nguyễn Jun 24 '20 at 15:25
  • The terms *stack* and *heap* etc are not mentioned in the C standard. So you don't ***need*** them, but they are a common way to implement the functionality of C. – Weather Vane Jun 24 '20 at 15:45
  • ...also whether you are running on "bare metal" has nothing to do with it. – Weather Vane Jun 24 '20 at 15:57
  • @Eugene Sh I think these segments (heap, stack, ...) are only a concept of virtual memory, and that in physical memory (RAM, ROM, flash, ...) there is no concept of stack, heap, data or code segments. – Tiến Thành Nguyễn Jun 24 '20 at 16:20
  • @TiếnThànhNguyễn you are quite wrong about that. – Weather Vane Jun 24 '20 at 16:31

3 Answers

2

Stack, heap, data and text are located in physical memory, not distinct from it. Memory is allocated for different purposes with different behaviour in terms of scope and persistence, and to facilitate that the linker segments (or divides up) the memory for different purposes.

In many embedded systems, the code (text segment) and constant data reside in ROM which is physically different from RAM. The linker needs to know where that ROM space is located in the memory map.

The stack is temporary space used for local data storage, function parameters and return call/function addresses. It is continuously used and reused as functions are called and variables go in and out of scope.

The heap is used for dynamic memory allocation through functions such as malloc() / free(). It is the memory that is allocated from at run time, rather than being statically allocated or automatically allocated on the stack. Heap allocations persist until they are explicitly returned to the heap, rather than having "scope" and being automatically instantiated and destroyed.

The data segment is where statically allocated data resides, that is, static and global data. Objects in this memory are instantiated at program start and persist for as long as the code is executing.

In practice there are generally two segments for static data: data and bss. data is for data explicitly initialised to non-zero values. It lives in read/write memory, but the initialiser values for this memory are stored with text. When the program starts, the start-up code that runs before main() copies the initial values to the allocated RAM segment. The bss segment is simply initialised to zero, the default initial value for static data.

So:

  • bss and data must be distinct spaces to facilitate efficient initialisation.
  • text must be distinct because it is either located and executed in place in ROM or, in systems where it is loaded into RAM, it is loaded most efficiently by copying a contiguous block of code to its run-time location.
  • heap is a run-time pool of memory. It is certainly possible to distribute the heap across non-contiguous memory, but in the simple case it will generally be a single contiguous block.
  • The stack concept is an artefact of how (most) microprocessors work at the machine level, so it is a natural model for a compiled language. The stack segment itself is the call/data stack used in the main() thread. Some processors switch to a separate stack for interrupt handling (some don't). If multi-threading is used, typically each thread has its own stack. These thread stacks may be instantiated dynamically from the heap or statically allocated in bss for example.

The point is that C code is compiled to object code and then linked to form the final binary executable. The linker is responsible for locating code and data so requires a memory map to know what to put where. The stack must be contiguous because that is how the machine works and it is required for local automatically created and destroyed data.

Clifford
  • I think all of us here know what the stack, heap, data and code segments are used for, and I can read about them via Google. But in my question I asked "why" we have to separate physical memory into many segments (stack, heap, ...). In your answer you said: "Memory is allocated for different purposes with different behaviour in terms of scope and ... for different purposes". I think this is close to my problem. Could you explain it in more detail? And is it the operating system that creates the heap, stack, ... segments? – Tiến Thành Nguyễn Jun 24 '20 at 15:15
  • @TiếnThànhNguyễn : your question referred to baremetal, and you changed the question significantly after I answered. You are now conflating the issue with talk of virtual memory. Not all systems, and certainly not many bare metal systems use virtual memory. Many microcontrollers even lack an MMU in any case. The question is now all about the execution environment and little to do specifically with C. As such your question is now unclear and lacking focus. I shall not add to the answer. I suggest you start a new question rather than changing this one. – Clifford Jun 24 '20 at 16:24
  • I found that the OS (more precisely, the kernel) creates the virtual memory from physical memory. I have added a new question about the reason for creating virtual memory here: (https://stackoverflow.com/questions/62566970/why-os-create-the-virtual-memory-for-running-process) – Tiến Thành Nguyễn Jun 25 '20 at 02:37
2

Let's put things and technical terms in their correct context. Stack, heap, text, etc. are parts of the process structure, that is, the memory layout of a process, and not a "memory layout in C" as you mentioned. Many people, engineers included, confuse a process with a program, so I will try to explain the difference in my answer below.

Now what is a process ?

A process is an instance of an executing program. A program, on the other hand, is a file containing a range of information that describes how to construct a process at run time. This information includes the following:

Binary format identification: Each program file includes metainformation describing the format of the executable file. Two widely used formats for UNIX executable files were the original a.out ("assembler output") format and the later, more sophisticated COFF (Common Object File Format); most modern UNIX-like systems, including Linux, now use ELF (Executable and Linking Format).

Machine-language instructions: These encode the algorithm of the program.

Program entry-point address: This identifies the location of the instruction at which execution of the program should commence.

Data: The program file contains values used to initialize variables and also literal constants used by the program (e.g., strings).

Other information: The program file contains various other information that describes how to construct a process including (Symbol and relocation tables, Shared-library and dynamic-linking information and more).

A process is an abstract entity, defined by the kernel, to which system resources are allocated in order to execute a program. From the kernel's point of view, a process consists of user-space memory containing program code and the variables used by that code, and a range of kernel data structures that maintain information about the state of the process. The information recorded in the kernel data structures includes various identifier numbers (IDs) associated with the process, virtual memory tables and more.

Memory Layout of a Process

Let's start with the process memory layout figure here:

x-----------------------------------x
x  Kernel data (not accessible to   x
x  the program)                     x
x-----------------------------------x
x program environment variables     x
x-----------------------------------x
x          STACK                    x
x       grows downwards             x
x-----------------------------------x
x                                   x
x        Unallocated Memory         x    
x                                   x
x                                   x
x                                   x
x-----------------------------------x
x                                   x
x        ^                          x
x        ^       HEAP               x
x        | grows upwards            x
x-----------------------------------x
x               BSS                 x
x-----------------------------------x
x       Initialized data            x
x-----------------------------------x
x           Text                    x
x    (the C code in our case)       x 
x-----------------------------------x
x                                   x
x-----------------------------------x
             

The memory allocated to each process is composed of a number of parts, usually referred to as segments. These segments are as follows:

The text segment:

contains the machine-language instructions of the program run by the process. The text segment is made read-only so that a process doesn’t accidentally modify its own instructions via a bad pointer value.

The initialized data segment

contains global and static variables that are explicitly initialized. The values of these variables are read from the executable file when the program is loaded into memory.

The uninitialized data segment (BSS)

contains global and static variables that are not explicitly initialized. Before starting the program, the system initializes all memory in this segment to 0. This is often called the BSS segment. The main reason for placing global and static variables that are initialized into a separate segment from those that are uninitialized is that, when a program is stored on disk, it is not necessary to allocate space for the uninitialized data. Instead, the executable merely needs to record the location and size required for the uninitialized data segment, and this space is allocated by the program loader at run time.

The stack

is a dynamically growing and shrinking segment containing stack frames. One stack frame is allocated for each currently called function. A frame stores the function’s local variables (so-called automatic variables), arguments, and return value.

The heap

is an area from which memory (for variables) can be dynamically allocated at run time. The top end of the heap is called the program break. This segment and its allocations are managed by the malloc() family of library functions (which in turn use system calls such as brk() or mmap() to obtain memory from the kernel), and all of this happens only at run time.

The description of the process memory layout above glosses over the fact that, under an operating system with virtual memory, this layout is in virtual memory, and not in physical memory as others said before!

Now, in most modern embedded systems there is a real-time operating system (RTOS), which creates and manages lightweight processes (often threads). In these systems, users (engineers) have more flexibility in managing system resources (such as malloc), and since virtual memory does not exist in these systems, users can decide and control where each of the above sections is mapped in RAM.

for further reading:

An excellent book is The Linux Programming Interface (most of my answer is taken from there). For more about RTOSes, see here: RTOS

Adam
0

I somewhat disagree with the accepted answers to the questions linked by WeatherVane. The really important thing to take away here is:

Stack, Heap, Data segment, Text segment -- all these are implementation details that the C language itself makes no statement about.

And I agree with P__J__, you simply should not bother with them at this point, and learn the generic language first before delving into platform-specific details.


As far as the language C is concerned, there is a thing called "automatic storage duration". This applies e.g. to variables declared at block scope, with their storage allocated at declaration, and deallocated when the block they were declared in is left.

This is often implemented by means of a stack, but the standard makes no statement about implementation. A platform with lots of CPU registers could put variables with automatic storage duration in those registers.


Then there is "dynamically allocated memory" (the memory you get through malloc() et al. and release with free()), which is usually implemented by some kind of heap, but again the standard makes no statement about implementation. This could be directly mapped to permanent storage for all that matters.


"Data segment" (usually holding value initialization data) and "text segment" (usually holding executable code) are, again, implementation details, in this case of the executable file format (PE, ELF, ...) You obviously need to have your executable code somewhere, and if you set a variable to a literal value that value has also to "exist" somewhere, but as far as the language C is concerned, that is an issue for the compiler / executable loader / platform to worry about.

An embedded platform might have those hard-coded in ROM, and you might not even have a data segment / text segment in this case...


Bottom line, don't bother about these at this point. Learn about the language first, and then learn about how its generic principles (like automatic storage duration and dynamically allocated memory) apply to a given platform. The latter usually involves digging into compiler / loader specifics, and you should not go there without a firm grasp of the language readily established.

DevSolar