
What is the rationale behind separating memory into a code segment and a data segment? I read in one source that the separation is done because, in the von Neumann architecture, instructions and data are stored in the same place.

But I want to know what problems could arise from instructions and data being stored in the same place, such that separating them into a code segment and a data segment becomes necessary.

And why is memory further divided into a bss segment, a heap segment, and a stack segment?

  • There is no necessity in that. In an OS that by default maps code read-only, you can `mprotect` or `VirtualProtect` to make it writeable. Without an OS, you're free to use a flat memory model on most ISAs. Sections / segments are just a convenience for linking, and for loading. (The BSS has to be special, though, because the zeros aren't stored explicitly anywhere). Related: [x86 Assembly: Data in the Text Section](https://stackoverflow.com/q/46203110), and [What is the reason for having read-only data defined in .text section?](https://stackoverflow.com/q/51042388) – Peter Cordes Sep 23 '18 at 06:52
  • Semi-related, if you were thinking of x86 segment registers: [Does Linux not use segmentation but only paging?](https://unix.stackexchange.com/q/469253). Mainstream x86 OSes use flat memory models, and that's nearly unrelated to segments of executables. – Peter Cordes Sep 23 '18 at 07:09
  • Segmenting is a very old concept, stopped being relevant 25 years ago. It was a hack to get a 16-bit processor to address more than 65536 bytes of memory. Also used in ancient protected mode operating systems to assign protection attributes to memory regions, all long forgotten. 32-bit processors with a built-in MMU made it obsolete, you can now count on a flat memory space and paging. Talk to your teacher, ask him how what you are learning applies to practical skills. 64-bit computing is mainstream now, you want to know enough about it. – Hans Passant Sep 23 '18 at 07:53
  • @HansPassant: It's not clear if the OP is conflating x86 segments with executable segments or not. Modern Unix systems do still use the term "segment" for regions of an ELF executable that are mapped separately: text segment (linkers put `.text` and `.rodata` here), data segment (`.data` section), BSS segment, and maybe some other segments that aren't mapped into memory and just hold metadata. [What's the difference of section and segment in ELF file format](https://stackoverflow.com/q/14361248). – Peter Cordes Sep 23 '18 at 22:08

1 Answer


One of the biggest benefits of separating programs into code and data sections is that it allows the code section to be made read-only, while the data sections can be kept writable. This protects the code section from being accidentally modified by the program, and also allows it to be shared between processes running the same program.
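
As a rough illustration of that protection, here is a minimal sketch that assumes a POSIX system (`mmap`, `mprotect`, `fork`). It uses an anonymous page as a stand-in for a code section rather than touching real code, and the helper name `write_to_readonly_page_faults` is made up for this example. The write is attempted in a child process so the expected fault doesn't take down the caller.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a write to a read-only page killed the
 * child with a fault signal, 0 if the write went through, -1 on setup error. */
int write_to_readonly_page_faults(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;
    size_t len = (size_t)page_size;

    /* Start with a writable page, the way a data section is mapped. */
    unsigned char *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;
    page[0] = 0x2A; /* allowed: the page is still writable */

    /* Drop write permission, the way a code section is normally mapped. */
    if (mprotect(page, len, PROT_READ) != 0) {
        munmap(page, len);
        return -1;
    }

    pid_t child = fork();
    if (child < 0) {
        munmap(page, len);
        return -1;
    }
    if (child == 0) {
        volatile unsigned char *p = page;
        p[0] = 0x00; /* expected to fault: the page is now read-only */
        _exit(0);    /* only reached if the write was not blocked */
    }

    int status = 0;
    pid_t waited = waitpid(child, &status, 0);
    munmap(page, len);
    if (waited != child)
        return -1;

    if (WIFSIGNALED(status)) {
        int sig = WTERMSIG(status);
        /* Linux reports SIGSEGV; some other systems report SIGBUS. */
        return sig == SIGSEGV || sig == SIGBUS;
    }
    return 0;
}
```

On a typical Linux system the function should return 1, since the kernel rejects the write once the page is read-only. The same mechanism is what keeps a stray pointer from silently overwriting instructions.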

More recently, the ability to allow execution only from the code section has become important, because a number of exploits depend on being able to execute code placed outside of the code section. Another more recent benefit of separating code and data is that out-of-order CPUs could end up speculatively executing data as if it were code, resulting in poorer performance, if code and data were intermingled.

The bss section exists as an extension of the data section. It contains all the program's data that is initialized to zero. Separating the zero-initialized data out like this means the bss section doesn't have to be stored in the program's executable. This both saves disk space and speeds up loading: the bss section in memory is simply filled with zeros rather than being read in from the file.
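
A small sketch of the difference in C follows. The variable and function names are invented for illustration, and the section placement described in the comments is what common toolchains do rather than something the C language guarantees. What the language does guarantee is that objects with static storage and no initializer start out as zero.

```c
#include <stddef.h>

/* No initializer: toolchains typically place this in .bss, so the 1 MiB of
 * zeros is not stored in the executable file, only its size is recorded. */
unsigned char zeroed_buffer[1 << 20];

/* Non-zero initializer: typically placed in .data, so these four values
 * have to be stored in the executable file. */
int initialized_table[4] = {1, 2, 3, 4};

/* Counts bytes of zeroed_buffer that are not zero (expected: none). */
size_t count_nonzero_bytes(void)
{
    size_t count = 0;
    for (size_t i = 0; i < sizeof zeroed_buffer; i++) {
        if (zeroed_buffer[i] != 0)
            count++;
    }
    return count;
}

/* Sums the explicitly initialized values loaded from the executable. */
int sum_initialized_table(void)
{
    int sum = 0;
    for (size_t i = 0; i < sizeof initialized_table / sizeof initialized_table[0]; i++)
        sum += initialized_table[i];
    return sum;
}
```

If you build this into a program, the binutils `size` tool should report the large buffer under `bss` rather than `data`, and the file on disk should stay far smaller than 1 MiB.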

The heap and stack are similar in that they're both used to allocate objects dynamically as the program runs. The difference between the two is that objects allocated on the heap can be freed at any time, while anything put on the stack can only be removed after everything put on the stack after it has been removed. Every time a function is called, space gets allocated on the stack for things like the arguments to the function, the return address, and local variables. When the function returns, these things get removed from the stack. Pretty much anything else that gets dynamically allocated is allocated on the heap.
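
To make the lifetime difference concrete, here is a minimal C sketch. The function `make_greeting` is a made-up example: its local buffer lives on the stack and disappears when the function returns, while the `malloc`'d copy lives on the heap until the caller frees it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: builds a string in a stack buffer, then copies it to
 * the heap so it can outlive this call. The caller must free() the result. */
char *make_greeting(const char *name)
{
    /* Stack allocation: released automatically when the function returns,
     * so returning a pointer to it would be a bug. */
    char scratch[64];
    snprintf(scratch, sizeof scratch, "hello, %s", name);

    /* Heap allocation: stays valid until it is explicitly freed, regardless
     * of which functions have returned in the meantime. */
    size_t len = strlen(scratch) + 1;
    char *result = malloc(len);
    if (result != NULL)
        memcpy(result, scratch, len);
    return result;
}
```

A caller would use it as `char *g = make_greeting("world");` and later call `free(g);` at whatever point the string is no longer needed.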

Traditionally on Unix systems the heap was located in memory after the bss section, which appears after the code and data sections. As things got allocated on the heap it grew upwards to make space for them as necessary. The stack was placed at the end of memory, and grew downwards.
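
One way to look at the layout on your own system is to record an address from each region, as in the sketch below. The struct and function names are hypothetical, and casting a function pointer to `uintptr_t` is an assumption that holds on common POSIX platforms but isn't guaranteed by the C standard.

```c
#include <stdint.h>
#include <stdlib.h>

int data_var = 42; /* initialized: typically in the data section */
int bss_var;       /* zero-initialized: typically in the bss section */

/* Hypothetical record holding one sample address from each region. */
struct layout_sample {
    uintptr_t code;
    uintptr_t data;
    uintptr_t bss;
    uintptr_t heap;
    uintptr_t stack;
};

/* Fills *out with one address per region; returns 0 on success, -1 if the
 * heap allocation fails. */
int sample_layout(struct layout_sample *out)
{
    int local = 0;               /* lives in this call's stack frame */
    void *heap_obj = malloc(16); /* lives on the heap */
    if (heap_obj == NULL)
        return -1;

    out->code = (uintptr_t)&sample_layout; /* address of this function */
    out->data = (uintptr_t)&data_var;
    out->bss = (uintptr_t)&bss_var;
    out->heap = (uintptr_t)heap_obj; /* only the address is kept */
    out->stack = (uintptr_t)&local;

    free(heap_obj);
    return 0;
}
```

Printing the five values with `printf("%#lx", (unsigned long)s.code)` and so on lets you compare them with the traditional ordering. On modern systems, address space layout randomization and position-independent executables change the exact addresses from run to run, and the ordering isn't guaranteed to match the traditional picture.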

Ross Ridge
  • Is this separation a processor-level construct or an OS-level construct? Do these segments coincide with or relate to the contents of the segment registers, as I think, or am I confusing this too? – Amigorust Burrough Sep 23 '18 at 10:29
  • @AmigorustBurrough It's usually an OS-level construct implemented using various processor features. It's not fundamentally linked to the x86 segment registers, as it's implemented on a number of systems that don't use x86 CPUs. Even on x86 CPUs modern operating systems don't use x86 segmentation to separate code and data. This is why I've used "section" instead of "segment" in my answer. A lot of the time these terms are used interchangeably, but on x86 CPUs segments can refer to x86 hardware segments. x86 CPUs don't have bss or heap segment registers, so I assumed you really meant sections. – Ross Ridge Sep 23 '18 at 15:03
  • @AmigorustBurrough: ELF executable segments are nearly unrelated to x86 segment registers. See my answer on [Does Linux not use segmentation but only paging?](https://unix.stackexchange.com/a/469529). (ELF segments are for the program loader, ELF sections are for the linker. They do have distinct technical meanings. [What's the difference of section and segment in ELF file format](https://stackoverflow.com/q/14361248). But they're close enough that it does make sense to simplify to just "section" when avoiding confusion with x86 segments.) – Peter Cordes Sep 23 '18 at 22:12