-8

I know that the basic C program layout in memory is code/text, data, heap and stack. The C program memory layout does correspond to the more general layout of a process in memory.

My questions are :

  1. Where does the "C program layout = General process layout" association stem from? (If we consider all the computer science literature out there...)

  2. If I were to write a compiler for a programming language that I create (let's call it "ONE"), what are the main rules that I should comply with (let's say that we are on Linux with Intel x86_64)?

alessio solari
  • 1
    https://www.iso.org/standard/74528.html I only care if it runs or not. Haha. And to know what standard it is, perhaps you have to read the book – sycoi001 Aug 08 '23 at 12:34
  • 1
    re: question 1: Let me ask you, where are you getting the terms "C program layout" and "General process layout" from? ...because it sounds like nonsense to me (googling for the term "C program layout" returns nothing but low-value content-farm websites like geeksforgeeks repeating the same thing over...) – Dai Aug 08 '23 at 12:34
  • Dai, when I said C program layout I meant C program memory layout – alessio solari Aug 08 '23 at 12:36
  • 1. UNIX was (re-)written in C. Lots of things in UNIX assume C semantics. Because it's the language the system was implemented in. 2. That means you target [ELF](https://www.ibm.com/docs/en/ztpf/1.1.0.15?topic=linkage-executable-linking-format-elf) and only use the x86_64 instructions and linux system calls. – Elliott Frisch Aug 08 '23 at 12:36
  • _"If I were to write a compiler for a programming language that I create what are the main rules that I should comply with"_ - main rules w.r.t. _what_? The C spec doesn't say anything about how an OS handles a program's memory at all, and not much for compilers either. – Dai Aug 08 '23 at 12:36
  • 1
    @alessiosolari I know you meant that, and that doesn't change my response: that it isn't a meaningful term in the first place, and despite being meaningless it's being repeated by other people (content-farms, etc) who don't know what they're writing about. – Dai Aug 08 '23 at 12:37
  • Dai, how should I call it then ? – alessio solari Aug 08 '23 at 12:38
  • So you are saying that 99% of the books out there are wrong because you say that...come on man – alessio solari Aug 08 '23 at 12:39
  • 1
    @alessiosolari They're using an imprecise terminology. In the olden days, a fairly common executable format was "a.out". Now Linux uses "ELF", on Mac it's "Mach-O" and on Windows it's "COFF". All three OSes have differences in linking and executing dynamic programs, the C standard does not mandate implementation. – Elliott Frisch Aug 08 '23 at 12:43
  • 1
    C language does not know anything about sections and memory layout. But the implementation has to know how to organize things. Eventually, you need to generate the executable file which will follow your OS rules. – 0___________ Aug 08 '23 at 12:44
  • 3
    "99% of books" -- citation needed. – John Bollinger Aug 08 '23 at 13:46
  • @alessiosolari I did a quick search in all of my CS/SE ebooks (Addison-Wesley, O'Reilly, etc.) and **none** of them use the term "C program [memory] layout". Canonically, I suppose [_the dragon book_](https://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools) might say something, but I don't have a searchable copy of it. – Dai Aug 08 '23 at 14:52
  • Dai, take for example "Advanced Programming in the Unix Environment" by Richard Stevens. If you don't know who he is look him up on wiki --> https://en.wikipedia.org/wiki/W._Richard_Stevens – alessio solari Aug 08 '23 at 15:59
  • Dai, this is just one of many many examples. So... – alessio solari Aug 08 '23 at 16:00
  • @alessiosolari If you were to look at the name of the book, you'll see it says "_...in the Unix Environment_" while it doesn't say "_...in C_". The question you've posted to SO is asking about C, not Unix. (Unix is an OS, so naturally that will mean having to follow the OS' requirements for executable image layout and work with how the OS manages memory - but again, that's not really relevant to C at all). – Dai Aug 08 '23 at 17:37
  • Dai, give me some concrete examples where this layout is not "code, data, stack and heap"... the last time I checked, all Unix-like OSes follow this layout, and Windows follows this layout too... I need proof – alessio solari Aug 08 '23 at 20:01
  • @alessiosolari Why are you bringing-up Unix? Your question is about C, not Unix. There are plenty of machines that can run C programs which don’t have the “layout” you describe, e.g. IBM hardware: https://softwareengineering.stackexchange.com/questions/222564/are-there-alternatives-to-stackheapstatic-memory-model – Dai Aug 08 '23 at 20:22
  • Dai, your link doesn't give me concrete proof. I cannot believe what I don't see. – alessio solari Aug 08 '23 at 20:35
  • What, exactly, is your standard of proof then? Have you read this thread? https://stackoverflow.com/questions/51577685/does-c-need-a-stack-and-a-heap-in-order-to-run – Dai Aug 08 '23 at 23:02
  • Dai, on a practical level we can safely say that most computers out there implement a C program and a process in general by following the classic layout "code, data, heap and stack". The examples that you refer to maybe represent the 1%, so "they are the exceptions". When I say the "C program memory layout" I'm referring to most of the scenarios that you're gonna run into in the real world. Since this layout is used by Unix-like systems and Windows too, this pretty much covers the majority of systems out there. Here we are not talking about the "small exceptions". The majority makes the rule – alessio solari Aug 09 '23 at 07:34

1 Answer

5

Where does the "C program layout = General process layout" association stem from? (If we consider all the computer science literature out there...)

Each operating system has a method (or methods) of loading programs and starting their execution. This includes formats of executable files that it can read and load. An executable file format typically contains some header that describes what program sections are in the file and then information that describes each program section. Program sections may contain data to be loaded into memory as initial values for objects or may contain instructions to be loaded into memory to be executed. Or a program section may simply be an amount of space that needs to be made available in memory.

There is no C program layout. A compiler is a translator. It translates from the C language into machine language. (This process typically involves multiple steps: translating C into an internal representation, performing optimization and other operations on the internal representation, translating the internal representation into assembly or a representation of it, translating assembly into machine language, writing machine language and data into object modules, and linking object modules into an executable program.)

In the C source code, an abstract model of computing is used, as described in the C standard. No specific hardware stack is specified (although stack semantics are specified, because recursive calls are permitted and each call's parameters and automatic objects have last-in first-out behavior). No specific program layout is specified. The compiler translates C source code to an executable program of the target platform, and that is what gives the program its process layout.

If I were to write a compiler for a programming language that I create (let's call it "ONE"), what are the main rules that I should comply with (let's say that we are on Linux with Intel x86_64)?

Write files in the formats expected by the linker and the program loader. Conform to the Application Binary Interface specified by the operating system. On Linux x86_64 that means ELF object and executable files and the System V AMD64 ABI.

Eric Postpischil
  • +1, but could you clarify what you mean by "stack semantics are specified"? The word "stack" does not appear even once in the C17 or C23 spec. – John Bollinger Aug 08 '23 at 13:16
  • I don't know what you mean by "parameters have first-in first-out behavior". Parameters don't *do* anything, so describing their behaviour doesn't make any sense. /// And then there's the issue that a FIFO is a queue, not a stack. So how did you jump from "first-in first-out behavior" to "stack semantics"? /// Please clarify what you mean by that parenthetical. – ikegami Aug 08 '23 at 18:18
  • @ikegami: “first-in first-out” was supposed to be “last-in first-out”. I corrected that. Parameters and automatic objects and a record of program control exist in multiple instances when a function is called recursively (directly or indirectly), and the behavior is last-in first-out: The most recent instance that has not been popped by a function return (or termination via `longjmp` etc.) is the instance that is used, and popping reveals the most recent instance before that. – Eric Postpischil Aug 08 '23 at 18:21
  • So it's *recursion* that implies stack semantics. – ikegami Aug 08 '23 at 18:23
  • @ikegami: The C standard specifies recursion is supported. C 2018 6.5.2.2 11 says recursive calls shall be permitted, and other parts of the standard specify the semantics of parameters and automatic objects, and those form stack semantics, and therefore the standard specifies stack semantics. – Eric Postpischil Aug 08 '23 at 18:24
  • I didn't say otherwise (because I didn't know). In fact, I was checking when you wrote that. All the standard says is "Recursive function calls shall be permitted, both directly and indirectly through any chain of other functions." And given that, yes, I agree. I never disagreed; I just said I couldn't understand what you wrote (because you said parameters when you meant recursion, and you said FIFO where you meant LIFO). – ikegami Aug 08 '23 at 18:26