Abstraction and bare-metalness with OSes?

Question

If I have a CPU and I wrote a program and I wanted to store a value (copy a value from a register to memory (RAM)), then I would use an instruction in the CPU's instruction set (let's say that this is an x86 CPU) to do so?

Second question: is the instruction in the x86 instruction set to set a value at a particular address in RAM called MOV?

Third question. BIOS, UEFI, kernels and bootloaders all use the MOV instruction in the x86 instruction set to do so (assign a value (like 10) to a specific address in RAM) right?

Fourth question. Programs that operate in the OS (with a kernel (like Linux)) environment do not use the MOV instruction to get a chunk of memory allocated when they request it, but rather ask the kernel to do it on their behalf?

Fifth question. Is what I described in the fourth question called a system call (when a program running in the OS environment asks the kernel to do something (in this case give it some memory) on its behalf)?

1. yes. 2. yes, the most common x86 store instruction is `mov`. e.g. `mov dword [rdi], 10`. 3. that's oddly specific. You could probably write a bootloader without any static data, so all your mov-immediate instructions would use register destinations. – Peter Cordes Dec 04 '17 at 10:29
For question three, I will re-phrase it. Do all of the programs mentioned in that question (BIOS, UEFI, kernel and bootloaders) use instructions in the CPU's instruction set to store values in memory (RAM)? – Tristan B. Kildaire Dec 04 '17 at 10:38
Yes, any non-trivial program that needs more space than registers, or that needs to call a function or syscall that returns a value in memory, has to use memory. Part of what a bootloader has to do is load from disk into memory, so while it might not use any store instructions itself, the BIOS or UEFI functions it calls will do so on its behalf. – Peter Cordes Dec 04 '17 at 10:50

Peter Cordes · Answer 1 · 2017-12-04T10:52:01.410

0. I think you don't know enough about assembly language to even formulate a sensible question that really fills in whatever gap you're missing, but I've tried to answer your questions as they are. I'm really not sure what you really want to know.

Perhaps check out Matt Godbolt's CppCon2017 talk: “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid” for a beginner intro to x86 asm.

But I think the concepts you're missing are more about stack / heap / static storage locations. See "Stack, Static, and Heap in C++".

1. yes
2. yes, the most common x86 store instruction is `mov`. e.g. `mov dword [rdi], 10`. Or with an absolute address encoded into the instruction as well, `mov dword [my_static_32_bit_location], 10`
3. that's oddly specific. You could write a bootloader without any static data. All your mov-immediate instructions would use register destinations.
4. `mov` doesn't ever allocate or reserve memory. Programs running under an OS can have static data. e.g. consider this C function:
```
unsigned LCG_rand() {
    static unsigned seed = 0x1234;
    seed = seed * 0x5677 + 0x7723;  // Linear Congruential Generator with badly chosen coefficients I just made up
    return seed;                    // mod type-width is implicit
}
```
gcc7.2 targeting x86-64 Linux compiles this to (Godbolt compiler explorer):
```
LCG_rand:
    imul    eax, DWORD PTR seed.2294[rip], 22135    # eax = load(seed) * 22135
    add     eax, 30499                              # eax += 30499
    mov     DWORD PTR seed.2294[rip], eax           # store the new value back to memory.
    ret

.data            # static read-write data goes in the .data section
.align 4
seed.2294:       # label which the compiler uses to refer to it.
    .long   4660
```
The .data section holds static read-write data with non-zero initializers. The initial values are stored literally in the executable. It's linked into the data segment of the executable. When you run it, the data segment of the executable is mapped into the process's memory space with read/write permissions, copy-on-write (private). (So writing the data in memory doesn't update the data on disk, the way it would if you used mmap with MAP_SHARED.)

These addresses are link-time constants, so a program can use them directly. (In the code above, the function is using PC-relative addressing, but absolute would be supported too. You may have to use -no-pie on some gcc setups, though.)

Exactly the same machine code would work fine in a bare-metal environment, where you control the page tables yourself instead of the usual per-process virtual addresses which let every process have its own view of memory. (With paging disabled, so the addresses you're using are physical addresses, you'd have to be in 16 or 32-bit mode and build the code for that mode, because x86-64 long mode requires paging to be enabled.)
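To make the point about static storage concrete, here is a minimal sketch in C. It moves `seed` to file scope (an illustrative change, so its address can be taken) and adds two hypothetical helpers, `seed_location` and `store_u32`, that are not part of the original answer. The idea it tries to show: the location of `seed` is fixed when the program is linked, and writing to it is an ordinary store through an address, with no allocation step involved.

```c
/* Static storage: the linker reserves this location, nothing requests it at run time. */
static unsigned seed = 0x1234;

/* Same generator as above, with the static moved to file scope
   so the helper below can take its address (illustrative change). */
unsigned LCG_rand(void) {
    seed = seed * 0x5677 + 0x7723;   /* load, multiply, add, store back */
    return seed;
}

/* Hypothetical helper: expose the fixed address of the static. */
unsigned *seed_location(void) {
    return &seed;
}

/* Hypothetical helper: a plain store through a pointer.
   On x86 a compiler would typically emit a single mov for this. */
void store_u32(unsigned *addr, unsigned value) {
    *addr = value;
}
```

Calling `store_u32(seed_location(), 0x1234)` resets the generator, and `seed_location()` returns the same address on every call within one run of the program.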
5. yes. See "What are the calling conventions for UNIX & Linux system calls on i386 and x86-64" for details on how user-space processes make system calls.
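As a rough illustration of questions 4 and 5, the sketch below asks the Linux kernel for memory via `mmap` (a libc wrapper around the system call of the same name) and hands it back with `munmap`. The helper names `request_pages` and `release_pages` are made up for this example, and the code assumes a Linux/glibc system where `MAP_ANONYMOUS` is available.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: ask the kernel for `len` bytes of zero-filled,
   private, read-write memory. Returns NULL on failure. */
void *request_pages(size_t len) {
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: return the memory to the kernel (munmap system call).
   Returns 0 on success, -1 on failure. */
int release_pages(void *p, size_t len) {
    return munmap(p, len);
}
```

Only getting and releasing the pages involves the kernel. Once `request_pages` has returned, writing `p[0] = 10` is a normal store instruction executed by the program itself.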

edited Dec 04 '17 at 10:52

answered Dec 04 '17 at 10:46

Peter Cordes

Ah, so I think I understand better now. As you said, the same machine code (of the program you wrote) would run in a bare-metal environment. So any program, be it a kernel (like Linux) or a user-level program (something that humans interact with) like a web browser (like Firefox), would use the CPU's instruction set's instructions to get work done, and in the case of wanting to store something in memory they would use one of the `MOV` (or other store) instructions to do so? – Tristan B. Kildaire Dec 04 '17 at 11:22
So in the bare-metal case it would run using physical addresses, but when a kernel comes into the picture it maps memory in a different way for programs that need to get a piece of memory. – Tristan B. Kildaire Dec 04 '17 at 11:26
@TristanB.Kildaire **Storing to and loading from memory is a normal part of getting work done**. 32-bit x86 only has 7 general-purpose registers (not counting the stack pointer). It's common for code to need to keep track of more than 7 variables at a time, so compilers use stack memory as extra local storage space. We're just talking about RAM, not permanent non-volatile storage like a hard drive. Even 64-bit code uses the stack frequently. – Peter Cordes Dec 04 '17 at 11:26
Ah okay. I think I was blurring the line between bare-metal programs and the same ones that run in an OS environment. Both obviously compile to assembly and use the given CPU's instruction set to get work done. In the bare-metal case the program uses physical addresses and has full control over the machine, whilst in an OS environment it has virtual memory access (which is controlled by the kernel) which the same program would then use. **Is this correct?** – Tristan B. Kildaire Dec 04 '17 at 11:28
@TristanB.Kildaire: How you *allocate* memory is different for bare-metal vs. under an OS, but you always have a stack, and if you want any static data you can have it. For bare metal you can write your own allocator to give you dynamic storage because you already own all the memory (that's one of the defining features of bare metal). But once you've established which addresses your code can use, it's not a question of "blurring lines". But maybe that's just the wrong expression for what your confusion was. I hope. – Peter Cordes Dec 04 '17 at 11:31
Maybe my use of the word allocate was wrong. What I meant to say is, the program you wrote above was able to, as you said, run on both an OS environment and on bare-metal? – Tristan B. Kildaire Dec 04 '17 at 11:42
Ah I see. But for bare-metal I need not really write an allocator if say now all I wanted to do was to store a value (a 1 byte value) at the beginning of RAM. – Tristan B. Kildaire Dec 04 '17 at 11:50
@TristanB.Kildaire: That's right, you don't need an allocator for static data, only for dynamic storage. When writing your program, you decide while you're writing it that you want this function to use this fixed location for something (like the `static unsigned seed` in the RNG). Usually the assembler makes it easy to name these locations and make them part of your executable bare-metal image with all instructions referring to locations by named label (like `seed.2294`) all using the same absolute address. So yes, that compiler-generated asm could be built for bare metal. – Peter Cordes Dec 04 '17 at 11:55
The key thing is where the assembler and linker (and your linker config file) *put* the `.data` section, and how your bare-metal program loader (or the BIOS if your program is loaded from the boot sector directly) puts everything where your code is going to look for it. Getting static data in the right place is not a different problem from getting code in the right place, although code is more often position-independent. (Can run from any address using relative `jmp` instructions, rather than absolute. But some code does use absolute jumps, too) – Peter Cordes Dec 04 '17 at 11:59
I am still a Grade 12 student. Do you think I should maybe wait till next year, when I begin studying Computer Science, to learn about these things, as I seem dumbfounded? – Tristan B. Kildaire Dec 04 '17 at 12:15
@TristanB.Kildaire: Learn to program first (especially in a language like C), then you'll have the background for understanding computer organization / architecture. I definitely haven't given you a complete explanation, because that would be too long, so it's not surprising that a lot of the specific things don't make sense. You might want to try reading a (free) book called [Programming From the Ground Up](https://savannah.nongnu.org/projects/pgubook/). Also in HTML form: https://programminggroundup.blogspot.ca/2007/01/chapter-1-introduction.html. It teaches 32-bit x86 assembly language. – Peter Cordes Dec 04 '17 at 12:39
I am reading something on Wikipedia now about supervisor mode and user mode, and how one of them (user mode) restricts which instructions can be executed. This seems to make some more sense now. – Tristan B. Kildaire Dec 05 '17 at 08:52
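
The bare-metal allocator idea mentioned in the comments above ("you can write your own allocator to give you dynamic storage because you already own all the memory") can be sketched roughly as follows. This is only an illustration: `bump_alloc`, `arena` and `ARENA_SIZE` are hypothetical names, and a static C array stands in for the RAM that a real bare-metal program would get from its linker script.

```c
#include <stddef.h>

/* Stand-in for the RAM a bare-metal program owns. On real hardware the
   linker script, not a C array, would decide where this region lives. */
#define ARENA_SIZE 4096u
static _Alignas(16) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* Hypothetical bump allocator: hand out the next `size` bytes, starting at
   an offset rounded up to a multiple of `align` (a power of two, at most 16).
   Returns NULL when the arena is exhausted. There is no way to free. */
void *bump_alloc(size_t size, size_t align) {
    size_t start = (arena_used + align - 1) & ~(align - 1);
    if (start > ARENA_SIZE || size > ARENA_SIZE - start)
        return NULL;
    arena_used = start + size;
    return &arena[start];
}
```

Nothing here involves a kernel: the allocator only does bookkeeping over memory the program already owns, and storing a 1-byte value into what `bump_alloc(1, 1)` returns is again just a store instruction.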
