I think you are overcomplicating this. Processors are very dumb, very very dumb; they only do what the instructions tell them to do. The programmer is ultimately responsible for laying out a path of valid, sane instructions in front of the processor, much as a train is dumb and only follows its tracks: if we don't lay the tracks properly, the train derails.
Compilers in general convert from one language to another, not necessarily from C to machine code. It could be from, who knows, Java to C++ or something. And not all C compilers output machine code; some output assembly language, and then an assembler is called.
gcc hello.c -o hello
The gcc program itself is mostly just a shell that calls a preprocessor, which does things like expand the includes and defines recursively so that its output is a single file that can be fed to the compiler. That file is then fed to the compiler proper, which may produce other files or internal data structures, and ultimately the actual compiler outputs assembly language. For the command shown above, gcc then calls the assembler to turn that assembly language into an object file containing as much machine code as it can manage; some external references are left for the linker, and the code is generated to deal with those in a sane way per the instruction set.
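If you are curious you can expose those stages yourself with the standard gcc flags; a rough sketch, where the intermediate file names are just examples:

    gcc -E hello.c -o hello.i    # preprocess only: expand includes and defines
    gcc -S hello.i -o hello.s    # compile only: preprocessed C to assembly language
    gcc -c hello.s -o hello.o    # assemble only: assembly language to an object file
    gcc hello.o -o hello         # link the object with the C library and bootstrap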
Then, as directed by whoever prepared the toolchain, gcc invokes the linker from binutils along with the C library bundled with (or pointed to by) the toolchain, and links the hello object file with any other libraries needed, including the bootstrap. For the command shown above, a default linker script prepared by/for the C library in question is used, since one was not indicated on the command line. The linker does its job of placing items where asked, resolving externals, and at times adding instructions to glue these separate objects together, then outputs a file in whatever file format was set as the default when the toolchain was built. And then gcc cleans up the intermediate files, either as it goes or at the end, whatever.
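To give a feel for what a linker script is, here is a minimal sketch in GNU ld syntax with made-up addresses and only the obvious sections; the real scripts shipped with a C library are far longer and more involved:

    ENTRY(_start)
    SECTIONS
    {
        . = 0x10000;              /* made-up load address */
        .text   : { *(.text*) }
        .rodata : { *(.rodata*) }
        .data   : { *(.data*) }
        .bss    : { *(.bss*) }
    }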
A compiler that compiles straight to machine code simply skips the step of calling the assembler, but linking separate objects and libraries, with some form of instructions to the linker about the address space, is still necessary.
malloc is not an instruction; it is a function that is fully realized in machine code once that function is compiled. For performance reasons it is not uncommon for a C library to write that function by hand in assembly language, but either way it is just some other code that gets linked in. A processor can only execute instructions implemented in that processor's logic.
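To make the point concrete, here is a toy sketch of a malloc-like function; the name, heap size, and strategy are made up and a real malloc is vastly more involved, but it compiles down to ordinary loads, stores, compares, and branches like any other function:

    #include <stddef.h>

    static unsigned char heap[4096];   /* made-up fixed-size heap */
    static size_t heap_used;

    void *my_malloc(size_t n)          /* hypothetical name, not the real malloc */
    {
        n = (n + 7u) & ~(size_t)7u;    /* round up to keep 8-byte alignment */
        if (n > sizeof(heap) - heap_used)
            return NULL;               /* out of memory */
        void *p = &heap[heap_used];
        heap_used += n;                /* bump pointer, this toy never frees */
        return p;
    }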
Software interrupts are just instructions. When you execute a software interrupt it is really nothing more than a specialized function call, and the code you are calling is yet more code that someone wrote and compiled into machine code; no magic.
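For example, assuming x86-64 Linux and GCC-style inline assembly (a sketch, not necessarily how your C library does it), a write() ends up as a single syscall instruction (int 0x80 on 32-bit x86) that transfers control to kernel code somebody else wrote and compiled:

    static long sys_write(int fd, const void *buf, unsigned long len)
    {
        long ret;
        __asm__ volatile ("syscall"                  /* the "specialized function call" */
                          : "=a"(ret)                /* return value comes back in rax */
                          : "a"(1),                  /* 1 = __NR_write on x86-64 Linux */
                            "D"(fd), "S"(buf), "d"(len)
                          : "rcx", "r11", "memory"); /* clobbered by the syscall instruction */
        return ret;
    }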
A processor has absolutely no idea what USB is, or PCIe, or a GPU, etc. It only knows the instruction set it was implemented to execute; that is all. Those high-level concepts are not even known to the programming languages, even high-level ones like C, C++, Java, etc. To the processor there are just some loads and stores (memory or I/O in the case of x86), and the sequence and addresses of those are the job of the programmer; to the processor it is just instructions with addresses, nothing magic, nothing special. The addresses are in part determined by the system design of the board: where and how you reach a USB controller, PCIe controller, DRAM, video, etc. Both the board/chip designers and the software folks know what these addresses are and write code to read/write them to make the peripheral work.
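A sketch, with a completely made-up address: from the processor's point of view this is nothing but a store instruction; it is the board design that decides the address happens to land on, say, a UART data register:

    #define UART0_DR (*(volatile unsigned int *)0x4000C000u)  /* hypothetical address */

    void uart_putc(char c)
    {
        UART0_DR = (unsigned int)c;    /* just an ordinary store to an address */
    }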
The processor only knows the instructions it has been designed to execute, nothing more; there is generally no magic. CISC processors like the x86, because of the extra complication per instruction, have historically been implemented using microcode for various reasons, so that is one exception to the no-magic deal. Using microcode is cheaper in various ways than discretely implementing each instruction with a state machine. The implementation is some combination of state machines and, if you will, some other instruction set running on some other processor inside. It is not truly an interpreted deal; it is a hybrid that makes sense from a business and engineering perspective.
RISC as a concept was built on decades of CISC history as well as improvements in production, tools, the advancement of programmers' abilities, etc. So you now see many RISC processors implemented without microcoding: small state machines as needed, but in general nothing that would compare to a CISC instruction set's requirements. There is a trade-off between number of instructions and code space versus chip size and performance (power, speed, etc.).
"On modern architectures, peripherals are accessed in a similar way to memory: via mapped memory addresses on a bus."
Simply look at the instruction set, ideally the 8088/86 hardware and software reference manuals, and then examine a modern processor bus. Today there are many control signals on a bus, indicating not just read vs write, address, and data, but the type of access, cacheable or not, etc. Going back to the 8088/86 days, the designers had a correct notion that peripherals present two kinds of things to talk to. One is control and status registers: I want to set a graphics mode that is this many pixels by this many pixels, with this many colors, using a palette of this depth. Then you have the actual pixels, which you ideally want to access in large groups, a scan line or a frame at a time, in a loop/burst copy. So the control registers you are generally going to access one at a time, randomly; the pixel memory you are generally going to access in bursts, sometimes many bytes at a time.
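In C that difference looks something like this; all names, addresses, and register layouts here are invented purely for illustration:

    #define VIDEO_MODE_REG (*(volatile unsigned int *)0x50000000u)  /* made up */
    #define FRAMEBUFFER    ((volatile unsigned int *)0x60000000u)   /* made up */

    void draw_frame(const unsigned int *pixels, unsigned int count)
    {
        VIDEO_MODE_REG = 0x3u;              /* one random-access control register write */
        for (unsigned int i = 0; i < count; i++)
            FRAMEBUFFER[i] = pixels[i];     /* pixel memory copied in a burst/loop */
    }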
So having a single bit on the bus that indicates I/O vs memory made sense. Remember, we didn't have FPGAs yet and ASICs were nearly unobtainium, so you wanted to help the glue logic as best you could, and adding a control signal here or there helped. Today, in part because the relative cost and risk of producing ASICs is lower, the tools are much better, and programmers' skills and practices have advanced, the things that helped us in the past can get in the way. The notion of control vs memory is still very much present in peripherals, but we don't necessarily need a dedicated control signal nor separate instructions. If you go back before the 8088/86 to some DEC processors, you had specific instructions for the peripherals: you wanted to output a character to the tty, there was an INSTRUCTION for that, not just an address you wrote to. That was the natural progression, and today's approach is to just make everything memory mapped and use generic load and store instructions.
I can't fathom how you got from I/O vs memory to the idea that there isn't x86 machine code; just look at the instruction set to see the I/O instructions and the memory instructions. They are there, and for backward compatibility reasons (which is what kept the Wintel PC world alive for decades) they still work, but they are synthesized into something closer to a memory-mapped solution. At the same time, programmers have migrated away from I/O-mapped access; ideally only very, very old code would be trying to do that, and a combination of hardware and software can still make some of that code work on a modern PC.
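The instructions themselves are easy to see; assuming GCC-style inline assembly on x86, the classic wrappers look roughly like this (port 0x3F8, the traditional COM1 data port, is used purely as an example):

    static inline void outb(unsigned short port, unsigned char val)
    {
        __asm__ volatile ("outb %0, %1" : : "a"(val), "Nd"(port));  /* I/O-mapped write */
    }

    static inline unsigned char inb(unsigned short port)
    {
        unsigned char val;
        __asm__ volatile ("inb %1, %0" : "=a"(val) : "Nd"(port));   /* I/O-mapped read */
        return val;
    }

    /* e.g. outb(0x3F8, 'A'); writes a byte to the legacy COM1 data port */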