21

I'm hoping to learn assembly language for x86. I'm on a Mac, and I'm assuming most x86 tutorials/books use code that's meant for Windows.

How does the OS that code is run on affect what the code does, or determine whether the code even works? Could I follow a Windows-based tutorial, and modify a few commands to make it work for Mac with relative ease? More generally, is there anything tricky that a Mac assembly programmer, specifically, should know? Thanks!

stalepretzel

6 Answers

24

(Of course, all of the following applies only to x86 and x86-64 assembly language, for IA-32 and AMD64 processors and operating systems.)

The other answers currently visible are all correct, but, in my opinion, miss the point. AT&T versus Intel syntax is a complete non-issue; any decent tool will work with both syntaxes or have a counterpart or replacement that does. And they assemble the same anyway. (Protip: you really want to use Intel syntax. All the official processor documentation does. AT&T syntax is just one giant headache.) Yes, finding the right flags to pass to the assembler and linker can be tricky, but you'll know when you've got it and you only have to do it once per OS (if you remember to write it down somewhere!).

Assembly instructions themselves, of course, are completely OS-agnostic. The CPU does not care what operating system it's running. Unless you're doing extremely low-level hackery (that is, OS development), the nuts and bolts of how the OS and CPU interact are almost totally irrelevant.

The Outside World

The trouble with assembly language comes when you interact with the outside world: the OS kernel, and other userspace code. Userspace is trickiest: you have to get the ABI right or your assembly program is all but useless. This part is generally not portable between OSes unless you use trampolines/thunks (basically another layer of abstraction that has to be rewritten for every OS you intend to support).

The most important part of the ABI is the calling convention for C-style functions. They're the most commonly supported, and they're what you'll probably be interfacing with if you're writing assembly. Agner Fog maintains several good resources on his site; the detailed description of calling conventions is particularly useful. In his answer, Norman Ramsey mentions PIC and dynamic libraries; in my experience you usually don't have to bother with those if you don't want to. Static linking works fine for typical uses of assembly language (like rewriting core functions of an inner loop or other hotspot).

The calling convention works in two directions: you can call C from assembly or assembly from C. The latter tends to be a bit easier but there's not a big difference. Calling C from assembly lets you use things like the C standard library output functions, while calling assembly from C is typically how you access an assembly implementation of a single performance-critical function.
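
As a minimal sketch, here's what an assembly function callable from C looks like under the 32-bit cdecl convention (the name add_ints is made up for the example; macOS and Windows would want the symbol spelled _add_ints):

    global add_ints
    section .text
    add_ints:               ; int add_ints(int a, int b)
        mov eax, [esp+4]    ; first stack argument
        add eax, [esp+8]    ; second stack argument
        ret                 ; return value goes in EAX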

System Calls

The other thing your program will do is make system calls. You can write a complete and useful assembly program that never calls external C functions, but if you want to write a pure assembly language program that doesn't outsource the Fun Stuff to someone else's code, you are going to need system calls. And, unfortunately, system calls are totally and completely different on every OS. Unix-style system calls you'll need include (but are most assuredly not limited to!) open, creat, read, write, and the all-important exit, along with mmap if you like allocating memory dynamically.

While every OS is different, most modern OSes follow a general pattern: you load the number of the system call you want into a register, typically EAX in 32-bit code, then load the parameters (how you do that varies wildly), and finally issue an interrupt request: INT 2E for Windows NT kernels, or INT 80h for Linux 2.x and FreeBSD (and, I believe, OSX). The kernel then takes over, executes the system call, and returns execution to your program. Depending on the OS, it might trash registers or the stack as part of the system call; you'll have to read the system call documentation for your platform to be sure.
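
As a minimal sketch (assuming msg and msglen are defined in your data section), a write to stdout on 32-bit Linux looks like this:

    ; write(1, msg, msglen): call number in EAX, args in EBX, ECX, EDX
    mov eax, 4          ; __NR_write
    mov ebx, 1          ; fd 1 = stdout
    mov ecx, msg        ; pointer to the buffer
    mov edx, msglen     ; byte count
    int 80h             ; trap into the kernel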

SYSENTER

Linux 2.6 kernels (and, I believe, Windows XP and newer, though I have never actually attempted it on Windows) also support a newer, faster method to make a system call: the SYSENTER instruction introduced by Intel in newer Pentium chips. AMD chips have SYSCALL, but few 32-bit OSes use it (though it's the standard for 64-bit, I think; I haven't had to make direct system calls from a 64-bit program so I'm not sure on this). SYSENTER is significantly more complicated to set up and use (see, for example, Linus Torvalds on implementing SYSENTER support for Linux 2.6: "I'm a disgusting pig, and proud of it to boot.") I can personally attest to its peculiarity; I once wrote an assembly function that issued SYSENTER directly to a Linux 2.6 kernel, and I still don't understand the various stack and register tricks that got it to work... but work it did!
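
(If I have the details right, the 64-bit convention is much tidier; a minimal exit(0) on 64-bit Linux, where SYSCALL is the standard mechanism, is just:

    mov eax, 60         ; __NR_exit on x86-64
    xor edi, edi        ; exit status 0
    syscall

though the call numbers and argument registers differ from the 32-bit ABI.)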

SYSENTER is somewhat faster than issuing INT 80h, and so its use is desirable when available. To make it easier to write both fast and portable code, Linux maps a VDSO called linux-gate into the address space of every program; calling a special function in this VDSO will issue a system call by the fastest available mechanism. Unfortunately, using it is generally more trouble than it's worth: INT 80h is so much simpler to do in a small assembly routine that it's worth the small speed penalty. Unless you need ultimate performance... and if you need that, you probably don't want to call into a VDSO anyway, and you know your hardware, so you can just do the horribly unsafe thing and issue SYSENTER yourself.

Everything Else

Other than the demands imposed by interacting with the kernel and other programs, there are very, very few differences between operating systems. Assembly exposes the soul of the machine: you can work as you like, and inside your own code you are not bound by any particular calling convention. You have free access to the FPU and SSE units; you can PREFETCH directly to stream data from memory into the L1 cache and make sure it's hot for when you need it; you can munge the stack at will; you can issue INT 3 if you want to interface with a (properly configured; good luck!) external debugger. None of these things depend on your OS. The only real restriction you have is that you are running at Ring 3, not Ring 0, and so some processor control registers will be unavailable to you. (But if you need those, you're writing OS code, not application code.) Other than that, the machine is laid bare to you: go forth and compute!

kquinn
6

Generally speaking, as long as you use the same assembler and the same architecture (for example, NASM and x86-64), you should be able to assemble the same code on both Windows and Mac.

However, it is important to keep in mind that the executable formats and the execution environments may differ. As an example, Windows might emulate/handle certain privileged instructions differently from Mac, causing different behavior.
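
For instance, with NASM the same source file can be assembled into each platform's object format just by changing the output flag (a sketch; see NASM's docs for the full list of formats):

    nasm -f elf32 foo.asm    ; Linux (ELF object)
    nasm -f macho foo.asm    ; Mac OS X (32-bit Mach-O object)
    nasm -f win32 foo.asm    ; Windows (COFF object)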

Matthew Iselin
3

Also a big part of the difference is in how the program communicates with the outside world.

For example, if you want to display a message to the user, or read a file, or allocate more memory, you have to ask the OS to do it by making some kind of system call. That'll be quite different between OSes.
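
For instance, even exiting a program differs between kernels: 32-bit Linux takes system call arguments in registers, while FreeBSD and 32-bit OS X expect them on the stack (a rough sketch):

    ; 32-bit Linux: arguments in registers
    mov eax, 1        ; SYS_exit
    xor ebx, ebx      ; exit status 0
    int 0x80

    ; 32-bit OS X / FreeBSD: arguments on the stack, plus a
    ; dummy dword where a return address would normally sit
    push dword 0      ; exit status 0
    sub esp, 4        ; dummy return-address slot
    mov eax, 1        ; SYS_exit
    int 0x80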

The language syntax itself should be basically identical as long as you're using the same assembler. Different assemblers sometimes have slightly different syntax ordering or different macros, but nothing that's too hard to get used to.

Colin Coghill
2

The Great Divide in Intel assembly language is between AT&T syntax and Intel syntax. You'll want an assembler for your Mac that uses the same syntax as any tutorials you use. I believe the default toolchain on MacOS Darwin, a BSD variant, uses AT&T syntax, while the Microsoft assembler uses Intel syntax, so you'll need to be careful.

The other difference to beware of is the system's Application Binary Interface (ABI), which covers calling conventions, stack layout, system calls, and so on. These may differ substantially between OSes, especially when it comes to position-independent code and dynamic linking. I have vague unhappy memories that PIC was especially complicated on the PowerPC MacOS, but maybe it's simpler on the Intel Macs.

One piece of advice: learn x86_64 (also known as AMD64). It's a lot more fun to write assembly code by hand, and you'll be better future-proofed.

Norman Ramsey
2

When I dipped into Assembly during one of my programming tourist visits, the gotcha that held me up in every tutorial was not being able to compile in the correct binary format. Most tutorials give elf (for Linux) and aoutb (for BSD), yet with the latter (logical choice?) OS X complains:

ld: hello.o bad magic number (not a Mach-O file)

yet Mach-O fails as a format name too, and man nasm lists only the bin, aout and elf file formats (man ld is no more helpful). The option that produces the Mach-O format for OS X is macho:

nasm -f macho hello.asm
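
(You'll also need a link step to turn the object file into an executable; on 32-bit OS X of that era something like the following worked, assuming your entry symbol is the start label ld expects by default. Newer toolchains may demand extra flags such as -lSystem:)

    ld -o hello hello.o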

I wrote up the journey here (includes a link to a nice TextMate bundle for Assembly and other info), but - to be brief - the above is what you need to get started.

Dave Everitt
2

If you already knew asm details for Mac and Windows, some tutorials could be ported easily. But if you need to learn what the tutorial is trying to teach in the first place, you won't know which parts need porting! The same goes for different bitness: 64-bit code uses different calling conventions from 32-bit code on the same OS, so make sure you're building your code for the same mode as the tutorial. For example, for a 32-bit tutorial on modern GNU/Linux, build with gcc -m32 -no-pie -fno-pie foo.s

You need a tutorial for the OS and mode you're working with (whether that's in an emulator or VM, or native). And some random source you find using int 21h DOS or Linux int 0x80 system calls has zero chance of working if you copy/paste it into a program for any other OS.

(*BSD and MacOS do use int 0x80 for 32-bit system calls, but with a different calling convention. Nothing else uses DOS/BIOS calls; unfortunately in some ways, cursor movement isn't as simple under modern OSes as it was in real mode bootloaders and DOS, nor is raw keyboard input without echo to the screen or waiting for a newline.)

Assembly language is incredibly non-portable, in terms of symbol-name conventions and system-call and library calling conventions / ABIs.

Don't think of it like C or Python as a single language for writing programs. It's a language for generating machine code (and binary data in non-code sections) from human-readable text source. That's the part that's the same everywhere, for a given assembler (like NASM or GAS).

The machine code (and symbol metadata) that's appropriate for one OS is not appropriate for another OS.

But for pure computation, with the same assembler (and thus same syntax), for the same mode, yes, things are pretty much the same: div edx will always cause a #DE divide exception under any OS for example.

Different assemblers have significant differences in syntax, e.g. MASM mov eax, foo is a load from that symbol in memory, but NASM mov eax, foo puts the pointer in a register with mov-immediate.

This is all well and good as long as you realize that MASM code might look like the NASM code you're working with (just different directives), but the same instruction could mean something different.

(That implicit deref in MASM even without [], and in general MASM's weird handling of [], is one of the few cases where the same instruction could assemble to mean something different. movzx eax, byte [foo] is another: in MASM, byte expands to 1, so it's movzx eax, 1[foo] aka [foo+1].)
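
A side-by-side sketch of the implicit-deref difference, assuming foo is a dword variable in the data section:

    ; NASM
    mov eax, foo          ; EAX = address of foo (mov-immediate)
    mov eax, [foo]        ; EAX = dword loaded from foo

    ; MASM
    mov eax, foo          ; EAX = dword loaded from foo (implicit deref)
    mov eax, OFFSET foo   ; EAX = address of foo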


Once you have a pointer to an array and a length, and know which registers you're allowed to modify without saving/restoring, you can write code that sums the array and will work on any OS, with just adaptation to the calling convention and name mangling (leading underscores or not: C int foo(int *arr, size_t len) might be foo or _foo in asm depending on the OS you're targeting).
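
A minimal sketch of that function for the x86-64 System V ABI (Linux and macOS pass arr in RDI and len in RSI; on macOS the symbol would need the leading underscore):

    global foo
    section .text
    foo:                      ; int foo(int *arr, size_t len)
        xor eax, eax          ; sum = 0
        test rsi, rsi         ; empty array?
        jz .done
    .loop:
        add eax, [rdi]        ; sum += *arr
        add rdi, 4            ; ++arr
        dec rsi               ; --len
        jnz .loop
    .done:
        ret                   ; result in EAX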

Agner Fog has a good document about calling conventions, as part of his x86 asm optimization guides.

So once you get beyond the super basics of how to write working functions for your system with your toolchain, things like Agner Fog's asm optimization guide are useful for anyone. The most work you'd have to do is mentally port from Intel syntax to AT&T, if you're stuck using that.

x86 CPUs run the same machine code; different source syntax doesn't change that. (And all the mainstream syntaxes can express everything that machine code can do; that's where the limits on what can go into a single addressing mode come from, for example.)

See also: Stack Overflow tag wikis for links to guides and manuals

Peter Cordes