x86_64 Assembly Linux System Call Confusion

Question

I am currently learning assembly language on Linux. I have been using the book 'Programming From the Ground Up', and all of its examples are 32-bit. My OS is 64-bit, and I have been trying to do all the examples in 64-bit. However, I am having trouble:

.section .data

.section .text
.global _start
_start:
movq $60, %rax
movq $2, %rbx
int $0x80

This should simply call the Linux exit system call. Instead, it causes a segfault. When I do this instead:

.section .data

.section .text
.global _start
_start:
movq $1, %rax
movq $2, %rbx
int $0x80

it works. Clearly the problem is the value I move to %rax. The value $1 that I use in the second example is what 'Programming From the Ground Up' said to use; however, multiple sources on the Internet say that the 64-bit system call number for exit is $60. What am I doing wrong? Also, what other issues should I watch out for, and what should I use as a reference? In case it matters, I am on Chapter 5 of Programming From the Ground Up.

Basically a duplicate: [What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?](https://stackoverflow.com/q/46087730) - `int $0x80` still invokes the 32-bit ABI, using 32-bit registers and call numbers. Really, just use [Assembling 32-bit binaries on a 64-bit system (GNU toolchain)](https://stackoverflow.com/q/36861903) to follow a 32-bit tutorial. – Peter Cordes, Feb 16 '21 at 04:41

score 20 · Accepted Answer · edited Aug 02 '17 at 07:24

You're running into one surprising difference between i386 and x86_64: they don't use the same system call mechanism. The correct code is:

movq $60, %rax
movq $2,  %rdi   # not %rbx!
syscall

Interrupt 0x80 always invokes 32-bit system calls. It's used to allow 32-bit applications to run on 64-bit systems.
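
To make the segfault in the question concrete, here is a rough annotated sketch of what the first program appears to do when assembled as 64-bit code. It assumes a kernel built with 32-bit emulation (otherwise `int $0x80` is not available at all), and the file name is only a placeholder. In the 32-bit table, number 60 is `umask`, not `exit`, so the call returns and execution carries on past it.

```asm
# why_segv.s (hypothetical file name)
# Build as ordinary 64-bit code:
#   as -o why_segv.o why_segv.s && ld -o why_segv why_segv.o
.section .text
.global _start
_start:
    # The question's first attempt, unchanged:
    movq $60, %rax      # int $0x80 uses the 32-bit table, where 60 is umask
    movq $2, %rbx       # the 32-bit ABI reads its first argument from %ebx: umask(2)
    int $0x80           # umask returns to the next instruction; nothing has exited

    # In the question nothing follows the int, so the CPU runs off the end
    # of the code and the process dies with SIGSEGV. The native 64-bit exit:
    movq $60, %rax      # 60 is exit in the 64-bit table
    movq $2, %rdi       # first argument goes in %rdi
    syscall
```

Run it and then `echo $?`; the exit status should be 2. The question's second program works for the mirror-image reason: 1 is `exit` in the 32-bit table, and `%ebx` is where that ABI expects the status.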

For the purposes of learning, you should probably try to follow the tutorial exactly, rather than translating on the fly to 64-bit -- there are a few other significant behavioral differences that you're likely to run into. Once you're familiar with i386, then you can pick up x86_64 separately.

edited Aug 02 '17 at 07:24

Yuhong Bao

answered Dec 14 '11 at 19:27

I am probably going to end up doing that. Thank you for your response. – Hudson Worden Dec 14 '11 at 19:32
One should use `%rdi` for the first system call argument, not `%rbx`. – Adam Zalcman Dec 14 '11 at 20:18
`sysenter` doesn't work on my system. The correct instruction is `syscall`. – hpsMouse Dec 27 '12 at 05:39
@hpsMouse, `syscall` is the AMD-originated version, while `sysenter` is the Intel-originated version. More details here: http://wiki.osdev.org/SYSENTER – Dan Lenski Jul 20 '15 at 06:28
To clarify, syscall is always used for native x86-64 calls (not 32-bit compatibility mode). Intel had to implement it. – Yuhong Bao Aug 02 '17 at 07:25

score 14 · Answer 2 · edited May 23 '17 at 12:00

Please read "What are the calling conventions for UNIX & Linux system calls on x86-64".

Note that using int 0x80 for system calls on x86-64 systems goes through an old compatibility layer. You should use the syscall instruction on x86-64 systems.

You can still use the old method, but you need to build your binaries as 32-bit (x86) code; see your compiler/assembler manual for details.
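
For the GNU toolchain, a minimal sketch of the 32-bit route might look like the following. The file name is a placeholder, and it assumes your kernel can run 32-bit binaries. `--32` tells `as` to produce a 32-bit object, and `-m elf_i386` tells `ld` to link it as one.

```asm
# exit32.s (hypothetical file name)
# Build as a 32-bit binary on a 64-bit system:
#   as --32 -o exit32.o exit32.s
#   ld -m elf_i386 -o exit32 exit32.o
.section .text
.global _start
_start:
    movl $1, %eax       # 1 is exit in the 32-bit system call table
    movl $2, %ebx       # first argument (exit status) goes in %ebx
    int $0x80           # 32-bit system call entry
```

Built this way, the book's examples can be typed in as printed, and `echo $?` after running this one should show 2.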

edited May 23 '17 at 12:00

answered Dec 14 '11 at 19:26

zed_0xff

Glad to see someone pointing out that x86_64 Linux actually uses `syscall` not `sysenter`! I wrote a [longer answer to explain the confusion between the two](//stackoverflow.com/a/31510342/20789). – Dan Lenski Jul 20 '15 at 06:56

score 7 · Answer 3 · answered Jul 20 '15 at 06:53

duskwuff's answer points out correctly the mechanism for system calls is different for 64-bit x86 Linux versus 32-bit Linux.

However, that answer is incomplete and misleading for a couple of reasons:

1. The change was actually introduced before 64-bit systems became popular, motivated by the observation that int 0x80 was very slow on Pentium 4. Linus Torvalds coded up a solution using the SYSENTER/SYSEXIT instructions (which had been introduced by Intel around the Pentium Pro era, but which were buggy and gave no practical benefit). So modern 32-bit Linux systems actually use SYSENTER, not int 0x80.
2. 64-bit x86 Linux kernels do not actually use SYSENTER and SYSEXIT. They use the very similar SYSCALL/SYSRET instructions.

As pointed out in the comments, SYSENTER does not actually work on many 64-bit Linux systems—namely 64-bit AMD systems.

It's an admittedly confusing situation. The gory details are here, but what it comes down to is this:

For a 32bit kernel, SYSENTER/SYSEXIT are the only compatible pair [between AMD and Intel CPUs]

For a 64bit kernel in Long mode only… SYSCALL/SYSRET are the only compatible pair [between AMD and Intel CPUs]

It appears that on an Intel CPU in 64-bit mode you can get away with using SYSENTER because it does the same thing as SYSCALL; however, this is not the case on AMD systems.

Bottom line: always use SYSCALL on Linux on 64-bit x86 systems. It's what the x86-64 ABI actually specifies. (See this great wiki answer for even more details.)

My understanding is that modern AMD CPUs support `sysenter` from 32-bit user-space (into a 64-bit kernel). But in any case, you shouldn't use `sysenter` directly; it's not officially supported. Either use `int $0x80` for simple beginner code, or `call` into the VDSO to let the kernel-supplied code run `sysenter` or whatever is most efficient on your CPU. https://blog.packagecloud.io/eng/2016/04/05/the-definitive-guide-to-linux-system-calls/ — Peter Cordes, Feb 16 '21 at 04:45

score 5 · Answer 4 · edited May 23 '17 at 12:00

Quite a lot has changed between i386 and x86_64 including both the instruction used to go into the kernel and the registers used to carry system call arguments. Here is code equivalent to yours:

.section .data

.section .text
.global _start
_start:
movq $60, %rax
movq $2, %rdi
syscall

Quoting from this answer to a related question:

The syscall numbers are in the Linux source code under arch/x86/include/asm/unistd_64.h. The syscall number is passed in the rax register. The parameters are in rdi, rsi, rdx, r10, r8, r9. The call is invoked with the "syscall" instruction. The syscall overwrites the rcx register. The return is in rax.
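
As a slightly larger illustration of that convention, here is a minimal sketch that uses three arguments and then exits. The label names `msg` and `msg_len` and the file name are my own; 1 (`write`) and 60 (`exit`) are the x86-64 numbers. Besides `rcx`, the `syscall` instruction also overwrites `r11`, so neither register should hold anything you need across the call.

```asm
# hello64.s (hypothetical file name)
# Build:  as -o hello64.o hello64.s && ld -o hello64 hello64.o
.section .data
msg:
    .ascii "hello\n"
    .equ msg_len, . - msg       # length of the string in bytes

.section .text
.global _start
_start:
    movq $1, %rax               # system call number: write
    movq $1, %rdi               # arg 1: file descriptor 1 (stdout)
    leaq msg(%rip), %rsi        # arg 2: address of the buffer
    movq $msg_len, %rdx         # arg 3: number of bytes to write
    syscall                     # result (bytes written or -errno) comes back in %rax

    movq $60, %rax              # system call number: exit
    movq $0, %rdi               # arg 1: exit status
    syscall
```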

edited May 23 '17 at 12:00

answered Dec 14 '11 at 19:42

Adam Zalcman

I am also working through the Shellcoder's Handbook 2, and my OS is also 64-bit. I assemble with nasm using -f elf32 and link with -melf_i386. I use gdb and also Evan's Debugger (edb; however, it is wrongly compiled for 64-bit and it prompts after loading a binary), but inside edb, inside gdb, and in normal execution the error is the same: a segfault at an instruction AFTER int 0x80. And I only want to EXIT. I have seen that the (r)ax register must contain 0x3c instead of 0x01 (because the syscall file for 64-bit is used), but I cannot imagine why the instruction after int 0x80 is executed. Does anyone know why? – icbytes May 15 '15 at 16:33

score 3 · Answer 5 · edited Oct 01 '12 at 06:51

If you check /usr/include/asm/unistd_32.h, exit corresponds to 1, but in /usr/include/asm/unistd_64.h, exit corresponds to 60.
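
If you would rather not hard-code the numbers, one option (a sketch, assuming the kernel headers are installed and that you let gcc run the C preprocessor over a `.S` file) is to include the header and use its `__NR_` names. On an x86-64 build, `<asm/unistd.h>` should pull in the `unistd_64.h` definitions. The file name is a placeholder.

```asm
/* exit.S (hypothetical file name; the capital .S makes gcc preprocess it)
 * Build:  gcc -nostdlib -static -o exit exit.S
 */
#include <asm/unistd.h>         /* defines __NR_exit and friends for the target ABI */

.section .text
.global _start
_start:
    movq $__NR_exit, %rax       /* expands to $60 when building for x86-64 */
    movq $2, %rdi               /* exit status */
    syscall
```

The same source would pick up 1 for `__NR_exit` if preprocessed for 32-bit, but the registers and the `syscall` instruction would still have to change, so the names alone do not make the code portable between the two ABIs.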

edited Oct 01 '12 at 06:51

answered Sep 29 '12 at 15:43

Calamar

Yes, and `int 0x80` always uses unistd_32 call numbers (and 32-bit registers). Only `syscall` in 64-bit code uses unistd_64.h call numbers and RDI, RSI, ... arg registers. – Peter Cordes Jun 14 '21 at 12:55
