
I am running my code in a 64-bit Linux environment where the Linux Kernel is built with IA32_EMULATION and X86_X32 disabled.

The very first program in the book Programming from the Ground Up is supposed to do nothing except exit, but on my system it produces a segfault:

.section .data

.section .text
.globl _start
_start:
movl $1, %eax
movl $0, %ebx

int $0x80

I converted the code to use 64-bit registers, but it also segfaults:

.section .data

.section .text
.globl _start
_start:
movq $1, %rax
movq $0, %rbx

int $0x80

I assembled both these programs like this:

as exit.s -o exit.o
ld exit.o -o exit

Running ./exit gives Segmentation fault for both. What am I doing wrong?

P.S. I have seen a lot of tutorials assembling code with gcc, however I'd like to use gas.

Update

Combining comments and the answer, here's the final version of the code:

.section .data
.section .text

.globl _start
_start:

movq $60, %rax
xor  %rdi, %rdi          # exit status goes in %rdi under the 64-bit syscall convention

syscall
Michael Petch
Cág
  • Does `int 0x80` still exist in x86_64? I thought it had been replaced with the `syscall` instruction. – Pierre Apr 15 '16 at 08:46
  • @Pierre replacing `int 0x80` with `syscall` gives me `Segmentation fault`, too. – Cág Apr 15 '16 at 08:50
  • @MichaelPetch I use custom kernels and disable i386 emulation, as I never run 32-bit binaries. – Cág Apr 15 '16 at 09:13
  • That is the reason you are segfaulting. Most distros turn on IA32 emulation. If you turn it off you can't use `int 0x80` in 64-bit code. You'll need to use the 64-bit `syscall` instruction and convention. A good starting point for that is http://blog.rchapman.org/post/36801038863/linux-system-call-table-for-x86-64 . If you turn on IA32 emulation you can use `int 0x80` in 64-bit code, but it is limited to 32-bit operands and 32-bit pointers. The upper 32 bits aren't used with IA32 emulation. One downside is that stack-based pointers can't be passed to `int 0x80`. – Michael Petch Apr 15 '16 at 09:15
  • When writing x86-64 code the `syscall` instruction is preferred. The calling convention and numbering are different from `int 0x80`. – Michael Petch Apr 15 '16 at 09:18
  • `mov $1, %rax` does *exactly* the same thing as `mov $1, %eax`, except that it takes more bytes of machine code to encode `mov r/m64, sign-extended-imm32`. – Peter Cordes Apr 15 '16 at 09:26
  • Using `gcc -nostartfiles` to link your executable is just another way to run `as` and `ld`. Use `gcc -v` to see what it does. – Peter Cordes Apr 15 '16 at 09:31
  • Run `strace ./a.out` to log the system calls being made. This won't change the behaviour of either snippet. Without IA32 support, you'll probably just see a segfault with no system calls. – Peter Cordes Apr 15 '16 at 10:06
  • @MichaelPetch I ran it before posting, right after the first time I saw it. Both `IA32_EMULATION` and `X86_X32` are not set. – Cág Apr 15 '16 at 10:06
  • Better: `mov $60, %eax` / `xor %edi,%edi` / `syscall`. Compare the machine code (`objdump -d`) to see why: same effect with fewer bytes. – Peter Cordes Apr 15 '16 at 10:08
  • But I'm asking if it segfaulted. Your question suggests the first code snippet you gave us actually exited (that sounds like you were saying it didn't segfault). I'm asking to make sure your question says what you actually observe with the code snippets you gave. If you have IA32 emulation off, BOTH your code snippets should have failed with a segfault right on `int 0x80`. As it stands your question says the first code snippet exited, and the second segfaulted. I don't believe that is accurate if you in fact have IA32 emulation off. – Michael Petch Apr 15 '16 at 10:09
  • @MichaelPetch it did segfault. – Cág Apr 15 '16 at 10:11
  • @MichaelPetch thank you. It clarifies the question. – Cág Apr 15 '16 at 10:19

3 Answers


int $0x80 is the 32bit ABI. On normal kernels (compiled with IA32 emulation), it's available in 64bit processes, but you shouldn't use it because it only supports 32bit pointers, and some structs have a different layout.

See the tag wiki for info on making 64bit Linux system calls (and also zx485's answer on this question). There are many differences, including the fact that the syscall instruction clobbers %rcx and %r11, unlike the int $0x80 ABI.
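As a rough illustration of that clobbering (a sketch, not taken from the tag wiki): the snippet below keeps a value in %r12 across a syscall, because a copy left in %rcx would be overwritten. The choice of sys_getpid (number 39 in the x86-64 table) and of the value 7 is purely for illustration.

```asm
.section .text
.globl _start
_start:
    mov  $7, %ecx             # a value we would like to keep...
    mov  %ecx, %r12d          # ...so stash a copy in %r12, which syscall preserves
    mov  $39, %eax            # 39 = sys_getpid, a harmless call with no arguments
    syscall                   # result in %rax; %rcx and %r11 are overwritten here
    mov  $60, %eax            # 60 = sys_exit
    mov  %r12d, %edi          # exit status comes from the saved copy, not from %rcx
    syscall
```

Assembled and linked with as and ld as in the question, running it followed by `echo $?` should print 7.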

In a kernel without IA32 emulation, like yours, running int $0x80 is probably the same as running any other invalid software interrupt, like int $0x79. Single-stepping int $0x79 in gdb (on my 64bit 4.2 kernel, which does include IA32 emulation) results in a segfault on that instruction.

In other words, the int instruction itself faults. It doesn't return and keep executing whatever garbage bytes follow as instructions (which would also result in a SIGSEGV or SIGILL), or keep executing until it jumps to (or reaches normally) an unmapped page. If it did return, that would be the mechanism for segfaulting.

You can run a process under strace, e.g. strace /bin/true --version, to make sure it's making the system calls you thought it would. You can also use gdb to see where a program segfaults. Using a debugger is essential, more so than in most languages, because the failure mode in asm is usually just a segfault.

Peter Cordes
  • [What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?](https://stackoverflow.com/q/46087730) is a more complete canonical version of this answer. – Peter Cordes May 01 '21 at 08:37

The first observation is that the code in both of your examples effectively does the same thing; the two are just encoded differently. The site x86-64.org has some good information for those starting out with x86-64 development. The first code snippet that uses 32-bit registers is equivalent to the second because of Implicit Zero Extend:

Implicit zero extend

Results of 32-bit operations are implicitly zero extended to 64-bit values. This differs from 16 and 8 bit operations, that don't affect the upper part of registers. This can be used for code size optimisations in some cases, such as:

movl $1, %eax                 # one byte shorter movq $1, %rax
xorq %rax, %rax       # three byte equivalent of mov $0,%rax
andl $5, %eax         # equivalent for andq $5, %eax
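
A small sketch of the same rule in a runnable form (my own illustration, not from x86-64.org; the constants are arbitrary). It only exits cleanly if the 32-bit write really clears the upper half of %rax, because otherwise %rax would not hold a valid system call number:

```asm
.section .text
.globl _start
_start:
    mov  $-1, %rax            # set all 64 bits of %rax
    mov  $60, %eax            # 32-bit write zero-extends: %rax is now exactly 60
    mov  $-1, %rdi            # set all 64 bits of %rdi
    mov  $5, %dil             # 8-bit write: upper 56 bits of %rdi are left untouched
    movzbl %dil, %edi         # explicit zero-extend of the low byte: %rdi = 5
    syscall                   # sys_exit(5) using the 64-bit convention
```

Running the result and then `echo $?` should print 5.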

The question is, why does this code segfault? Had you run it on a typical x86-64 Linux distro, your code would likely have exited as expected without a segfault. Your code is failing because you are using a custom kernel with IA32 emulation off.

IA32 emulation in the Linux kernel allows you to use the 32-bit int 0x80 interrupt to make calls through the traditional 32-bit system call mechanism. This is an emulation layer, and it doesn't support passing pointers that can't be represented in a 32-bit register. That is the case for stack-based pointers in a 64-bit process, since the stack sits outside the low 4 GB of the address space and can't be reached with a 32-bit pointer.

Your system has IA32 emulation off, and because of that int 0x80 doesn't exist for backwards compatibility. The result is that the int 0x80 interrupt will throw a segmentation fault and your application will fail.

In x86-64 code it is preferred that you use the syscall instruction to make system calls to the 64-bit Linux kernel. This mechanism supports 64-bit operands and pointers where necessary. Ryan Chapman's site has some good information on the 64-bit SYSCALL interface which differs considerably from the 32-bit int 0x80 mechanism.

Your code could have been written this way to work in a 64-bit environment without IA32 emulation:

.section .text

.globl _start
_start:

mov  $60, %eax        # system call number 60 = sys_exit
xor  %edi, %edi       # exit status 0 goes in %rdi (the first argument), not %rbx
syscall

Other useful information on 64-bit development can be found in the 64-bit System V ABI. That document also better describes the general syscall convention used by the Linux kernel, including side effects, in Section A.2. It is also very informative if you wish to interface with third-party libraries and modules (like the C library).

Michael Petch
  • That's funny, the example on x86-64.org has the byte savings mis-counted. `gas` assembles `48 c7 c0 01 00 00 00 mov $0x1,%rax` (7B), using the `REX mov r/m64, imm32(sign-extended)` form. And `b8 01 00 00 00 mov $0x1,%eax` (5B), using the `mov r32, imm32` form that encodes the dest reg into the opcode, instead of a mod/rm byte. (The REX form of that is `movabs r64, imm64`, which is why the `mov r64, imm32` has to use the `r/m64` encoding.) Nice answer though. Mine just explained exactly why it faulted, not what to do instead. – Peter Cordes Apr 15 '16 at 11:10

The reason is that the Linux System Call Table for x86_64 is different from the table for x86.

In x86, SYS_EXIT is 1. In x64, SYS_EXIT is 60 and 1 is the value for SYS_WRITE, which, if called, expects a const char *buf in %RSI. If that buffer pointer is invalid, it probably segfaults.

%rax  System call  %rdi             %rsi             %rdx          %r10  %r8  %r9
1     sys_write    unsigned int fd  const char *buf  size_t count
60    sys_exit     int error_code
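
A minimal sketch of how the two rows above map onto registers with the 64-bit `syscall` instruction (the label names and the message are made up for illustration):

```asm
.section .rodata
msg:
    .ascii "hello\n"
    .equ msg_len, . - msg     # length of the message in bytes

.section .text
.globl _start
_start:
    mov  $1, %eax             # %rax = 1: sys_write
    mov  $1, %edi             # %rdi = fd 1 (stdout)
    lea  msg(%rip), %rsi      # %rsi = buf, a full 64-bit pointer
    mov  $msg_len, %edx       # %rdx = count
    syscall

    mov  $60, %eax            # %rax = 60: sys_exit
    xor  %edi, %edi           # %rdi = error_code 0
    syscall
```

It can be assembled and linked with the same as and ld commands used in the question.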
zx485
  • This is true. However if a kernel is built with IA32 emulation (most are unless you build a custom one turning it off), `int 0x80` is emulated. `int 0x80` in 64-bit code is limited to using only 32-bit operands (obviously) but still matches the IA32 `int 0x80` convention. I queried the OP in the comments and he confirms he turned off IA32 emulation. Had he turned it on this code would have worked. Of course `syscall` is the preferred method in 64-bit code as you can pass 64-bit pointers and other 64-bit operands. In his case the segfault is because `int 0x80` is not emulated by his kernel. – Michael Petch Apr 15 '16 at 09:21
  • Thanks for pointing out. The result code is `movq $60, %rax` instead of `movq $1, %rax` and `syscall` instead of `int $0x80`. – Cág Apr 15 '16 at 09:25
  • Note that `write(2)` returns `-EFAULT` if you pass it a bad pointer. @Cág: This is why you should use `strace` to see what your program does. Probably your system call returns, and then execution continues beyond the `int $0x80`, and the data there, when decoded as x86 instructions, causes a segfault. Or execution continues until instruction fetch itself causes a segfault. This answer is useful but doesn't explain the observed behaviour at all, because the 32bit `int $0x80` ABI is still accessible from 64bit processes, as Michael points out. – Peter Cordes Apr 15 '16 at 09:28
  • @PeterCordes : You have missed something. In this case the OP's code doesn't work because his kernel was built as a custom kernel with IA32 emulation off. His code is failing because `int 0x80` isn't supported at all in his environment. My first comments to his question queried whether he happened to build a custom kernel with IA32 emulation off, and that is the case. This question is a dupe of another, I just can't find it. Recently someone else ran a custom kernel with IA32 off and had the same issue. He has no choice but to use `syscall` or rebuild his kernel with IA32 emulation on. – Michael Petch Apr 15 '16 at 09:32
  • @MichaelPetch: yeah, I remember that question. And no, I didn't miss that, I just didn't find a good way to say everything I meant. But obviously the only way for the kernel to return from `int $0x80` / `eax=1` is if IA32 support is not present. (Or maybe the `int $0x80` itself segfaults directly in that case, IDK). – Peter Cordes Apr 15 '16 at 09:46
  • @MichaelPetch `syscall` though was not the only error in the original code; I used `sys_write` instead of `sys_exit`. – Cág Apr 15 '16 at 09:47
  • @Cág : Your code shown in your question doesn't use the `syscall` instruction; it uses `int 0x80`, which is a different mechanism. There is a part of me that believes if you took the first example code you posted here as is and built it (and ran it), and you have IA32 emulation off, it should segfault (not exit gracefully). The 2 examples you posted in your question should have both failed (I'm pretty sure) in an environment with IA32 emulation off. – Michael Petch Apr 15 '16 at 09:49
  • Thanks for the additional information. @Peter Cordes: You may be right about `-EFAULT`, but from my experience programming linux assembly, I (nearly?) always get a segfault returned on command line. Maybe the output lacks differentiation, I never investigated that. If the `int 0x80` just IRETs and continues afterwards to nowhere or if it's an issue with the IDT, IDK either. – zx485 Apr 15 '16 at 09:54
  • @zx485: most SO questions where people screw this up segfault because execution continues after something they expected to exit, not because `write(2)` itself faults. I'm 100% sure about this, having tested myself single-stepping in GDB. System call man pages always document returning `-EFAULT` for bad pointers, not actually raising a `SIGSEGV`. AFAIK the only system call that can directly generate a `SIGSEGV` is `kill(self, SIGSEGV)` :P – Peter Cordes Apr 15 '16 at 10:00
  • @MichaelPetch: I'm *sure* it would not exit gracefully. Assuming that disabling IA-32 support doesn't leave any kind of vestigial handler, the IDT entry for `0x80` is probably the same as the entry for e.g. `int $0x79`. In a 64bit process on my normal 4.2 kernel, single stepping `int $0x79` in gdb generates a SIGSEGV. If it does return at all, then it would just continue on into bogus instructions. – Peter Cordes Apr 15 '16 at 10:04