
Here is the code (exit.s):

.section .data,
.section .text,
.globl _start
_start:
    movl $1, %eax
    movl $32, %ebx
    syscall

When I execute "as exit.s -o exit.o && ld exit.o -o exit -e _start && ./exit",

the result is "Bus error: 10" and the output of "echo $?" is 138.

I also tried the example from the accepted answer to this question: Process command line in Linux 64 bit

I still get "Bus error".

Peter Cordes
springrider
  • possible duplicate of [GNU Assembler (Mac OS X 64-bit): Illegal instruction: 4](http://stackoverflow.com/questions/11178313/gnu-assembler-mac-os-x-64-bit-illegal-instruction-4) – Alexey Frunze Jun 24 '12 at 20:54
  • Also [How to get this simple assembly to run?](//stackoverflow.com/a/34191324) is a new duplicate that also explains how to make OS X system calls. I'm not sure if one answer is better / more useful than the other. – Peter Cordes Jul 09 '19 at 04:30

2 Answers


First, you are using the old 32-bit Linux kernel calling convention (syscall number in %eax, first argument in %ebx) on Mac OS X, where it does not work.

Second, syscalls in Mac OS X are structured differently: each one has a leading class identifier followed by a syscall number. The class can be Mach, BSD or something else (see here in the XNU source) and is shifted 24 bits to the left. Normal BSD syscalls have class 2 and thus start at 0x2000000, so exit (BSD syscall 1) is 0x2000001. Syscalls in class 0 are invalid, which is the class your plain $1 falls into.
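To make the class/number split concrete, here is a minimal C sketch that decodes a raw syscall id into its two parts. The helper names `xnu_class_of` and `xnu_number_of` and the `XNU_*` constants are hypothetical and exist only for illustration. The shift of 24 and the 8-bit class field are taken from the description above.

```c
#include <stdint.h>

/* Illustrative constants (hypothetical names): the class tag sits in
 * bits 24..31, the per-class syscall number in bits 0..23. */
#define XNU_CLASS_SHIFT 24
#define XNU_CLASS_MASK  (0xFFu << XNU_CLASS_SHIFT)

/* Hypothetical helper: extract the class tag (0 = invalid, 1 = Mach, 2 = BSD). */
uint32_t xnu_class_of(uint32_t raw_id)
{
    return (raw_id & XNU_CLASS_MASK) >> XNU_CLASS_SHIFT;
}

/* Hypothetical helper: extract the syscall number within its class. */
uint32_t xnu_number_of(uint32_t raw_id)
{
    return raw_id & ~XNU_CLASS_MASK;
}
```

With these helpers, `xnu_class_of(0x2000001)` yields 2 (BSD) and `xnu_number_of(0x2000001)` yields 1 (exit), whereas the `$1` from the question decodes to class 0, which is invalid.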

As per §A.2.1 of the SysV AMD64 ABI, which Mac OS X also follows, the syscall id (together with its class on XNU!) goes in %rax (or in %eax, as the high 32 bits are unused on XNU). The first argument goes in %rdi, the second in %rsi, then %rdx, %r10, %r8 and %r9. The syscall instruction overwrites %rcx (with the return address) and %r11 (with the saved flags), so the fourth argument cannot stay in %rcx. That is why the syscall wrappers in libc.dylib copy %rcx into %r10 before making syscalls (similarly to the kernel_trap macro from syscall_sw.h).

Third, code sections in Mach-O binaries are called __text, not .text as in Linux ELF, and they reside in the __TEXT segment. Together this is referred to as (__TEXT,__text); see the Mac OS X ABI Mach-O File Format Reference. (nasm automatically translates .text as appropriate if Mach-O is selected as the target object type.) Even if you get the assembly instructions right, putting them in the wrong segment/section leads to a bus error. You can use the .section __TEXT,__text directive (see here for directive syntax), use the simpler .text directive, or drop the directive altogether, since the text section is assumed if no -n option was supplied to as (see the manpage of as).

Fourth, the default entry point for the Mach-O ld is called start (although, as you've already figured out, it can be changed via the -e linker option).

Given all the above you should modify your assembler source to read as follows:

# You could also add one of the following directives for completeness:
#   .text
# or
#   .section __TEXT,__text

.globl start
start:
    movl $0x2000001, %eax   # BSD class (2 << 24) | syscall number 1 (exit)
    movl $32, %edi          # first argument: the exit status
    syscall

Here it is, working as expected:

$ as -o exit.o exit.s; ld -o exit exit.o
$ ./exit; echo $?
32
Hristo Iliev
  • Thanks for the help! Could I use -arch i386 to make the 32-bit assembly program work? I tried it but got "Illegal instruction: 4", just like in the question referred to by Alex, http://stackoverflow.com/questions/11178313/gnu-assembler-mac-os-x-64-bit-illegal-instruction-4. I am not sure how to compile it for 32-bit. – springrider Jun 25 '12 at 08:42
  • `syscall` is not a valid 32-bit instruction. And the `int $0x80` i386 syscalls use BSD style arguments passing via the stack. See [here](http://osxbook.com/blog/2009/03/15/crafting-a-tiny-mach-o-executable/) or dive [here](http://fxr.watson.org/fxr/source/bsd/dev/i386/systemcalls.c?v=xnu-1699.24.8#L92). – Hristo Iliev Jun 25 '12 at 11:15

Adding more explanation of the magic number. I made the same mistake by using the Linux syscall number in my NASM code.

From the xnu kernel sources in osfmk/mach/i386/syscall_sw.h (search for SYSCALL_CLASS_SHIFT):

/*
 * Syscall classes for 64-bit system call entry.
 * For 64-bit users, the 32-bit syscall number is partitioned
 * with the high-order bits representing the class and low-order
 * bits being the syscall number within that class.
 * The high-order 32-bits of the 64-bit syscall number are unused.
 * All system classes enter the kernel via the syscall instruction.
 */

Syscalls are partitioned:

#define SYSCALL_CLASS_NONE  0   /* Invalid */
#define SYSCALL_CLASS_MACH  1   /* Mach */  
#define SYSCALL_CLASS_UNIX  2   /* Unix/BSD */
#define SYSCALL_CLASS_MDEP  3   /* Machine-dependent */
#define SYSCALL_CLASS_DIAG  4   /* Diagnostics */

As we can see, the class for BSD system calls is 2, so the magic number 0x2000000 is constructed as:

// 2 << 24
#define SYSCALL_CONSTRUCT_UNIX(syscall_number) \
            ((SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | \
             (SYSCALL_NUMBER_MASK & (syscall_number)))
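The macro above relies on constants defined next to it in the same header. The sketch below restates them so that it compiles on its own. The values for `SYSCALL_CLASS_SHIFT`, `SYSCALL_CLASS_MASK` and `SYSCALL_NUMBER_MASK` are written from my reading of syscall_sw.h, with a `u` suffix added to keep the arithmetic unsigned, so treat them as an assumption and check them against your copy of the source. The wrapper `bsd_syscall_id` is a hypothetical name used only for illustration.

```c
#include <stdint.h>

/* Restated from osfmk/mach/i386/syscall_sw.h (assumed values, unsigned here). */
#define SYSCALL_CLASS_SHIFT  24
#define SYSCALL_CLASS_MASK   (0xFFu << SYSCALL_CLASS_SHIFT)
#define SYSCALL_NUMBER_MASK  (~SYSCALL_CLASS_MASK)
#define SYSCALL_CLASS_UNIX   2u   /* Unix/BSD */

#define SYSCALL_CONSTRUCT_UNIX(syscall_number) \
            ((SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | \
             (SYSCALL_NUMBER_MASK & (syscall_number)))

/* Hypothetical wrapper: turn a plain BSD syscall number into the value
 * that has to be loaded into %eax before the syscall instruction. */
uint32_t bsd_syscall_id(uint32_t number)
{
    return SYSCALL_CONSTRUCT_UNIX(number);
}
```

For example, `bsd_syscall_id(1)` gives 0x2000001, the value used for exit in the accepted answer, and `bsd_syscall_id(4)` gives 0x2000004 for write.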

As for why exit ends up in the BSD class: XNU is a hybrid kernel that combines Mach with a BSD layer, and the Unix-style system calls are implemented by the BSD layer, so they carry the Unix/BSD class tag.

Inspired by the original answer.

Izana