
I'm trying to create a trivial console assembly program with arguments. Here's the code:

.section __TEXT,__text

.globl _main

_main:

    movl    $0, %edi
    callq   _exit

Here's the compile and link script:

as test.s -o test.o
ld test.o -e _main -o test -lc 

Now the program either fails with a segmentation fault, or executes without error, depending on the argument count:

$ ./test
Segmentation fault: 11
$ ./test 1
$ ./test 1 2
$ ./test 1 2 3
Segmentation fault: 11
$ ./test 1 2 3 4
$ ./test 1 2 3 4 5
Segmentation fault: 11

And so on.

Under LLDB I see a more informative error:

Process 16318 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x00007fffad14b2fa libdyld.dylib`stack_not_16_byte_aligned_error
libdyld.dylib`stack_not_16_byte_aligned_error:
->  0x7fffad14b2fa <+0>: movdqa %xmm0, (%rsp)
    0x7fffad14b2ff <+5>: int3   

libdyld.dylib`_dyld_func_lookup:
    0x7fffad14b300 <+0>: pushq  %rbp
    0x7fffad14b301 <+1>: movq   %rsp, %rbp

Indeed, if I stop execution at the first instruction of the program, I see that the stack is 16-byte aligned for some argument counts but not for others. Yet the System V ABI for AMD64 states:

The stack pointer holds the address of the byte with lowest address which is part of the stack. It is guaranteed to be 16-byte aligned at process entry.

What am I missing?

Forketyfork
  • Looks like OS X is violating the ABI for some lengths of args. (Hopefully they never claimed to follow exactly that ABI? I thought they did, too, though.) Have you tried without libc, just checking stack alignment yourself (with a debugger) on process entry to a statically-linked binary that exits with a `syscall` directly? (Or set exit status = `rsp & 0xf` and `./test 1 2 3 ; echo $?` to see the stack alignment). – Peter Cordes Sep 20 '17 at 01:48

1 Answer


I guess that on OS X the kernel doesn't guarantee stack alignment on entry to main, so you have to align the stack manually. Fortunately, this is rather easy: just zero out the four least significant bits of the stack pointer. In case you need to fetch the argument vector or other data, make sure to store the original stack pointer somewhere:

_main:
    mov %rsp,%rax    # copy original stack pointer
    and $-16,%rsp    # align stack to 16 bytes
    push %rax        # save original stack pointer
    push %rbp        # establish...
    mov %rsp,%rbp    # ...stack frame
    ...              # business logic here
    leave            # tear down stack frame
    pop %rsp         # restore original stack pointer
    ...              # exit process

You also need to mentally keep track of your stack alignment. It might be easier to have main do nothing but align the stack and then call your actual main function, so you can use a normal stack frame in it:

_main:
    mov %rsp,%rbx    # save original stack pointer
    and $-16,%rsp    # align stack to 16 bytes
    call _my_main    # call actual main function
    mov %rbx,%rsp    # restore original stack pointer
    ...              # exit process

For your particular example program, you can just use this minimal code:

_main:
    and $-16,%rsp    # align stack to 16 bytes
    xor %edi,%edi    # set exit status to zero
    call _exit       # exit program
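
The mask arithmetic behind `and $-16,%rsp` can be sketched in C. This is only an illustration under stated assumptions: `align_down` and `is_aligned_16` are hypothetical helper names, not part of any library, and the function parameter stands in for the value of `%rsp`.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of align, which must be a power of two.
   With align == 16 the mask ~(align - 1) equals -16 in two's complement,
   so this mirrors what `and $-16,%rsp` does to the stack pointer. */
uint64_t align_down(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}

/* Nonzero if addr is a multiple of 16, i.e. its low four bits are zero. */
int is_aligned_16(uint64_t addr)
{
    return (addr & 0xf) == 0;
}
```

For a made-up stack pointer value such as `0x7ffeefbff8a8`, `align_down(0x7ffeefbff8a8, 16)` gives `0x7ffeefbff8a0`. Because the stack grows downward, rounding down only skips at most 15 unused bytes below the original stack pointer and never overlaps data already on the stack.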
fuz
  • Looks like it was the case. I based my code on the second snippet, and now the manual alignment works. Thanks! – Forketyfork Sep 19 '17 at 21:04
  • System V Application Binary Interface AMD64 Architecture Processor Supplement (With LP64 and ILP32 Programming Models) Draft Version 0.99.8: *%rsp The stack pointer holds the address of the byte with lowest address which is part of the stack. It is guaranteed to be 16-byte aligned at process entry.* Notably this is different than the `(%rsp+8)%16 == 0` for normal functions. – EOF Sep 19 '17 at 21:35
  • @EOF: That's exactly what you want for making a function call. At the start of a normal function, you need a dummy `push`, or `sub $8, %rsp`, before a `call`. Or `jmp exit` to tailcall. Because the stack is supposed to be 16B-aligned before you run a `call` instruction, so the args are aligned. Also, note the OP's error message: `0x...a` is not 0 or 8. – Peter Cordes Sep 20 '17 at 01:43
  • @Fuz: The OP is using `_main` as an alternate name for `_start`; note the `ld test.o -e _main -o test -lc` to set the executable's entry point to `_main`. So there's nothing to return to. Very poor choice of name, IMO, but the state of the stack is the kernel-startup state (or dynamic linker state?), not the alignment for a C `main()`. – Peter Cordes Sep 20 '17 at 01:44
  • @PeterCordes Oh indeed! For OP, note that it's a very bad idea to call any libc functions from a program that didn't initialize the C runtime environment by being linked through the C compiler. Don't do that! – fuz Sep 20 '17 at 07:43
  • More like you have to init libc in a static binary. `gcc -static -nostartfiles foo.o` will still link libc, but start at your entry point instead of CRT. On Linux, `gcc -nostartfiles foo.o` will make a dynamic executable, and libc startup functions will be called by the dynamic linker before you reach `_start` (because it has a list of function pointers in a `.init` section or something). @SergeyPetunin, see also https://stackoverflow.com/questions/36861903/assembling-32-bit-binaries-on-a-64-bit-system-gnu-toolchain/36901649 for stuff about linking with/without libc and static vs. dynamic. – Peter Cordes Sep 20 '17 at 07:57
  • Thanks for the advice. I've replaced `_main` with the default entry point `start` (without the underscore, though; not sure if it's an OS X thing), but that didn't change much. Running it under lldb, I found that either way execution starts with some boilerplate code that calls something like `dyldbootstrap::start`, and libc seems to be dynamically loaded and initialized correctly (at least `_printf` works as expected). But the stack is still misaligned upon entering my code and has to be corrected manually. – Forketyfork Sep 20 '17 at 18:41