
I have started learning assembly.

I don't understand why I have two variables before argc.

Image of Stack

What are the 0000 and the 0008?

global _main

section .text
_main:
    ; write
    mov rax, 0x2000004
    mov rdi, 0x1
    mov rsi, [rsp+24]
    mov rdx, 3
    syscall

    ; return (0)
    mov rax, 0x2000001
    mov rdi, 0x0
    syscall

I'm on macOS Mojave and I assemble and link with:

nasm -f macho64 ex01.s && ld -macosx_version_min 10.14 -lSystem ex01.o
Peter Cordes
Clement
  • What is the 0000 and the 0008? Your `argv` is an array of pointers with a sentinel `NULL` after the last argument. `argv` always contains the executable name (`argv[0]`) and since no additional arguments were passed, `argv[1]` will be `NULL`. – David C. Rankin Aug 09 '19 at 19:06
  • see https://en.wikipedia.org/wiki/Crt0, which illustrates `_start` setting up some parameters and calling `_main`. You might try to find the particular `_start` that is being used in your environment to see how `_main` is being called. My guess is that +0 is a return address for `_start`. – Erik Eidt Aug 09 '19 at 19:13
  • Do not post pictures of terminal output. Instead, post this output as text. – fuz Aug 09 '19 at 20:01

2 Answers


You're targeting modern macOS, so ld emits the dyld-assisted LC_MAIN load command for entry-point handling. [rsp] holds the return address, which points into the epilogue of the libdyld _start function:

mov        edi, eax ; pass your process return code as 1st argument under System V 64bit ABI
call       exit ;from libSystem
hlt

This means you don't need to exit your process through a system call, as you do in:

; return (0)
mov rax, 0x2000001
mov rdi, 0x0
syscall

Instead:

xor eax,eax
ret

is enough (and that's what compilers will emit btw).

Any stdio buffers will also get flushed with the ret / libdyld approach, because exit() flushes them. That's irrelevant for the write system call you're making, but it could matter for a printf, for instance.
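As a minimal sketch of the ret-based exit combined with a write system call, the following complete program prints a short message and then returns to the caller instead of invoking the exit system call. The labels msg and msglen are illustrative names that do not appear in the question, and the sketch assumes the same nasm / ld command line as in the question.

```nasm
global _main

section .data
msg:    db "hello", 10          ; illustrative message followed by a newline
msglen: equ $ - msg             ; length of the message in bytes

section .text
_main:
    ; write(1, msg, msglen)
    mov     eax, 0x2000004      ; macOS write syscall number (writing EAX zero-extends into RAX)
    mov     edi, 1              ; file descriptor 1 = stdout
    lea     rsi, [rel msg]      ; RIP-relative address of the message
    mov     edx, msglen
    syscall

    ; return 0 from main; the caller passes this value on to exit()
    xor     eax, eax
    ret
```

The RIP-relative lea is used because 64-bit Mach-O code is normally position-independent.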

Here's a great article that describes lots of details.

Kamil.S

I don't understand why I have two variables before argc.

You wrote a main, not a _start. The stack space above your return address is "not yours"; there's no standard for how much stack space the CRT startup code uses before calling main, or what it leaves on the stack between the argc/argv/env and the call to main.

In main(int argc, char **argv, char **envp), you'll find argc in EDI, a pointer to argv[] in RSI, and a pointer to envp[] in RDX.
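As a rough sketch of using those registers rather than fixed offsets from RSP, the variant below prints argv[1] when one was passed and does nothing otherwise. The local labels (.len, .write, .done) are illustrative, and the byte-by-byte length loop is just the simplest way to find the end of the string, not a tuned strlen. It assumes the same nasm / ld command line as in the question.

```nasm
global _main

section .text
_main:
    ; on entry: EDI = argc, RSI = argv, RDX = envp
    cmp     edi, 2
    jl      .done               ; no argv[1] was passed, nothing to print

    mov     rsi, [rsi + 8]      ; RSI = argv[1], a pointer to a NUL-terminated string

    xor     edx, edx            ; RDX = length counted so far
.len:
    cmp     byte [rsi + rdx], 0 ; reached the terminating NUL byte?
    je      .write
    inc     rdx
    jmp     .len

.write:
    ; write(1, argv[1], length)
    mov     eax, 0x2000004      ; macOS write syscall number
    mov     edi, 1              ; file descriptor 1 = stdout
    syscall

.done:
    xor     eax, eax            ; return 0 from main
    ret
```

Because the pointer comes from RSI, this does not depend on whatever the startup code happened to leave on the stack above the return address.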

But we can look and see what's there to reverse-engineer main's caller:


The numbers starting with 0000 are byte offsets relative to RSP. Whatever generated your image is dumping and analyzing 8-byte stack "slots" as integers, and as pointers if they point to valid memory.

Everything on the stack was put there either by the _start code that calls main, or by the kernel before entering user-space.

  • [rsp + 0] has main's return address, so it points to code. Presumably _start called your main with code like call main / mov edi, eax / call exit to pass your return-value to exit() if main returns (which yours doesn't). So it makes sense that your return address is pointing at a mov edi, eax.
  • 0 is probably a frame-pointer sentinel, for the benefit of code that's compiled with -fno-omit-frame-pointer being able to back-trace a chain of saved-RBP values. Pushing a 0 in _start terminates that linked list, if the caller then does mov rbp, rsp so a push rbp in its callee will push a pointer to that terminator. The x86-64 System V ABI doc suggests doing this.

The rest of the entries look exactly like the entry-to-user-space state of the stack at _start:

  • 1 = argc means you ran the program with no args, so the shell passed 1 implicit first arg (the program name, argv[0]).
  • then a NULL-terminated argv[] (not a pointer to argv, the actual array is right there on the stack). The first element is a pointer to the string holding the path to your executable, because your caller chose to pass that to execve() as per usual.
  • then a NULL-terminated envp[] array. Again not char **envp but the actual array by value. Each entry is a char* to an entry in the environment.
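The NULL terminator at the end of argv[] can be checked from inside main without touching the stack directly. The sketch below walks the array through the pointer in RSI until it reaches the NULL entry and returns the number of entries as the process exit status, which should match argc. The local labels are illustrative, and it assumes the same nasm / ld command line as in the question.

```nasm
global _main

section .text
_main:
    ; on entry: EDI = argc, RSI = argv
    xor     eax, eax                  ; RAX = number of argv entries seen so far
.next:
    cmp     qword [rsi + rax*8], 0    ; is argv[RAX] the NULL terminator?
    je      .done
    inc     eax                       ; writing EAX keeps the upper half of RAX zero
    jmp     .next
.done:
    ret                               ; return the count as main's return value
```

Running the result with a few arguments and then inspecting the shell's `$?` is one way to compare the count against the number of arguments passed.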

Again, the x86-64 System V ABI documents this stack layout. macOS follows the x86-64 System V ABI. https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI


(I'm surprised about the stack alignment, though. On Linux, RSP is 16-byte aligned on entry to user-space; _start is not a function and isn't called, so there's no return address on the stack. So argc is 16-byte aligned. But here, your image seems to show that RSP in main has the same alignment as argc. That would mean main's caller had the stack 8 bytes away from 16-byte alignment before the call. Maybe that's what OS X always does?)

Peter Cordes