
I'm learning assembly and I'm trying to run a quite simple program.

section .text
    global start
    global _main

start:
    call _main
    ret

_main:
    push 42
    ret

I'm using NASM on 64-bit OS X. Here is what I tried:

$ nasm -f macho64 simple.asm -o simple.o
$ ld simple.o -o a.out
$ ./a.out
dyld: no writable segment
[1]    38021 trace trap  ./a.out
$
$ ld -lc -ldylib1.o -e start simple.o -o a.out
$ ./a.out
[1]    38134 segmentation fault  ./a.out
$
$ ld -macosx_version_min 10.8 -lSystem simple.o -o a.out
[1]    38134 segmentation fault  ./a.out
$

Following this post, I added section .data into the code.

$ nasm -f macho64 simple.asm && ld simple.o && ./a.out
[1]    39119 killed     ./a.out

1) How can I get my program not to be killed ?

2) Why does my program get those signals (SIGTRAP, SIGSEGV and SIGKILL)?

3) Where could I have found those answers without asking? The explanations I've found so far require prior knowledge of assembly.

Edit

I understood my mistake with push 42, thank you. My program runs when loaded with ld -macosx_version_min 10.8 -lSystem simple.o. But:

  • I still have the SIGTRAP when loaded with ld simple.o

  • I still have the segfault when loaded with ld -lc -ldylib1.o -e start simple.o

  • I still have a SIGKILL when I add section .data and load with ld simple.o

  • I have a bus error when I add section .data and load with ld -macosx_version_min 10.8 -lSystem simple.o

I wonder why I get those signals (in order to understand how it works). I'd also like to know why I have to specify macosx_version_min, and how I could have found that out without a friend telling me.

Bilow
  • `ret`, to quote the manual, "transfers program control to a return address located on the top of the stack." That would be 42. – harold Nov 13 '16 at 16:11
  • Are you sure about SIGKILL? If the entry point is in a non-executable page, I would have expected SIGSEGV (at least on Linux). Try with `strace ./a.out` to trace system calls, including the `execve` that execs the file. Also, run `gdb ./a.out` (or `lldb`, or whatever debugger you prefer), and single-step through your code so you see what instruction faults and results in the kernel delivering a signal. – Peter Cordes Nov 13 '16 at 21:10
  • Also, your title is wrong. Your program is most definitely not empty. If it was, you'd just segfault from running the zero-padding as instructions, or SIGILL if execution hit some other bytes that didn't decode as valid instructions. – Peter Cordes Nov 13 '16 at 21:10

2 Answers


1) How can I get my program not to be killed ?

This is not C: you can't ret that way. Your code is not called from a pre-established environment, so what would you be returning to? By executing that instruction, you are jumping to whatever value is currently on the stack at %rsp, which is potentially outside your process's address space, hence the SIGSEGV.

You have to explicitly tell the OS your process has terminated its execution through a system call.
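
A minimal sketch of what that could look like for the program in the question, assuming 64-bit macOS, where the BSD exit call is commonly invoked as syscall number 0x2000001 with the status in rdi. Using 42 as the exit status is only an illustration, and which label actually serves as the entry point depends on how you link.

```nasm
; simple.asm: sketch for NASM, macho64 output (assumed target: 64-bit macOS)
section .text
    global start
    global _main

start:
    call    _main           ; _main leaves its result in rax
    mov     rdi, rax        ; first syscall argument: the exit status
    mov     rax, 0x2000001  ; assumed macOS encoding: BSD syscall class (2 << 24) + exit (1)
    syscall                 ; ask the kernel to terminate the process; does not return

_main:
    mov     rax, 42         ; return value goes in rax, nothing is left on the stack
    ret                     ; pops the return address pushed by `call _main`
```

Here start never executes ret; it hands the value returned by _main to the kernel as the exit status.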

Also, an ELF's entry point is usually _start, not start; if you use a different name, you have to indicate it to the linker.

I still have the segfault when loaded with ld -lc -ldylib1.o -e start simple.o

Don't link with C, unless you're conforming to its execution model. Also I don't understand what you mean by "load with ld file.o".

edmz
  • If `ld` doesn't find a symbol called `_start`, it defaults to the beginning of the text segment. – Peter Cordes Nov 13 '16 at 20:15
  • @PeterCordes I made a pretty strong assertion by saying an ELF's starting point is `_start`; I don't know whether that's mandatory (I doubt it). I mean, `ld` does that, but if I wrote my own linker, could I refuse to link? – edmz Nov 15 '16 at 16:13
  • Yeah, of course. If you don't care about command-line compatibility with `ld`, your linker can do anything you want. Like I said, `ld`'s actual behaviour is to fall back to the start of the .text section: *ld: warning: cannot find entry symbol _start; defaulting to 0000000000400080*. You can also use `ld -e symbol_name` to set a symbol name it will look for instead of `_start`. So even in a non-stripped binary linked by `ld`, it's not totally safe to assume that the entry point is `_start`. To reliably set a breakpoint there when debugging it, use `readelf -a` to find the numeric address – Peter Cordes Nov 15 '16 at 19:08
  • I will say it's a rather risky decision: just having `.text \n .word 0xf0b \n entry: # your code` will be a source of long debugging sessions if you forget that `-e entry`. That constant looks harmless, but it's not. – edmz Nov 15 '16 at 19:52
  • I didn't say any of that was a good idea, I just explained `ld`'s actual behaviour, which some build systems might rely on. It's usually a terrible idea to use `-e` instead of just calling your entry point `_start` like a normal person. Even worse is if you *have* a `_start` symbol but it's not the entry point. `b _start` will set a breakpoint there, but it won't be the first code to execute! – Peter Cordes Nov 15 '16 at 19:54
  • @PeterCordes: Absolutely, just a free thought I found worthy pointing out. – edmz Nov 15 '16 at 19:56

Your main problem is simple. Your short program pushes the value 42 onto the stack immediately before the RET, which then pops it and jumps to it. That causes a segfault, because execution continues at address 42 (0x2A), which is almost certainly not mapped into your process.
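
As a hedged illustration of the point above (a sketch, not necessarily what the asker intended _main to do): if a value is pushed inside a routine, it has to come off the stack again before RET, so that the top of the stack is once more the return address left by the CALL.

```nasm
; Sketch of a balanced _main (NASM syntax, 64-bit); the choice of rax is illustrative
_main:
    push    42      ; top of stack is now 42, sitting above the return address
    pop     rax     ; take it off again: rax = 42, top of stack is the return address
    ret             ; jumps back to the instruction after `call _main`
```

Without the pop, RET would treat 42 itself as the return address, which is exactly the fault described above.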

zx485