Why am I allowed to exit main using ret?

Question

I am trying to figure out exactly how a program's stack is set up. I have learned that calling a function with

call pointer;

is effectively the same as:

mov register, pc ;programcounter
add register, 1 ; where 1 is one instruction not 1 byte ...
push register
jump pointer

However, this would mean that when the Unix Kernel calls the main function that the stack base should point to reentry in the kernel function which calls main.

Therefore jumping "*rbp-1" in the C - Code should reenter the main function.

This, however, is not what happens in the following code:

#include <stdio.h>   // needed for printf
#include <stdlib.h>
#include <unistd.h>

extern void ** rbp(); //pointer to stack pointing to function
int main() {
   void ** p = rbp();
   printf("Main: %p\n", main);
   printf("&Main: %p\n", &main); //WTF
   printf("*Main: %p\n", *main); //WTF
   printf("Stackbasepointer: %p\n", p);
   int (*c)(void) = (*p)-4;
   asm("movq %rax, 0");
   c();

   return 0;        //should never be executed...

}

Assembly file: rsp.asm

...

.intel_syntax

.text:

.global _rbp

_rbp:
  mov rax, rbp
  ret;

This is not allowed, unsurprisingly, maybe because the instructions at this point are not exactly 64 bits, maybe because UNIX does not allow this...

But also this call is not allowed:

   void (*c)(void) = (*p);
   asm("movq %rax, 0"); //Exit code is 11, so now it should be 0
   c(); //this comes with stack corruption, when successful

This means I am not obliged to exit the main - calling function.

My question then is: why am I allowed to do this when I use ret, as seen at the end of every GCC main function, which should do effectively the same as the code above? How does a Unix system check for such attempts effectively? I hope my question is clear.

Thank you. P.S.: The code compiles only on macOS; change the assembly for Linux.

The question is quite unclear. Short answer: `main` is not much different from any other function. The only two differences are 1: `main` is called automatically by the system upon program start. 2: if there is no `return` statement at the end of `main`, an automatic `return 0;` is performed. — Jabberwocky, Jan 10 '20 at 09:24
Yes, but why does ret work while jumping to the base pointer is not allowed, since both instructions are effectively equal? — Niclas, Jan 10 '20 at 09:28
Sorry, I don't get it, why should `ret` not work? Also in a comment you have `//Exit code is 11`: why 11? The `ret` at the end of `main` simply returns to whatever system code called it. — Jabberwocky, Jan 10 '20 at 09:30
I expect c() to corrupt the kernel... 11 is just random I think... The thing is Ret works while reentering a kernel function, but reentering a kernel function as wishes does not work... This is tricky... How is this assured... — Niclas, Jan 10 '20 at 09:51
@Niclas Right there - your assumption is wrong. `main` is called from within libc, not the kernel itself. The kernel just sets up the environment for the process to run. You can get some libc code executed by messing with the stack pointers, but rest assured there is no way for you to corrupt the kernel that easily. — dragosht, Jan 10 '20 at 10:00
First you say "1 instruction not 1 byte" then you subtract 1 byte, not 1 instruction. Is that a mistake? — user253751, Jan 10 '20 at 10:04
Note: you add one byte, not one instruction (on most [all?] assemblers) — Giacomo Catenazzi, Jan 10 '20 at 10:11
Note: main could have a different stack structure (e.g. to guarantee 0 as default return, to call exit functions (if registered)). The function called by exec* is not the C main function (because C requirement of some setup) — Giacomo Catenazzi, Jan 10 '20 at 10:17
@user253751 Yes this should be wrong. However, I think this does not explain the overall problem... Since rbp does not work either... — Niclas, Jan 10 '20 at 10:27
@GiacomoCatenazzi You are right, adding plus or minus 1 is bullshit... I went into this problem by refactoring my code... No exec is not the C Main function, but I suppose it is in the stack... — Niclas, Jan 10 '20 at 10:29
@GiacomoCatenazzi: the implicit `return 0` at the bottom of `main` is implemented by the compiler, exactly as if it was written explicitly. e.g. `xor %eax,%eax` on x86-64, to zero the return-value register. In most calling conventions for normal machines, integer return values are returned in registers, not on the stack, so there's nothing special you can do in the runtime startup code that would make that work if a compiler generated code that just returned as if main were `void`. Yes the `_start` entry point isn't `main`, but implementing return 0 isn't part of its job! — Peter Cordes, Jan 10 '20 at 16:11
@PeterCordes: yeah, but so main is/was special: for one, it has this zeroing (and done only on main). IIRC older compiler had different main signatures, and compiler adapted according the signature (e.g. std c setup or more native setup) (and it was not always a function). -- ok, your answer cover most of this. — Giacomo Catenazzi, Jan 11 '20 at 08:05
@GiacomoCatenazzi: Yes, `main` is special; the compiler notices and treats it specially *when compiling* (including putting stack-alignment code inside `main` on some targets). The implicit `return 0` is implemented at compile time in main itself, not by anything outside of `main`. You could link with a simpler `_start` for `int main(void)` that doesn't do args, but that's separate from return value. I don't do a lot of embedded stuff, but are you sure there are implementations where the C `main` function isn't a function? In C it's legal for functions to call `main`. (In C++ that's UB) — Peter Cordes, Jan 11 '20 at 08:16

Peter Cordes · Accepted Answer · 2020-01-10T18:41:32.640

C main is called (indirectly) from CRT startup code, not directly from the kernel.

After main returns, that code calls atexit functions to do stuff like flushing stdio buffers, then passes main's return value to a raw _exit system call. Or exit_group which exits all threads.

You make several wrong assumptions, all I think based on a misunderstanding of how kernels work.

The kernel runs at a different privilege level from user-space (ring 0 vs. ring 3 on x86). Even if user-space knew the right address to jump to, it can't jump into kernel code. (And even if it could, it wouldn't be running with kernel privilege level).

ret isn't magic, it's basically just pop %rip and doesn't let you jump anywhere you couldn't jump to with other instructions. Also doesn't change privilege level¹.
Kernel addresses aren't mapped / accessible when user-space code is running; those page-table entries are marked as supervisor-only. (Or they're not mapped at all in kernels that mitigate the Meltdown vulnerability, so entering the kernel goes through a "wrapper" block of code that changes CR3.)

Virtual memory is how the kernel protects itself from user-space. User-space can't modify page tables directly, only by asking the kernel to do it via mmap and mprotect system calls. (And user-space can't execute privileged instructions like mov cr3, rax to install new page tables. That's the purpose of having ring 0 (kernel mode) vs. ring 3 (user mode).)
The kernel stack is separate from the user-space stack for a process. (In the kernel, there's also a small kernel stack for each task (aka thread) that's used during system calls / interrupts while that user-space thread is running. At least that's how Linux does it, IDK about others.)
The kernel doesn't literally call user-space code; the user-space stack doesn't hold any return address back into the kernel. A kernel->user transition involves swapping stack pointers as well as changing privilege levels, e.g. with an instruction like iret (interrupt-return).

Plus, leaving a kernel code address anywhere user-space can see it would defeat kernel ASLR.

Footnote 1: (The compiler-generated ret will always be a normal near ret, not a retf that could return through a call gate or something to a privileged cs value. x86 handles privilege levels via the low 2 bits of CS but nevermind that. MacOS / Linux don't set up call gates that user-space can use to call into the kernel; that's done with syscall or int 0x80 instructions.)

In a fresh process (after an execve system call replaced the previous process with this PID with a new one), execution begins at the process entry point (usually labeled _start), not at the C main function directly.

C implementations come with CRT (C RunTime) startup code that has (among other things) a hand-written asm implementation of _start which (indirectly) calls main, passing args to main according to the calling convention.

_start itself is not a function. On process entry, RSP points at argc, and above that on the user-space stack is argv[0], argv[1], etc. (i.e. the char *argv[] array is right there by value, and above that the envp array.) _start loads argc into a register and puts pointers to the argv and envp into registers. (The x86-64 System V ABI that MacOS and Linux both use documents all this, including the process-startup environment and the calling convention.)

If you try to ret from _start, you're just going to pop argc into RIP, and then code-fetch from absolute address 1 or 2 (or other small number) will segfault. For example, the question "Nasm segmentation fault on RET in _start" shows an attempt to ret from the process entry point (linked without CRT startup code). It has a hand-written _start that just falls through into main.

When you run gcc main.c, the gcc front-end runs multiple other programs (use gcc -v to show details). This is how the CRT startup code gets linked into your process:

gcc preprocesses (CPP) and compiles+assembles main.c to main.o (or a temporary file). On MacOS, the gcc command is actually clang which has a built-in assembler, but real gcc really does compile to asm and then run as on that. (The C preprocessor is built-in to the compiler, though.)
gcc runs something like ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie /usr/lib/Scrt1.o /usr/lib/gcc/x86_64-pc-linux-gnu/9.1.0/crtbeginS.o main.o -lc -lgcc /usr/lib/gcc/x86_64-pc-linux-gnu/9.1.0/crtendS.o. That's actually simplified a lot, with some of the CRT files left out, and paths canonicalized to remove ../../lib parts. Also, it doesn't run ld directly, it runs collect2 which is a wrapper for ld. But anyway, that statically links in those .o CRT files that contain _start and some other stuff, and dynamically links libc (-lc) and libgcc (for GCC helper functions like implementing __int128 multiply and divide with 64-bit registers, in case your program uses those).

.intel_syntax

.text:

.global _rbp

_rbp:
  mov rax, rbp
  ret;

This is not allowed, ...

The only reason that doesn't assemble is because you tried to declare .text: as a label, instead of using the .text directive. If you remove the trailing : it does assemble with clang (which treats .intel_syntax the same as .intel_syntax noprefix).

For GCC / GAS to assemble it, you'd also need the noprefix to tell it that register names aren't prefixed by %. (Yes you can have Intel op dst, src order but still with %rsp register names. No you shouldn't do this!) And of course GNU/Linux doesn't use leading underscores.

Not that it would always do what you want if you called it, though! If you compiled main without optimization (so -fno-omit-frame-pointer was in effect), then yes you'd get a pointer to the stack slot below the return address.

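And you definitely use the value incorrectly. (*p)-4; loads the saved RBP value (*p) and then subtracts 4 bytes from it. (*p has type void* because p has type void **, and arithmetic on void* is a GNU extension that treats the pointed-to size as 1; ISO C doesn't define it at all. Offsetting p itself, as in p-4, would move by four 8-byte pointers.) Either way, what you loaded is the saved RBP, not a return address.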

I think you're trying to get your own return address and re-run the call instruction (in main's caller) that reached main, eventually leading to a stack overflow from pushing more return addresses. In GNU C, use void * __builtin_return_address (0) to get your own return address.

x86 call rel32 instructions are 5 bytes, but the call that called main was probably an indirect call, using a pointer in a register. So it might be a 2-byte call *%rax or a 3-byte call *%r12, you don't know unless you disassemble your caller. (I'd suggest single-stepping by instructions (GDB / LLDB stepi) off the end of main using a debugger in disassembly mode. If it has any symbol info for main's caller, you'll be able to scroll backward and see what the previous instruction was.)

If not, you might have to try and see what looks sane; x86 machine code can't be unambiguously decoded backwards because it's variable-length. You can't tell the difference between a byte within an instruction (like an immediate or ModRM) vs. the start of an instruction. It all depends on where you start disassembling from. If you try a few byte offsets, usually only one will produce anything that looks sane.

   asm("movq %rax, 0"); //Exit code is 11, so now it should be 0

This is a store of RAX to absolute address 0, in AT&T syntax. This of course segfaults. exit code 11 is from SIGSEGV, which is signal 11. (Use kill -l to see signal numbers).

Perhaps you wanted mov $0, %eax. Although that's still pointless here, you're about to call through your function pointer. In debug mode, the compiler might load it into RAX and step on your value.

Also, writing a register in an asm statement is never safe when you don't tell the compiler which registers you're modifying (using constraints).

   printf("Main: %p\n", main);
   printf("&Main: %p\n", &main); //WTF

main and &main are the same thing because main is a function. That's just how C syntax works for function names. main isn't an object that can have its address taken. See "& operator optional in function pointer assignment".

It's similar for arrays: the bare name of an array can be assigned to a pointer or passed to functions as a pointer arg. But &array is also the same pointer, same as &array[0]. This is true only for arrays like int array[10], not for pointers like int *ptr; in the latter case the pointer object itself has storage space and can have its own address taken.

score 3 · Answer 2 · 2020-01-10T16:45:55.243

I think there are quite a few misunderstandings you have here. First, main is not what gets called by the kernel. The kernel allocates a process and loads our binary into memory - usually from an ELF file if you are using a Unix-based OS. This ELF file contains all of the sections that need to be mapped into memory and an address that is the "Entry Point" for the code in the ELF (among other things). The ELF can specify any address for the loader to jump to in order to start launching the program. In applications built with GCC, this is a function called _start. _start then sets up the stack and does any other initialization it needs to before calling __libc_start_main, which is a libc function that can do additional setup before calling main.

Here is an example of a start function:

00000000000006c0 <_start>:


 6c0:   31 ed                   xor    %ebp,%ebp
 6c2:   49 89 d1                mov    %rdx,%r9
 6c5:   5e                      pop    %rsi
 6c6:   48 89 e2                mov    %rsp,%rdx
 6c9:   48 83 e4 f0             and    $0xfffffffffffffff0,%rsp
 6cd:   50                      push   %rax
 6ce:   54                      push   %rsp
 6cf:   4c 8d 05 0a 02 00 00    lea    0x20a(%rip),%r8        # 8e0 <__libc_csu_fini>
 6d6:   48 8d 0d 93 01 00 00    lea    0x193(%rip),%rcx        # 870 <__libc_csu_init>
 6dd:   48 8d 3d 7c ff ff ff    lea    -0x84(%rip),%rdi        # 660 <main>
 6e4:   ff 15 f6 08 20 00       callq  *0x2008f6(%rip)        # 200fe0 <__libc_start_main@GLIBC_2.2.5>
 6ea:   f4                      hlt    
 6eb:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

As you can see, this function sets the stack pointer and the stack base pointer itself. Therefore, there is no valid stack frame in this function. The base pointer is not even set to anything but 0 until main is called (at least with this compiler).

Now what is important to see here is that the stack was initialized in this code and by the loader; it is not a continuation of the kernel's stack. Each program has its own stack, and these are all different from the kernel's stack. In fact, even if you knew the address of the stack in the kernel, you could not read from it or write to it from your program, because your process can only see the pages of memory that the kernel has mapped for it; the MMU enforces this using page tables that only the kernel controls.

Just to clarify, when I said the stack was "created" I did not mean that it was allocated. I only mean that the stack pointer and stack base are set here. The memory for it is allocated when the program is loaded, and pages are added to it as needed whenever a page fault is triggered by a write to an unallocated part of the stack. Upon entering _start there is clearly some stack in existence, as evidenced by the pop %rsi instruction; however, these are not the final stack-pointer values that will be used by the program. Those are the values that get set up in _start (maybe these get changed in __libc_start_main later on, I'm not sure).

Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/205782/discussion-on-answer-by-s-e-why-am-i-allowed-to-exit-main-using-ret). — Samuel Liew, Jan 11 '20 at 03:12

score 1 · Answer 3 · answered Jan 10 '20 at 13:08

However, this would mean that when the Unix Kernel calls the main function that the stack base should point to reentry in the kernel function which calls main.

Absolutely not.

This particular question covers the details for MacOS, please have a look. In any case, main is most likely returning to the start function of the C standard library. Implementation details differ between *nix operating systems.

Therefore jumping "*rbp-1" in the C - Code should reenter the main function.

You have no guarantee what the compiler will emit and what will be the state of rsp/rbp when you call rbp() function. You can't make such assumptions.

Btw if you want to access stack entry in 64bit you would do this in +-8 increments (so rbp+8 rbp-8 rsp+8 rsp-8 respectively).

Kamil.S


All x86-64 *nix OSes use the x86-64 System V ABI which does define the process-startup environment for `_start`. A few details may differ between OSes, like I think MacOS includes `_start` in libc itself, while Linux statically links `_start` into the executable and has it call `__libc_start_main` (passing the address of `main` as an arg). – Peter Cordes Jan 10 '20 at 16:40
In MacOS `_start` comes from libdyld.dylib which is used by `dyld` to setup the process startup. – Kamil.S Jan 10 '20 at 17:51
On Linux, `/lib/ld.so` does contain a `_start` which is the actual first thing to run in user-space in a dynamically linked executable. But after its work is done, it jumps to the executable's (ELF) entry point (also called `_start`, with the same environment of RSP pointing at `argc`) instead of calling the executable's `main` via libc functions. On GNU/Linux with glibc, the executable's `_start` calls its main (indirectly). MacOS calls the executable's `main` from dyld, I think (via libc?), so the executable itself doesn't contain a block that executes with RSP pointing to `argc`, right? – Peter Cordes Jan 10 '20 at 18:04
libc is not involved and there's no special block in the executable. `_dyld_start` jumps directly to `main`, and `main` returns to `_start` https://opensource.apple.com/source/dyld/dyld-195.5/src/dyldStartup.s.auto.html – Kamil.S Jan 10 '20 at 18:33
Oh, that makes sense. On Linux I think the point of having `__libc_start_main` in libc is so that amount of code doesn't have to get statically linked into every executable. But on MacOS if something equivalent is in `dyld` that solves the same problem more simply. So after `_dyld_start` calls libc init functions (which has to happen at some point), it makes sense it can just call `main` directly and then run atexit functions. – Peter Cordes Jan 10 '20 at 18:37
