Let's take Linux x86-64 as an example.
When a process calls execv("/my/prog", args), it makes a system call to the kernel. The kernel uses the args pointer to locate the argument strings in the process's memory, copies them somewhere else for temporary safekeeping, and then tears down the process's virtual memory. Then it sets up the virtual memory for the new program and loads its code and data from the binary /my/prog (actually it just maps the file for demand loading, but that's not important).
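To make the calling side concrete, here is a minimal sketch in C. The helper name run_and_wait is made up for this illustration, and the fork/waitpid wrapper is only there so the caller survives to see the result; the point is that execv does not return on success, because the kernel has replaced the process image.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` with a NULL-terminated argument array
 * in a child process and return its exit status, or -1 on failure. */
static int run_and_wait(const char *path, char *const args[])
{
    /* args[0] is conventionally the program name. */
    assert(args != NULL && args[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: on success execv never returns, because the kernel
         * replaces this process image with the new program. */
        execv(path, args);
        _exit(127); /* only reached if execv itself failed */
    }

    /* Parent: wait for the new program to finish. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling it with "/bin/sh" and the array { "sh", "-c", "exit 7", NULL } should return 7, assuming /bin/sh exists on the system.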
It also allocates a block of memory to be the new program's stack, and that's where it copies the command line arguments, as well as the environment variables and various other data that need to be passed to the new program. Here it also sets up the array of argv pointers, pointing to the strings themselves in the program's stack memory, and pushes the argument count on the stack as well. The precise layout is specified in the x86-64 System V ABI; see Figure 3.9.
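As a rough illustration of that layout, the sketch below reads the first part of an initial stack image: one word holding the argument count, then the argv pointers, a NULL terminator, and then the environment pointers. The struct and function names are invented for this example, it assumes pointers and the count word have the same size (true on x86-64), and it ignores the auxiliary vector that follows the environment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of the start of the initial process stack. */
struct initial_stack {
    long   argc;  /* argument count, stored in the first word */
    char **argv;  /* argc pointers followed by a NULL terminator */
    char **envp;  /* environment pointers, also NULL-terminated */
};

/* `sp` is the address the stack pointer holds at process entry. */
static struct initial_stack read_initial_stack(void *sp)
{
    struct initial_stack s;
    uintptr_t count;

    assert(sp != NULL);

    /* Word 0: the argument count. */
    memcpy(&count, sp, sizeof count);
    s.argc = (long)count;

    /* Words 1..argc: pointers to the argument strings. */
    s.argv = (char **)sp + 1;
    assert(s.argv[s.argc] == NULL);

    /* The environment array starts right after argv's NULL terminator. */
    s.envp = s.argv + s.argc + 1;
    return s;
}
```

Given an image like { 2, "prog", "hello", NULL, "HOME=/root", NULL }, this yields argc == 2, argv[0] == "prog", and envp[0] == "HOME=/root".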
Now to actually start the program. The binary's header specifies an address to be used as the entry point. The linker will have arranged for this to point to a special piece of startup code. This code usually comes with your standard C library, in an object file with a name like crt0.o. It is written in assembly, and its job is to process the command line arguments and so forth, set up registers and memory the way that compiled C or C++ code expects, and call a C/C++ function in the standard library which will do further initialization and then call your main. The kernel jumps to the entry point address, switching to user mode along the way, and the startup code starts executing.
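To show where that entry point address lives, here is a small sketch that pulls it out of the first 64 bytes of a binary. It assumes a little-endian 64-bit ELF file, which is the usual format on Linux x86-64; the function name is made up, and the offsets are those of the ELF64 header (the e_entry field sits at byte offset 24).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store the entry point from an ELF64 header in
 * *entry and return 0, or return -1 if the bytes are not a
 * little-endian 64-bit ELF header. */
static int elf64_entry_point(const unsigned char *hdr, size_t len,
                             uint64_t *entry)
{
    assert(hdr != NULL && entry != NULL);

    if (len < 64)                              /* ELF64 header is 64 bytes */
        return -1;
    if (memcmp(hdr, "\x7f" "ELF", 4) != 0)     /* magic number */
        return -1;
    if (hdr[4] != 2)                           /* EI_CLASS: 64-bit objects */
        return -1;
    if (hdr[5] != 1)                           /* EI_DATA: little-endian */
        return -1;

    /* e_entry: 8 bytes at offset 24, least significant byte first. */
    uint64_t value = 0;
    for (int i = 7; i >= 0; i--)
        value = (value << 8) | hdr[24 + i];

    *entry = value;
    return 0;
}
```

Reading the first 64 bytes of an executable into a buffer and passing them in should give the same value that readelf -h reports as the entry point address.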
You can see glibc's version in start.S, but a very minimal version could look something like this:
; main takes argc in rdi and argv in rsi
; top of stack (the address in rsp) holds the argument count
mov rdi, [rsp]
; next is start of the argument pointer array
lea rsi, [rsp+8]
call main
; main returns, exit the program
mov rdi, rax
call exit
; exit() makes an exit system call and doesn't return
So when control actually reaches your main function, the registers contain the same values as if it had been called by another C or C++ function. The argv argument points to an array of pointers on the stack, each of which points to a string located at a higher address in stack memory, as set up by the kernel.
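From main's point of view, then, argv is just an ordinary array of string pointers. One detail not mentioned above, but guaranteed by the C standard, is that the array ends with a NULL pointer, so the count can be recovered from the array alone. A minimal sketch, with a made-up helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the entries of a NULL-terminated
 * argv-style pointer array. */
static int count_args(char **argv)
{
    assert(argv != NULL);

    int n = 0;
    while (argv[n] != NULL)
        n++;
    return n;
}
```

Inside main(int argc, char **argv), count_args(argv) should equal argc.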