How does the linker find the main function in an x86-64 ELF-format executable?
-
see [How main() function inside a shared object (.so) is taken care of by linker](http://stackoverflow.com/questions/9807194/how-main-function-inside-a-shared-object-so-is-taken-care-of-by-linker) – Janus Troelsen Jul 17 '13 at 19:35
-
did you see [How does C++ linking work in practice?](http://stackoverflow.com/a/12122534/309483)? which part do you not understand? – Janus Troelsen Jul 17 '13 at 19:37
-
I don't understand how it can determine the memory address of the main function. – RouteMapper Jul 17 '13 at 19:46
-
It doesn't look like you figured out yet what a map file means. Google "gcc map file" to learn more. – Hans Passant Jul 17 '13 at 20:38
-
@HansPassant - Don't you need the source code in order to generate a map file? – RouteMapper Jul 18 '13 at 16:54
-
No, linkers don't use source code. – Hans Passant Jul 18 '13 at 16:58
-
Surely there is an easier way to do this than to create a map. I just need to know how the linker computes the call to `main`. What math does it use? – RouteMapper Jul 18 '13 at 17:00
2 Answers
As a very generic overview: the linker assigns an address to the block of code identified by the symbol `main`, as it does for every symbol in your object files.
Actually, it doesn't assign a real address but an address relative to some base, which the loader translates to a real address when the program is executed.
The actual entry point is most likely not `main` but some symbol in the C runtime startup code (the crt) that calls `main`. By default, `ld` looks for the symbol `start` (`_start` on GNU/Linux) unless you specify something different.
The linked code ends up in the `.text` section of the executable and could look something like this (very simplified):

    Address | Code
    1000      someFunction
    ...
    2000      start
    2001      call 3000
    ...
    3000      main
    ...

When the linker writes the ELF header it would specify the entry point as address 2000.
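To make the header lookup concrete, here is a minimal sketch that pulls the entry point out of the first bytes of a file. It assumes a little-endian ELF64 file (as on x86-64), where the `e_entry` field sits at byte offset 24 of the 64-byte header. The function name `elf64_entry` is made up for this illustration and is not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: extract e_entry from a buffer holding the start of a
 * little-endian ELF64 file. Returns 1 and stores the entry point in *entry
 * on success, 0 if the buffer does not look like such a header. */
int elf64_entry(const unsigned char *buf, size_t len, uint64_t *entry)
{
    assert(buf != NULL && entry != NULL);

    if (len < 64)                       /* an ELF64 header is 64 bytes long */
        return 0;
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return 0;                       /* wrong magic number */
    if (buf[4] != 2)                    /* EI_CLASS: 2 means ELFCLASS64 */
        return 0;
    if (buf[5] != 1)                    /* EI_DATA: 1 means little-endian */
        return 0;

    /* e_entry: 8 bytes at offset 24, least significant byte first */
    uint64_t value = 0;
    for (int i = 7; i >= 0; i--)
        value = (value << 8) | buf[24 + i];

    *entry = value;
    return 1;
}
```

Reading the first 64 bytes of an executable into a buffer and passing them to this function should give the same value that `readelf -h` reports as "Entry point address".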
You can get the relative address of `main` by dumping the contents of the executable with something like `objdump`. To get the actual address at runtime, you can simply take the address of the symbol, as in `funcptr ptr = main;`, where `funcptr` is defined as a pointer to a function with the signature of `main`.
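
    #include <stdio.h>

    /* Pointer to a function with the same signature as main. */
    typedef int (*funcptr)(int argc, char *argv[]);

    int main(int argc, char *argv[])
    {
        funcptr ptr = main;  /* the runtime address of main */
        /* %p expects a void *; this cast is a widely supported extension */
        printf("%p\n", (void *)ptr);
        return 0;
    }
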
The address of `main` is resolved correctly regardless of whether symbols have been stripped, because the linker first resolves the symbol `main` to its relative address.
Use `objdump` like this:

    $ objdump -f funcptr.exe
    funcptr.exe: file format pei-i386
    architecture: i386, flags 0x0000013a:
    EXEC_P, HAS_DEBUG, HAS_SYMS, HAS_LOCALS, D_PAGED
    start address 0x00401000

Looking for `main` specifically, on my machine I get this:

    $ objdump -D funcptr.exe | grep main
    40102c: e8 af 01 00 00 call 4011e0 <_cygwin_premain0>
    401048: e8 a3 01 00 00 call 4011f0 <_cygwin_premain1>
    401064: e8 97 01 00 00 call 401200 <_cygwin_premain2>
    401080: e8 8b 01 00 00 call 401210 <_cygwin_premain3>
    00401170 <_main>:
    401179: e8 a2 00 00 00 call 401220 <___main>
    004011e0 <_cygwin_premain0>:
    004011f0 <_cygwin_premain1>:
    00401200 <_cygwin_premain2>:
    00401210 <_cygwin_premain3>:
    00401220 <___main>:

Note that I am on Windows using Cygwin, so your results will differ slightly. It looks like `main` lives at `00401170` for me.
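As for the math behind the `call` lines in that dump: each one is the opcode `e8` followed by a signed 32-bit little-endian displacement, and the target is the address of the next instruction plus that displacement. The linker fills in the displacement as target minus (address of the call + 5). The sketch below decodes such an instruction; the function name `rel32_call_target` is invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: given the address of a 5-byte x86 near call
 * (opcode 0xE8 + rel32) and its raw bytes, return the address it calls.
 * Returns 0 if the first byte is not 0xE8. */
uint64_t rel32_call_target(uint64_t addr, const unsigned char *insn)
{
    assert(insn != NULL);

    if (insn[0] != 0xE8)
        return 0;

    /* assemble the little-endian 32-bit displacement */
    uint32_t raw = (uint32_t)insn[1]
                 | ((uint32_t)insn[2] << 8)
                 | ((uint32_t)insn[3] << 16)
                 | ((uint32_t)insn[4] << 24);

    /* sign-extend it to 64 bits without relying on implementation-defined casts */
    int64_t rel = (raw & 0x80000000u) ? (int64_t)raw - 0x100000000LL
                                      : (int64_t)raw;

    /* target = address of the next instruction (addr + 5) + displacement */
    return addr + 5 + (uint64_t)rel;
}
```

For the first line of the dump, the call at `40102c` with bytes `e8 af 01 00 00` gives `0x40102c + 5 + 0x1af = 0x4011e0`, which matches the target `objdump` prints.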

-
So there's no way to determine what the address of `main` will be before runtime? – RouteMapper Jul 17 '13 at 20:01
-
I see your edited post now. I understand that `start`'s address is in the ELF header. But is there any way to statically compute the address of `main`? Sometimes the address for main is dynamically relocated. – RouteMapper Jul 17 '13 at 20:11
-
@RouteMapper you're doing static analysis, right? Relocation does not exist then. It's the loader's job to relocate, and *you're the loader*, just decide not to relocate. – harold Jul 17 '13 at 20:13
-
You can get the relative address of `main` by dumping the contents of the exe with something like `objdump`. To get the actual address at runtime you can just read the symbol `funcptr ptr = main` where `funcptr` is defined as a pointer to a function with the signature of `main`. – Dave Rager Jul 17 '13 at 20:17
-
Suppose it's a stripped executable. That symbol information no longer exists. How then can I find the address of that function which `start` calls (i.e. `main`)? – RouteMapper Jul 17 '13 at 20:20
-
I'm not sure how to determine the relative address of `main`. What function in `objdump` do you use to find it? – RouteMapper Jul 17 '13 at 20:46
-
@DaveRager - That only works if the executable isn't stripped, right? – RouteMapper Jul 17 '13 at 21:06
-
Yes. If the executable is stripped you will have to do it the hard way starting at the entry point (which you can find) and tracing call statements until you find what seems to be your main function. – Dave Rager Jul 17 '13 at 21:09
-
I did trace it, but it gets to the GOT, which hasn't been initialized. As such, I can't determine where `main` is. – RouteMapper Jul 17 '13 at 21:32
With Binutils, the entry point is determined by either:
- the `-e` CLI option
- the linker script
You can view your linker script with:

    ld --verbose

Mine contains:

    ENTRY(_start)

Then at link time, glibc-provided object files like `crt1.o` that contain the `_start` symbol are passed to the linker together with your `main.o`.
Those object files do some setup for you, like `argv`, and then call your `main` function.
You can see those extra object files being sneaked in with `gcc -v`.
This is documented at: https://sourceware.org/binutils/docs/ld/Entry-Point.html#Entry-Point
The first instruction to execute in a program is called the entry point. You can use the ENTRY linker script command to set the entry point. The argument is a symbol name:
ENTRY(symbol)
There are several ways to set the entry point. The linker will set the entry point by trying each of the following methods in order, and stopping when one of them succeeds:
- the `-e' entry command-line option;
- the ENTRY(symbol) command in a linker script;
- the value of a target specific symbol, if it is defined; For many targets this is start, but PE and BeOS based systems for example check a list of possible entry symbols, matching the first one found.
- the address of the first byte of the `.text' section, if present;
- The address 0.
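The lookup order quoted above can be modelled as a simple chain of fallbacks. The sketch below is only an illustration of that order, not how `ld` is implemented: the struct, its field names, and `choose_entry` are all hypothetical, and each candidate is represented as a pointer that is `NULL` when that method does not apply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the candidates the linker considers.
 * A NULL pointer means "this method did not yield an address". */
struct entry_inputs {
    const uint64_t *cli_entry;     /* symbol given with -e */
    const uint64_t *script_entry;  /* ENTRY(symbol) in the linker script */
    const uint64_t *target_symbol; /* target-specific symbol, e.g. start */
    const uint64_t *text_start;    /* first byte of .text, if present */
};

/* Try each method in the documented order and stop at the first success. */
uint64_t choose_entry(const struct entry_inputs *in)
{
    assert(in != NULL);

    if (in->cli_entry)
        return *in->cli_entry;
    if (in->script_entry)
        return *in->script_entry;
    if (in->target_symbol)
        return *in->target_symbol;
    if (in->text_start)
        return *in->text_start;
    return 0;                      /* last resort: the address 0 */
}
```

In the common GCC case on Linux, the second step is the one that succeeds, because the default linker script contains `ENTRY(_start)` and `crt1.o` defines `_start`.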
See also: is there a GCC compiler/linker option to change the name of main?
