11

The address of executable code is decided at link time, isn't it?

#include <stdio.h>
int main ()
{
     printf("%p", (void*)&main);
     return 0;
}

example output #1:

0x563ac3667139

example output #2:

0x55e3903a9139
Petr Skocik
  • 25
    [ASLR](https://en.wikipedia.org/wiki/Address_space_layout_randomization). – Lundin Jan 10 '20 at 14:03
  • 1
    Strictly speaking the code provokes UB, as the `p` conversion specifier is for `void`-pointers only (in C at least). So any number might get printed. – alk Jan 10 '20 at 14:07
  • 1
    @alk You can of course perform an explicit cast to (void *) in the `printf()`, then it's no longer UB. But still any number might get printed... Now, what does this mean? – Ctx Jan 10 '20 at 14:18
  • 2
    @Ctx: Still UB in C++. Can't take the address of `main`. – Lightness Races in Orbit Jan 10 '20 at 14:19
  • @LightnessRacesBY-SA3.0 In C++ this is indisputable, correct – Ctx Jan 10 '20 at 14:21
  • @Ctx Even if it wasn't `main`, casting a function pointer to an object pointer type would be conditionally-supported only (and as you imply with implementation-defined resulting value). It will be supported on POSIX-compliant platforms at least, though. (in C++) – walnut Jan 10 '20 at 15:18
  • @LightnessRacesBY-SA3.0 "Can't take the address of main" - can you explain? – shargors Jan 10 '20 at 15:36
  • @shargors https://stackoverflow.com/questions/28567869/address-of-function-main-in-c-c – walnut Jan 10 '20 at 15:39
  • Indeed, ironically I did five years ago ;) – Lightness Races in Orbit Jan 10 '20 at 17:25

2 Answers

19

On many modern systems, the linker only determines the address of each function relative to the module's base address. When the module (an exe, dll, or so) is loaded, Address Space Layout Randomization (ASLR) gives it a different base address.

This is for security: it means the addresses of functions are not predictable. Certain attacks, for example overflowing a stack variable to overwrite the return address or a function pointer with the address of some other function (for malicious purposes), can't easily predict what address to write, because it varies from run to run.

The ability to relocate the base address also solves the practical problem of conflicts. If you load a.dll and b.dll, which were independently compiled for the same base address, both can't be loaded there, so being able to relocate one resolves the conflict.

At the machine code level, this is fine because most jumps and calls use a relative instruction offset, not an absolute address. Certain constructs, however, are dynamically patched when the module is loaded, or use some form of "table" that is populated with the correct addresses.

See also Relocation (computing)

Fire Lancer
9

This is a security technique called address space layout randomization.

It deliberately moves things around on each execution, to make it more difficult for attackers to know where bits of data are in your process and hack them.

Lightness Races in Orbit
  • Isn't it the address of the loaded location? Until now I thought it was the real address where the program is mounted? – TruthSeeker Jan 10 '20 at 14:16
  • 2
    @TruthSeeker this method only works since the introduction of ̵C̵h̵e̵e̵s̵e̵c̵a̵k̵e̵ PIE (position independent executables). Use the flag `-no-pie`, this will load the main binary at a fixed location and the address of `main()` will no longer vary. – Ctx Jan 10 '20 at 14:25