Basically, after linking, the ELF file format provides all the information the loader needs to load the program into memory and run it.
Each piece of code and data is placed at an offset inside a section (the data section, the text section, etc.), and a specific function or global variable is accessed by adding the proper offset to the section's start address.
The ELF file format also includes a program header table:
An executable or shared object file's program header table is an array
of structures, each describing a segment or other information that the
system needs to prepare the program for execution. An object file
segment contains one or more sections, as described in "Segment
Contents".
The OS loader then uses those structures to load the image into memory. The structure:
    typedef struct {
        Elf32_Word p_type;
        Elf32_Off  p_offset;
        Elf32_Addr p_vaddr;
        Elf32_Addr p_paddr;
        Elf32_Word p_filesz;
        Elf32_Word p_memsz;
        Elf32_Word p_flags;
        Elf32_Word p_align;
    } Elf32_Phdr;
Note the following fields:
p_vaddr
The virtual address at which the first byte of the segment resides in memory
p_offset
The offset from the beginning of the file at which the first byte of
the segment resides.
And p_type
The kind of segment this array element describes or how to interpret the array element's information. Type values and their meanings are specified in Table 7-35.
From Table 7-35, note PT_LOAD:
Specifies a loadable segment, described by p_filesz and p_memsz. The
bytes from the file are mapped to the beginning of the memory segment.
If the segment's memory size (p_memsz) is larger than the file size
(p_filesz), the extra bytes are defined to hold the value 0 and to
follow the segment's initialized area. The file size can not be larger
than the memory size. Loadable segment entries in the program header
table appear in ascending order, sorted on the p_vaddr member.
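The p_filesz / p_memsz rule quoted above can be sketched roughly as follows. This is only an illustration, assuming the whole file is already in a buffer and the destination has room for p_memsz bytes; `copy_load_segment` is a made-up helper name, not a function from any real loader.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a real loader API): copy one PT_LOAD segment
 * from an in-memory copy of the file into dest, which must hold p_memsz
 * bytes. Returns 0 on success, -1 if the header is inconsistent. */
int copy_load_segment(const uint8_t *file, size_t file_len,
                      uint32_t p_offset, uint32_t p_filesz, uint32_t p_memsz,
                      uint8_t *dest)
{
    assert(file != NULL && dest != NULL);

    if (p_filesz > p_memsz)   /* the file size cannot exceed the memory size */
        return -1;
    if (p_offset > file_len || p_filesz > file_len - p_offset)
        return -1;            /* the segment would run past the end of the file */

    /* Bytes from the file go to the beginning of the memory segment. */
    memcpy(dest, file + p_offset, p_filesz);
    /* The extra p_memsz - p_filesz bytes follow them and hold the value 0. */
    memset(dest + p_filesz, 0, p_memsz - p_filesz);
    return 0;
}
```

A real loader would normally map pages with mmap instead of copying, but the copy-then-zero-fill result is the same idea.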
So, by looking at those fields (and more), the loader can locate the segments (each of which can contain multiple sections) within the ELF file and load the PT_LOAD ones into memory at a given virtual address.
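To make that concrete, here is a small sketch of how a loader might walk the program header table and pick out the loadable segments. It assumes the table has already been read into an array. The types are local stand-ins for the ones in `<elf.h>` so the snippet is self-contained, and `list_load_segments` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Local stand-ins for the <elf.h> definitions (32-bit ELF). */
typedef uint32_t Elf32_Word, Elf32_Off, Elf32_Addr;
typedef struct {
    Elf32_Word p_type;
    Elf32_Off  p_offset;
    Elf32_Addr p_vaddr;
    Elf32_Addr p_paddr;
    Elf32_Word p_filesz;
    Elf32_Word p_memsz;
    Elf32_Word p_flags;
    Elf32_Word p_align;
} Elf32_Phdr;

#define PT_LOAD 1   /* loadable segment */

/* Hypothetical helper: print every PT_LOAD entry and return how many
 * were found. Other entry types are skipped. */
size_t list_load_segments(const Elf32_Phdr *phdr, size_t count)
{
    size_t loadable = 0;

    assert(phdr != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (phdr[i].p_type != PT_LOAD)
            continue;
        printf("file offset 0x%x (%u bytes) -> vaddr 0x%x (%u bytes)\n",
               (unsigned)phdr[i].p_offset, (unsigned)phdr[i].p_filesz,
               (unsigned)phdr[i].p_vaddr, (unsigned)phdr[i].p_memsz);
        loadable++;
    }
    return loadable;
}
```

On a real system, `readelf -l` prints the same table.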
Now, can the virtual address of an ELF file segment be changed at runtime (load time)? Yes:
The virtual addresses in the program headers might not represent the
actual virtual addresses of the program's memory image. See "Program
Loading (Processor-Specific)".
So, the program header table describes the segments the OS loader will load into memory (loadable segments, which contain loadable sections), but the virtual addresses at which the loader places them can differ from the addresses in the ELF file.
How?
To understand it, let's first read about the base address:
Executable and shared object files have a base address, which is the
lowest virtual address associated with the memory image of the
program's object file. One use of the base address is to relocate the
memory image of the program during dynamic linking.
An executable or shared object file's base address is calculated
during execution from three values: the memory load address, the
maximum page size, and the lowest virtual address of a program's
loadable segment. The virtual addresses in the program headers might
not represent the actual virtual addresses of the program's memory
image. See "Program Loading (Processor-Specific)".
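As a rough sketch of that calculation: as I read the generic ABI, you take the memory address at which the loadable segment with the lowest p_vaddr ended up and truncate it to a multiple of the maximum page size. Both function names below are made up for illustration. The "load bias" helper is not in the quoted text; it is only my shorthand for the difference between where the lowest segment was placed and where the file said it should go.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: base address = memory address of the lowest
 * loadable segment, truncated to a multiple of the maximum page size.
 * Assumes max_page_size is a power of two. */
uint32_t elf_base_address(uint32_t load_addr, uint32_t max_page_size)
{
    assert(max_page_size != 0 && (max_page_size & (max_page_size - 1)) == 0);
    return load_addr & ~(max_page_size - 1);
}

/* Hypothetical helper: how far the image was shifted from the addresses
 * recorded in the file (0 when p_vaddr is used unchanged). */
uint32_t elf_load_bias(uint32_t load_addr, uint32_t lowest_vaddr)
{
    return load_addr - lowest_vaddr;
}
```

For example, with a lowest segment placed at 0x56555034 and a 0x1000 maximum page size, `elf_base_address` gives 0x56555000.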
So the practice is the following:
position-independent code. This code enables a segment's virtual
address change from one process to another, without invalidating
execution behavior.
Though the system chooses virtual addresses for individual processes,
it maintains the relative positions of the segments. Because
position-independent code uses relative addressing between segments,
the difference between virtual addresses in memory must match the
difference between virtual addresses in the file.
So with relative addressing (as in a PIE, a position-independent executable), the actual placement in memory can differ from the addresses recorded in the ELF file.
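The "relative positions are maintained" part can be illustrated with a tiny sketch: if every segment is shifted by the same amount, the distances between segments stay the same as in the file. `place_segments` and the `load_bias` parameter are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: shift every segment's file virtual address by
 * the same load bias. Because the shift is uniform,
 * mem_vaddr[j] - mem_vaddr[i] == file_vaddr[j] - file_vaddr[i]. */
void place_segments(const uint32_t *file_vaddr, uint32_t *mem_vaddr,
                    size_t count, uint32_t load_bias)
{
    assert(file_vaddr != NULL && mem_vaddr != NULL);
    for (size_t i = 0; i < count; i++)
        mem_vaddr[i] = file_vaddr[i] + load_bias;
}
```

This is why code that reaches its data through a fixed distance from the current instruction keeps working wherever the image lands.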
From PeterCordes's answer:
0x400000 is the Linux default base address for loading PIE
executables with ASLR disabled (like GDB does by default).
So for your specific case (a PIE executable on Linux), the loader picks this base address.
Of course, position independence is just an option. A program can be compiled without it, and then absolute addressing is used, in which case the address a segment is loaded at must match its address in the ELF file:
Executable file segments typically contain absolute
code. For the process to execute correctly, the segments must reside
at the virtual addresses used to create the executable file. The
system uses the p_vaddr values unchanged as virtual addresses.
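Putting the two cases side by side, a loader's choice could be sketched like this. ET_EXEC and ET_DYN are the real e_type values from the ELF header. `segment_runtime_vaddr` and `chosen_base` are hypothetical, and the sketch assumes the position-independent image is linked with its lowest address at 0.

```c
#include <assert.h>
#include <stdint.h>

#define ET_EXEC 2   /* executable file: absolute code       */
#define ET_DYN  3   /* shared object, also used for PIE     */

/* Hypothetical helper: where a segment ends up in memory.
 * ET_EXEC: p_vaddr is used unchanged.
 * ET_DYN:  the loader picks a base and shifts every segment by it. */
uint32_t segment_runtime_vaddr(uint16_t e_type, uint32_t p_vaddr,
                               uint32_t chosen_base)
{
    assert(e_type == ET_EXEC || e_type == ET_DYN);
    uint32_t bias = (e_type == ET_DYN) ? chosen_base : 0;
    return bias + p_vaddr;
}
```

With this sketch, an ET_EXEC segment at 0x08048000 stays at 0x08048000 whatever base is passed, while an ET_DYN segment at 0x1000 with a chosen base of 0x56555000 lands at 0x56556000.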
I would recommend taking a look at the Linux implementation of ELF image loading here, and at those two SO threads here and here.
The quoted paragraphs are taken from the Oracle ELF documentation (here and here).