I want to create my own linker and loader. I know that in the linking stage the linker will take into consideration the relocation data in the ELF header for all the object files.

The linker will then create an executable file with all the addresses resolved and store it on the hard drive.

When the time comes, the loader will have to load that executable into main memory, but the memory already contains running programs, so there will be conflicts.

Question1: Must the loader relocate the addresses all over again?
Question2: If yes, does that mean that the loader must scan all the text sections of the executable and change the addresses in all CPU instructions?*

*That would mean that the loader has a copy of the ISA in memory and must scan instruction by instruction. It's like an execution before the execution.

1 Answer

There is no relocation data in the ELF header. Relocatable ELF object files store relocation data in subservient sections named .rela.text, .rela.data, etc. The static linker on Linux chooses the starting address where the executable image will be loaded (usually 0x08048000 for 32-bit x86) and then uses the relocations to update instructions and data in the code and data sections. Once .rela.text and .rela.data have been applied, these subservient .rela sections are no longer needed and may be stripped from the final ELF executable file.
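
To make the linker's job concrete, here is a minimal sketch of applying relocation records to a code section. The record layout `(offset, kind, symbol, addend)`, the kind names `ABS32` and `PC32`, and the symbol names `func` and `counter` are hypothetical simplifications for illustration, not the real ELF structures. The formulas S + A and S + A - P follow the usual absolute and PC-relative conventions.

```python
import struct

BASE = 0x08048000  # link address mentioned in the answer

def apply_relocations(section, section_addr, relocs, symbols):
    """Patch 32-bit little-endian fields of `section` in place."""
    for offset, kind, symbol, addend in relocs:
        s = symbols[symbol]          # S: final address of the symbol
        p = section_addr + offset    # P: address of the field being patched
        if kind == "ABS32":          # absolute address: S + A
            value = s + addend
        elif kind == "PC32":         # PC-relative displacement: S + A - P
            value = s + addend - p
        else:
            raise ValueError(f"unknown relocation kind: {kind}")
        struct.pack_into("<I", section, offset, value & 0xFFFFFFFF)

# x86: call rel32 (opcode E8), then mov eax, [abs32] (opcode A1),
# both emitted by the assembler with zeroed address fields.
text = bytearray(b"\xE8\x00\x00\x00\x00\xA1\x00\x00\x00\x00")
symbols = {"func": BASE + 0x100, "counter": BASE + 0x2000}
relocs = [
    (1, "PC32", "func", -4),     # rel32 counts from the end of the 4-byte field
    (6, "ABS32", "counter", 0),
]
apply_relocations(text, BASE, relocs, symbols)
```

Only the two four-byte fields named by the records are rewritten. The opcode bytes are never decoded, so neither the linker nor a relocating loader has to understand the instruction set.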

When the time comes to load the linked executable file into memory, the loader creates a new process in protected mode. The whole virtual address space is assigned to that process and it is unoccupied. Other programs may be loaded on the same computer, but each of them runs happily in its own private address space.

The scenario you're afraid of sometimes happens on Windows, when different dynamic libraries were linked to start at conflicting virtual addresses. Therefore the Portable Executable format (PE/DLL) keeps relocation records in the subservient section .reloc, and yes, the loader must then relocate all addresses mentioned in this section.
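
The rebasing step can be sketched as follows. This assumes a simplified, hypothetical flat list of fixup offsets rather than the real block-structured `.reloc` format, and handles 32-bit absolute fields only. The function name `rebase` and the example base addresses are made up for illustration.

```python
import struct

def rebase(image, preferred_base, actual_base, fixup_offsets):
    """Add the load delta to every 32-bit field listed in fixup_offsets."""
    delta = actual_base - preferred_base
    if delta == 0:
        return 0                      # loaded where it was linked: nothing to do
    for off in fixup_offsets:
        (old,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (old + delta) & 0xFFFFFFFF)
    return delta

# Hypothetical DLL linked for base 0x10000000 but loaded at 0x20000000.
image = bytearray(16)
struct.pack_into("<I", image, 4, 0x10001234)   # a stored absolute address
delta = rebase(image, 0x10000000, 0x20000000, [4])
```

When the image lands at its preferred base, the delta is zero and the loader can skip the relocation records entirely.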

The situation is similar on DOS in real mode, where there is only a single 1 MiB address space shared by all processes. MZ executables are linked to virtual address 0, and all addresses which require relocation are listed in the relocation pointer table following the MZ EXE header; the loader is responsible for updating the segment addresses that this pointer table refers to.
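
A minimal sketch of that DOS-style fixup, assuming the relocation pointer table has already been parsed into hypothetical `(offset, segment)` pairs, each locating one 16-bit segment word inside the load image. The function name `relocate_mz` and the sample values are illustrative only.

```python
import struct

def relocate_mz(image, reloc_table, load_segment):
    """Add the load segment to every 16-bit word named by the table."""
    for offset, segment in reloc_table:
        pos = segment * 16 + offset   # real-mode address within the image
        (word,) = struct.unpack_from("<H", image, pos)
        struct.pack_into("<H", image, pos, (word + load_segment) & 0xFFFF)

# Image linked at segment 0; the word at image offset 0x12 holds segment 0x0002.
image = bytearray(0x40)
struct.pack_into("<H", image, 0x12, 0x0002)
relocate_mz(image, [(0x0002, 0x0001)], load_segment=0x1000)
```

Here the table entry `0001:0002` resolves to image offset 0x12, and the stored segment 0x0002 becomes 0x1002 once the image is loaded at segment 0x1000.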

Answer1: Relocation is necessary only if the executable image is loaded at a different address than the one it was linked for, and if it was not linked as a Position-Independent Executable.
Answer2: Relocation does not concern the addresses of all CPU instructions, only those fields in the instruction body (displacement or immediate address) which refer to an address. Such places must be explicitly specified in relocation records. If the relocation information was stripped from the file and the image cannot be loaded at its link address, your loader should refuse execution.

Good sources of information: the Linkers series and the blog by Ian Lance Taylor.

vitsoft
  • Thank you, and a follow-up question: does the CPU provide an interface for virtual memory? Example: let's say that a jump instruction sets the program counter to address 0x0001, but the program itself is loaded at memory address 0x0100, so the actual jump address is 0x0101. Is there a register that works as a base (used as an offset)? – Boliotis Manousos Jun 27 '22 at 10:42
  • @BoliotisManousos The CPU boots in real mode, and of course it is able to switch to [protected mode](https://stackoverflow.com/questions/5211541/bootloader-switching-processor-to-protected-mode) and use virtual memory then. Do you intend to write your own OS? The linker and loader of executable files already run in protected mode. Your example seems to assume real mode, with segment register `CS` pointing to the start of the image in memory and `IP` used as an offset. See [loading MZ EXE](http://www.techhelpmanual.com/354-exe_file_header_layout.html) for more info. – vitsoft Jun 27 '22 at 11:01
  • First I want to create a compiler, assembler and linker for a custom language. When this is ready I want to try to create a simple OS with this language. I love systems programming and I want to try some things, but because systems programming is not the most popular topic these days (framework this, framework that) I can't find reliable information. – Boliotis Manousos Jun 27 '22 at 11:13
  • Afaik Windows since XP can cache relocated binaries to avoid repeated relocations. This was part of the startup time optimization going from w2k to XP. "Linkers and Loaders" by John R Levine (comp.compilers moderator) is also a good online resource – Marco van de Voort Jun 27 '22 at 12:23