I am building a RISC-V emulator that loads a whole ELF file into memory.
Until now, I have used the precompiled test binaries provided by the RISC-V Foundation, which conveniently have their entry point exactly at the start of the .text section.
For example:
> riscv32-unknown-elf-objdump ../riscv32i-emulator/tests/simple -d
../riscv32i-emulator/tests/simple: file format elf32-littleriscv
Disassembly of section .text.init:
80000000 <_start>:
80000000: 0480006f j 80000048 <reset_vector>
...
Going into this project, I didn't know much about ELF files, so I just assumed that every ELF's entry point is exactly the same as the start of the .text section.
The problem arose when I compiled my own binaries: the actual entry point is not always the start of the .text section, but can be anywhere inside it, like here:
> riscv32-unknown-elf-objdump a.out -d
a.out: file format elf32-littleriscv
Disassembly of section .text:
00010074 <register_fini>:
10074: 00000793 li a5,0
10078: 00078863 beqz a5,10088 <register_fini+0x14>
1007c: 00010537 lui a0,0x10
10080: 43850513 addi a0,a0,1080 # 10438 <__libc_fini_array>
10084: 3a00006f j 10424 <atexit>
10088: 00008067 ret
0001008c <_start>:
1008c: 00002197 auipc gp,0x2
10090: cec18193 addi gp,gp,-788 # 11d78 <__global_pointer$>
...
So, after reading more about ELF files, I found out that the actual entry point address is given by the entry point field (e_entry) in the ELF header:
> riscv32-unknown-elf-readelf a.out -h | grep Entry
Entry point address: 0x1008c
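The same value can be read straight from the file. This is a minimal sketch, assuming a little-endian ELF32 image already loaded into a bytes object; the function name read_entry_point is hypothetical. In the ELF32 header, e_entry is the 4-byte field at offset 24.

```python
import struct


def read_entry_point(elf_bytes: bytes) -> int:
    """Return e_entry (a virtual address) from a little-endian ELF32 image."""
    if elf_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] is the class (1 = 32-bit), e_ident[5] the data encoding (1 = little-endian).
    if elf_bytes[4] != 1 or elf_bytes[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF32")
    # ELF32 header layout: e_ident[16], e_type (2), e_machine (2), e_version (4), e_entry (4), ...
    (e_entry,) = struct.unpack_from("<I", elf_bytes, 24)
    return e_entry
```

The result is still a virtual address, exactly as readelf reports it.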
The problem now is that this is a virtual address, not an offset from byte 0 of the file, so if I set my emulator's program counter to it, the emulator would crash.
Reading a bit more, I saw people mention calculations involving offsets from the program headers, but no one gave a concrete answer.
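The program-header calculation mentioned above can be sketched as follows. This is a minimal sketch, assuming a little-endian ELF32 image held in a bytes object; the function name vaddr_to_file_offset is hypothetical. For a PT_LOAD segment whose file-backed range contains the virtual address, the file offset is vaddr - p_vaddr + p_offset.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments


def vaddr_to_file_offset(elf_bytes: bytes, vaddr: int) -> int:
    """Translate a virtual address to an offset from byte 0 of the ELF file."""
    # ELF32 header: e_phoff at offset 28, e_phentsize at 42, e_phnum at 44.
    (e_phoff,) = struct.unpack_from("<I", elf_bytes, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf_bytes, 42)

    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # ELF32 program header starts with: p_type, p_offset, p_vaddr, p_paddr, p_filesz.
        p_type, p_offset, p_vaddr, _p_paddr, p_filesz = struct.unpack_from(
            "<5I", elf_bytes, base
        )
        # Only bytes within p_filesz actually exist in the file.
        if p_type == PT_LOAD and p_vaddr <= vaddr < p_vaddr + p_filesz:
            return vaddr - p_vaddr + p_offset

    raise ValueError(f"address {vaddr:#x} is not backed by any PT_LOAD segment")
```

Passing the e_entry value from the header would give the position of the first instruction inside the file. Addresses that fall only within p_memsz (such as .bss) have no file offset, which is why the sketch raises an error for them.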
My question is: what is the actual "formula" for getting the address of the _start procedure as an offset from byte 0 of the file?
Just to be clear, my emulator doesn't support virtual memory, and the binary is the only thing loaded into its memory, so I have no use for the virtual memory abstraction. I just want every memory address to be a physical offset into the file on disk.