Take the following assembly program:

_start:
    mov myvar,       %rax
    mov myvar(%rip), %rax
    mov myvar(%rip), %rax
    mov myvar(%rip), %rax
    mov myvar(%rip), %rax

This produces the following when run in gdb:

 0x00000000004000b0  mov    0x600107,%rax
 0x00000000004000b8  mov    0x200048(%rip),%rax        # 0x600107
 0x00000000004000bf  mov    0x200041(%rip),%rax        # 0x600107
 0x00000000004000c6  mov    0x20003a(%rip),%rax        # 0x600107
 0x00000000004000cd  mov    0x200033(%rip),%rax        # 0x600107

Of course, it's not surprising that all the myvar references resolve to 0x600107. But where (or perhaps "when" is the better question) do the %rip-relative operands get resolved to an actual address? How does that process work at a high level?

Related: Why does this MOVSS instruction use RIP-relative addressing?

carl.hiass
  • It's resolved at run-time by the CPU, that's why it's a special addressing mode. If it was just a source-level thing, it wouldn't need separate syntax. – Peter Cordes Aug 25 '20 at 06:10
  • See [what does "mov offset(%rip), %rax" do?](https://stackoverflow.com/q/29421766). Also [Can rip be used with another register with RIP-relative addressing?](https://stackoverflow.com/q/48124293) explains some details about the machine-code encoding. Also https://wiki.osdev.org/X86-64_Instruction_Encoding#32.2F64-bit_addressing. [How do RIP-relative variable references like "\[RIP + \_a\]" in x86-64 GAS Intel-syntax work?](https://stackoverflow.com/q/54745872) has some examples of the machine code. – Peter Cordes Aug 25 '20 at 06:18
  • Anyway, are you asking how the CPU decodes machine code that uses a RIP-relative address? Or how the tools (e.g. assembler or linker) figure out the right relative address in the first place? – Peter Cordes Aug 25 '20 at 06:20
  • @PeterCordes I would say a bit of both, but if the assembler/linker basically 'defers' it to the cpu then yea that would be really cool to understand (at a basic level) how that works and is decoded. – carl.hiass Aug 25 '20 at 06:35
  • Go read https://wiki.osdev.org/X86-64_Instruction_Encoding#32.2F64-bit_addressing then. x86-64 repurposed one of the 2 redundant ways to encode `[disp32]` (with/without SIB); the shorter one (without a SIB) means `[rip+rel32]` instead of `[disp32]`. The CPU knows the address of the end of the instruction so it can do the math. – Peter Cordes Aug 25 '20 at 06:43
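
To make the arithmetic in the comments concrete, here is a rough sketch in Python (not assembly, purely so the numbers can be checked) that reproduces the gdb listing from the question. It assumes each RIP-relative `mov` is 7 bytes long, which is inferred from the spacing of the addresses in the listing rather than stated anywhere in the question. The function names are made up for illustration. The idea is that the toolchain stores `target - address_of_next_instruction` as the 32-bit displacement, and the CPU adds that displacement back to RIP (which already points past the instruction) at run time.

```python
# Values taken from the gdb listing in the question.
MYVAR = 0x600107   # absolute address every reference resolves to
INSN_LEN = 7       # assumption: inferred from 0x4000bf - 0x4000b8 in the listing

def link_time_disp32(insn_addr, target, insn_len=INSN_LEN):
    """Displacement stored in the instruction (hypothetical helper):
    target minus the address of the *next* instruction."""
    return target - (insn_addr + insn_len)

def run_time_address(insn_addr, disp32, insn_len=INSN_LEN):
    """Address the CPU forms when executing (hypothetical helper):
    RIP, already pointing past this instruction, plus disp32."""
    return insn_addr + insn_len + disp32

# (instruction address, displacement shown by gdb) for the four RIP-relative movs
listing = [
    (0x4000b8, 0x200048),
    (0x4000bf, 0x200041),
    (0x4000c6, 0x20003a),
    (0x4000cd, 0x200033),
]

for addr, disp in listing:
    target = run_time_address(addr, disp)
    print(f"{addr:#x}: disp32={disp:#x} -> {target:#x}")
```

Each instruction sits 7 bytes further along, so its stored displacement is 7 smaller, and the sum always lands on the same address. That is why gdb shows a different offset on every line but the same `# 0x600107` comment.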
