000000000040050f <oranges>:
...
40053f: 89 cf             mov %eax,%edi
400541: e8 a7 ff ff ff    callq 4004ed<apples>
400546: 0f af c3          imul %ebx, %eax
...

Oranges calls apples twice. Apples starts at 0x00000000004004ed. However, in the second call to apples in the machine code, the number in the function call is 0xFFFFFFA7

I need to figure out what number the CPU added to 0xFFFFFFA7 to get the address of apples, 0x00000000004004ed.

Essentially, I need to do a subtraction problem. How do I subtract 0xFFFFFFA7 from 0x00000000004004ed in order to find what the CPU added it to? What is the right way to convert these values so that this makes sense?

Peter Cordes

2 Answers


call rel32 is relative to the end of the call instruction.

And the little-endian rel32 is a 2's complement integer, so 0xFFFFFFA7 is a small negative number: 0xFFFFFFA7 - 2^32 = -89 (decimal), i.e. jump 89 bytes backwards.
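As a rough illustration of that reinterpretation, here is a minimal Python sketch. The variable names are only for illustration; the bytes are the ones from the listing in the question.

```python
# Machine-code bytes of the call instruction, copied from the objdump listing.
insn = bytes.fromhex("e8 a7 ff ff ff")

opcode = insn[0]  # 0xE8 is the opcode of call rel32

# The 4 bytes after the opcode are the displacement, stored little-endian.
rel32_unsigned = int.from_bytes(insn[1:5], "little")
rel32_signed = int.from_bytes(insn[1:5], "little", signed=True)

# The same reinterpretation done by hand: if the top bit is set, subtract 2^32.
if rel32_unsigned & 0x80000000:
    manual = rel32_unsigned - (1 << 32)
else:
    manual = rel32_unsigned

print(hex(rel32_unsigned), rel32_signed, manual)  # 0xffffffa7 -89 -89
```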

You correctly decoded the 4-byte little-endian rel32 displacement into a binary integer, but didn't reinterpret it as signed 2's complement. (It's not subtracted, it's added. That's why it's negative).

e8 a7 ff ff ff callq rel32 ends at address 0x400546 (the start of the next instruction), so that will be RIP during its execution. New RIP after executing it will be
0x400546 - 89 = 0x4004ed, same as objdump -d printed.
objdump of course calculated that address the same way I did.
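The same calculation can be sketched in Python. `call_target` is a hypothetical helper written for this answer, not something objdump exposes; it only handles the 5-byte `e8 rel32` form.

```python
def call_target(insn_addr: int, insn: bytes) -> int:
    """Return the destination of a 5-byte `e8 rel32` call located at insn_addr."""
    if len(insn) != 5 or insn[0] != 0xE8:
        raise ValueError("expected a 5-byte e8 rel32 call")
    # Signed little-endian displacement following the opcode byte.
    rel32 = int.from_bytes(insn[1:], "little", signed=True)
    # The displacement is relative to the END of the call instruction.
    end_of_insn = insn_addr + len(insn)
    return end_of_insn + rel32


target = call_target(0x400541, bytes.fromhex("e8a7ffffff"))
print(hex(target))  # 0x4004ed
```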

(objdump probably sign-extended the displacement to 64-bit before adding it to a 64-bit code address. Working out that the bit-pattern 0xFFFFFFA7 means -89 decimal as a 2's complement integer is basically like reading those 4 bytes into an int32_t and adding it to a uint64_t.)

Intel's manual entry for call (https://www.felixcloutier.com/x86/call) also describes the process as sign-extending the rel32 for binary addition, but that's just another way to express the same math in a more machine-friendly way. Other than the sign-extension, all of this works identically in any mode for direct relative call and jmp instructions. jmp rel8 uses an 8-bit 2's complement branch displacement.
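To show that the "sign-extend, then add" view gives the same result as the signed arithmetic, here is a small Python sketch. `sign_extend` and `MASK64` are names made up for this example, and the `jmp rel8` case uses a hypothetical address of 0x1000 for an `eb fe` instruction (a jump to itself).

```python
MASK64 = (1 << 64) - 1  # keep results within 64 bits, like a 64-bit register


def sign_extend(value: int, from_bits: int, to_bits: int = 64) -> int:
    """Widen a from_bits-wide two's complement bit pattern to to_bits bits."""
    sign_bit = 1 << (from_bits - 1)
    value &= (1 << from_bits) - 1
    signed = (value ^ sign_bit) - sign_bit  # Python int holding the signed value
    return signed & ((1 << to_bits) - 1)    # back to an unsigned bit pattern


# call rel32: 64-bit wrapping addition of the sign-extended displacement.
rel32_wide = sign_extend(0xFFFFFFA7, 32)
target = (0x400546 + rel32_wide) & MASK64
print(hex(rel32_wide), hex(target))  # 0xffffffffffffffa7 0x4004ed

# jmp rel8: same math with an 8-bit displacement and a 2-byte instruction.
# Hypothetical example: `eb fe` placed at address 0x1000 jumps to itself.
rel8_wide = sign_extend(0xFE, 8)
jmp_target = (0x1000 + 2 + rel8_wide) & MASK64
print(hex(jmp_target))  # 0x1000
```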


Semi-related: "How does $ work in NASM, exactly?" has an example of manually encoding a call to a given target address.

Peter Cordes

At a glance [and loosely] ...

You're using an x86 in 64 bit mode.

64 bit mode has a special addressing mode known as "RIP relative" addressing.

Edit: From Peter, the addressing mode is actually call rel32, rather than RIP relative, although the offset calculations will be the same.

The %rip register is the program counter. It changes on each instruction.

So, when using this mode, the offset is how far away the target address (e.g. apples) is from the address of the current instruction (from the address in %rip for the instruction).

Since you have two callq instructions (from your description, but not shown in the code), they each have a different address, so the offset to apples will be different.

This allows for "position independent code". It also means the instruction carries an offset, which is usually smaller than a full 64 bit absolute address. Because the offset is a signed 32 bit quantity, the callq instruction (opcode + offset) is only 5 bytes, rather than the 9 bytes that an opcode plus a full 64 bit address would need.


UPDATE:

I thought rip may be involved. In this particular instance, can you help me decipher how to find the rip% or kind of walk through this specific problem?

You could run objdump --disassemble myprogram and read the disassembly. Or, you could do this with the debugger (e.g. gdb) using the disassemble command.

From your listing, the address of the callq is 0x400541 and [you mentioned that] apples is at 0x4004ed.

So, the offset from the start of the callq instruction is:

-84 FFFFFFFFFFFFFFAC

But, the instruction has an offset of:

0xFFFFFFA7

(Remember that the disassembly just puts out the bytes, so we have to manually reverse the bytes because the offset is little-endian).

So, this means that the %rip value used is not the start of the instruction, but, rather the end of the instruction.

So, we have to adjust the offset by the length of the instruction [which is 5] to get 0xFFFFFFA7. That is, the %rip value [used by] the callq instruction is the address of the instruction + 5. In pseudo code, the calculation is:

offset = apples - (&callq + 5)
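Going the other way, the pseudo code above can be turned into a small Python sketch that rebuilds the instruction bytes from the two addresses. `encode_call` is a hypothetical helper name, and the range check is an added assumption about what a careful encoder would do.

```python
import struct

CALL_LEN = 5  # 1 opcode byte (0xE8) + 4 displacement bytes


def encode_call(insn_addr: int, target: int) -> bytes:
    """Encode `call target` as e8 rel32 for an instruction placed at insn_addr."""
    offset = target - (insn_addr + CALL_LEN)  # relative to the END of the call
    if not -(1 << 31) <= offset < (1 << 31):
        raise ValueError("target is out of range for a rel32 displacement")
    # "<Bi": little-endian, one unsigned byte, then one signed 32-bit integer.
    return struct.pack("<Bi", 0xE8, offset)


callq_addr = 0x400541
apples = 0x4004ED

print(apples - callq_addr)                    # -84 (from the start of the call)
print(apples - (callq_addr + CALL_LEN))       # -89 (from the end of the call)
print(encode_call(callq_addr, apples).hex())  # e8a7ffffff
```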
Craig Estey
  • I thought rip may be involved. In this particular instance, can you help me decipher how to find the rip% or kind of walk through this specific problem? –  Nov 10 '20 at 04:58
  • @craig: `call rel32` is not new for 64-bit mode; the E8 opcode dates back all the way to 8086 (where it was `call rel16` in 16-bit mode of course. [Why do x86 jump/call instructions use relative displacements instead of absolute destinations?](https://stackoverflow.com/q/46184755)). What's new in 64-bit mode is RIP-relative addressing for *data*, like `mov eax, [RIP + symbol]`. (And also that call can't reach the entire address space, because immediates / displacements are still 32-bit, so yes it's sign-extended to 64-bit before adding to the address of the end of the call / jmp instruction) – Peter Cordes Nov 10 '20 at 05:05
  • PIC was possible in 32-bit code as well, with inconveniences only for data access. – Peter Cordes Nov 10 '20 at 05:06
  • @PeterCordes Hi, Peter. I put "loosely" at the top because I suspected you'd be around :-) – Craig Estey Nov 10 '20 at 05:21
  • @PeterCordes So, from your comment [and answer] the instruction is actually `call rel32` rather than `RIP relative`? So, even if the offset calculations are the same, it's a misnomer to call it `RIP relative` here? If so, I'll adjust my answer. – Craig Estey Nov 10 '20 at 05:28
  • Yes, I wouldn't use the phrase "RIP relative". That's almost always used to refer to the ModRM *data* addressing mode that x86-64 added. In a computer science class, if you wanted to talk about the "addressing mode" that branches use, you would say it's PC relative or RIP relative, but only if someone asked about that in the first place. Relative branches are the norm in many ISAs, especially ones wider than 16-bit. In x86 terms, given that memory-indirect branches like `call [rip + global_func_ptr]` are possible, we normally reserve "addressing mode" for *data* accesses. – Peter Cordes Nov 10 '20 at 05:46
  • `call rel32` is a handy way to unambiguously reference one of the encodings for that mnemonic as listed in the manual. https://www.felixcloutier.com/x86/call. You could also say `e8 call` to name it by opcode, or "direct near call" (because the only non-indirect near call is this one. In 16 and 32-bit mode there's a direct far call which takes an absolute `ptr16:32`, but there's no 64-bit form of it). `call rel32` is useful by analogy with jumps: `jmp rel32` and `jmp rel8` both exist, and target calculation works identically to call rel32. I don't know if Intel formally uses these names. – Peter Cordes Nov 10 '20 at 05:48
  • @PeterCordes An FYI ... S.I.W.O.T.I. You may want to see: https://stackoverflow.com/questions/64773308/small-vs-identical-types-of-loop-variables-in-c-c-for-performance#64773308 to explain _why_ `int_fast32_t` or `int_fast16_t` generate the 64 bit variants (e.g. `add $1,%rax`) Or, at least comment and link to one of your existing answers that [IIRC] covers in detail why 64 bit is actually faster [with register renaming, etc.] – Craig Estey Nov 10 '20 at 18:15
  • @CraigEstey: ugh, 64-bit isn't faster than 32-bit. Only 16-bit is sometimes slower. I have a half-finished answer on [Cpp uint32\_fast\_t resolves to uint64\_t but is slower for nearly all operations than a uint32\_t (x86\_64). Why does it resolve to uint64\_t?](https://stackoverflow.com/q/63045795) that I should finish and post; for almost all use-cases, that was a bad choice on gcc/glibc's part. Thanks for the link. – Peter Cordes Nov 10 '20 at 22:50