
I have this function pointer and this code:

0x0000555555556e80 <+0>:     push   %rbp
0x0000555555556e81 <+1>:     mov    %rsp,%rbp
0x0000555555556e84 <+4>:     sub    $0x10,%rsp
0x0000555555556e88 <+8>:     movl   $0x0,-0x4(%rbp)
0x0000555555556e8f <+15>:    movslq -0x4(%rbp),%rcx
0x0000555555556e93 <+19>:    lea    0x7406(%rip),%rax        # 0x55555555e2a0 <init_functions>
0x0000555555556e9a <+26>:    cmpq   $0x0,(%rax,%rcx,8)
0x0000555555556e9f <+31>:    je     0x555555556ec1 <initialize_bomb+65>
0x0000555555556ea5 <+37>:    movslq -0x4(%rbp),%rcx
0x0000555555556ea9 <+41>:    lea    0x73f0(%rip),%rax        # 0x55555555e2a0 <init_functions>
0x0000555555556eb0 <+48>:    call   *(%rax,%rcx,8)
0x0000555555556eb3 <+51>:    mov    -0x4(%rbp),%eax
0x0000555555556eb6 <+54>:    add    $0x1,%eax
0x0000555555556eb9 <+57>:    mov    %eax,-0x4(%rbp)
0x0000555555556ebc <+60>:    jmp    0x555555556e8f <initialize_bomb+15>
0x0000555555556ec1 <+65>:    add    $0x10,%rsp
0x0000555555556ec5 <+69>:    pop    %rbp
0x0000555555556ec6 <+70>:    ret    

It's a loop that iterates 11 times, and I really don't know how to handle the function pointer. When there's the call *(%rax,%rcx,8), the two registers involved (RAX and RCX) change but I don't understand how or why, since I don't know what happens in that call...

I can't put breakpoints. I don't know what to do.

Sep Roland

3 Answers


call *address loads a function pointer from memory into RIP, using standard AT&T syntax for an addressing mode (or register name) following the *. See What does an asterisk * before an address mean in x86-64 AT&T assembly? So this pushes a return address then loads a new RIP from the address [rax + rcx*8].

The call *foo syntax (EIP/RIP = dword/qword loaded from memory at foo, memory-indirect) has an asterisk to disambiguate from call foo (RIP = address of foo, a direct call rel32), in case you were using just a bare symbol name as the addressing mode.

In 64-bit mode you'd normally use call *foo(%rip) for a static function pointer that wasn't in an array, but AT&T syntax was designed long before x86-64 existed, and 64-bit mode would still have that ambiguity. (In all other cases, GAS will warn if you leave out the *, and infer that you meant an indirect jump/call if you write something like call (%rax) or call %rax.)
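As a rough illustration of what the memory-indirect call does, here is a minimal C sketch. The table `handlers`, the helper `call_entry`, and the two small functions are hypothetical names made up for this example, and they return `int` only so the effect is observable; the real `init_functions` entries may have a different signature.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*handler_fn)(void);

static int return_ten(void)    { return 10; }
static int return_twenty(void) { return 20; }

/* Hypothetical NULL-terminated table standing in for init_functions. */
handler_fn handlers[] = { return_ten, return_twenty, NULL };

/* Rough C equivalent of call *(%rax,%rcx,8): base plays RAX, index plays RCX. */
int call_entry(const handler_fn *base, size_t index)
{
    /* Effective address: base + index * 8 (sizeof(handler_fn) is 8 on x86-64). */
    const unsigned char *addr =
        (const unsigned char *)base + index * sizeof(handler_fn);

    handler_fn target;
    memcpy(&target, addr, sizeof target);  /* the 8-byte load the call performs */

    assert(target != NULL);                /* the real loop tests for NULL first */
    return target();                       /* push return address, jump to target */
}
```

Calling `call_entry(handlers, 1)` reads the second pointer out of the table and calls through it, which is all that `base[index]()` means in C.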


RAX and RCX are call-clobbered so it's normal they don't keep their value; notice how their values before call come from an LEA and a load from a local on the stack. (What registers are preserved through a linux x86-64 function call)

If you want to see what functions are called, use GDB stepi (aka si) to single-step into the call. (Put a breakpoint somewhere in this function so you can single step from there.)


If you want to understand the loop, look at the code surrounding the call.

A RIP-relative LEA puts a constant address into RAX; as fjs points out, there's a symbol name init_functions.
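The address in the disassembler's comment can be checked by hand: a RIP-relative operand is the displacement plus the address of the next instruction. The sketch below does that arithmetic for the two LEAs in the question's listing; `rip_relative_target` and `check_listing` are hypothetical helper names, and the addresses are copied from the listing.

```c
#include <assert.h>
#include <stdint.h>

/* RIP-relative target = address of the NEXT instruction + signed 32-bit displacement. */
uint64_t rip_relative_target(uint64_t next_insn_addr, int32_t disp)
{
    return next_insn_addr + (uint64_t)(int64_t)disp;
}

/* Both LEAs in the listing should resolve to the same table address. */
void check_listing(void)
{
    /* lea 0x7406(%rip),%rax at <+19>; the next instruction is at <+26> = ...6e9a */
    assert(rip_relative_target(0x555555556e9aULL, 0x7406) == 0x55555555e2a0ULL);
    /* lea 0x73f0(%rip),%rax at <+41>; the next instruction is at <+48> = ...6eb0 */
    assert(rip_relative_target(0x555555556eb0ULL, 0x73f0) == 0x55555555e2a0ULL);
}
```

The displacements differ (0x7406 vs. 0x73f0) only because the two instructions sit at different addresses; the target is the same `init_functions` symbol both times.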

RCX is loaded from a local var on the stack, with sign-extension from 32-bit to 64. Looking at the surrounding code, this is clearly a loop counter, initialized to zero earlier in the function. Presumably an int.

Before the call, the same array indexing is done to check if it's a NULL pointer. This is clearly debug-mode compiler output, where each C statement is compiled to a separate block of asm. That means you only have to look locally to see what a block is doing in isolation, but it leads to much more code than would be necessary, e.g. two accesses to the array and redoing sign-extension of the loop counter each time.
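To make the block-per-statement structure concrete, here is a sketch that mirrors the question's listing in C with labels and gotos, annotated with the matching offsets. The names `demo_table`, `calls_made`, and `run_like_debug_build` are hypothetical, and the function returns the final counter only so it can be checked; the real function appears to return nothing.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*init_fn)(void);

int calls_made = 0;                       /* hypothetical counter for the demo */
static void init_a(void) { calls_made++; }
static void init_b(void) { calls_made++; }

/* Hypothetical NULL-terminated table standing in for init_functions. */
init_fn demo_table[] = { init_a, init_b, NULL };

int run_like_debug_build(init_fn *table)
{
    assert(table != NULL);
    int i = 0;                            /* <+8>  movl $0x0,-0x4(%rbp)          */
check:
    if (table[(long)i] == NULL)           /* <+15>..<+31> movslq, lea, cmpq, je  */
        goto done;
    table[(long)i]();                     /* <+37>..<+48> movslq, lea, call *    */
    i = i + 1;                            /* <+51>..<+57> mov, add, mov          */
    goto check;                           /* <+60> jmp back to <+15>             */
done:
    return i;                             /* <+65> epilogue (return value is an addition) */
}
```

The `(long)i` casts correspond to the `movslq` sign-extension, which the debug build repeats for each array access because `i` is reloaded from the stack every time.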

Some things like this are simple enough that the whole loop is easy to follow in an optimized build. Well, easy enough: GCC rotates the loop so the condition is at the bottom, partially peeling the first iteration. It also checks the first condition before saving RBX (shrink-wrap optimization), and uses RBX to hold a pointer into the array instead of a base pointer plus a separate integer index.

extern void (*init_functions[])();

void init(){
    for(int i=0 ; init_functions[i]  ; i++) 
        init_functions[i]();
}

Godbolt

init():
        movq    init_functions(%rip), %rax   # partially peeled first iteration
        testq   %rax, %rax
        je      .L9
        pushq   %rbx
        leaq    8+init_functions(%rip), %rbx      # fptr = &init_functions[1]
          # enter the loop with RAX holding first array entry
.L3:                         # do{
        call    *%rax
        movq    (%rbx), %rax    # load the next 
        addq    $8, %rbx        # fptr++
        testq   %rax, %rax      # and test it
        jne     .L3          # }while( *fptr != 0 )
        popq    %rbx
        ret
.L9:
        ret         # silly compiler, no need for tail duplication here.
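For comparison, the shape of that optimized loop can be written back in C roughly as follows. This is only a sketch of the control flow, not something the compiler produced: `opt_table`, `opt_calls`, and `run_like_optimized_build` are hypothetical names, and the returned count is an addition so the behavior can be checked.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*init_fn)(void);

int opt_calls = 0;                        /* hypothetical counter for the demo */
static void step_one(void) { opt_calls++; }
static void step_two(void) { opt_calls++; }
static void step_three(void) { opt_calls++; }

/* Hypothetical NULL-terminated table standing in for init_functions. */
init_fn opt_table[] = { step_one, step_two, step_three, NULL };

int run_like_optimized_build(init_fn *table)
{
    assert(table != NULL);
    init_fn fn = table[0];                /* movq init_functions(%rip), %rax      */
    if (fn == NULL)                       /* testq / je .L9                       */
        return 0;
    init_fn *next = &table[1];            /* leaq 8+init_functions(%rip), %rbx    */
    int count = 0;
    do {
        fn();                             /* call *%rax                           */
        count++;                          /* not in the asm; for checking only    */
        fn = *next;                       /* movq (%rbx), %rax                    */
        next++;                           /* addq $8, %rbx                        */
    } while (fn != NULL);                 /* testq / jne .L3                      */
    return count;
}
```

There is no integer index at all in this version; the pointer `next` plays the role RBX has in the asm.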
Peter Cordes

The comment after this instruction tells you the value of rax: <+41>: lea 0x73f0(%rip),%rax # 0x55555555e2a0 <init_functions>

Regarding %rcx, look at your disassembly code everywhere you either load from or store to -0x4(%rbp). It's a local variable used as the loop counter, initialized to 0 and incremented by 1 every iteration. The code could well look like this: for(int idx=0; init_functions[idx]!=NULL; ++idx) init_functions[idx](); The function pointers are each 8 bytes long, so that's why the call loads its target from address %rax + 0x8 * %rcx.

(Thanks to Peter Cordes' suggestion on the meaning of %rcx.)

fjs
  • Close, but RCX isn't loaded from a function arg. It's *below* the frame pointer (RBP) in this debug-mode compiler output; it's a local var, a loop counter initialized with `movl $0x0,-0x4(%rbp)` and updated inside the loop. Also used as the loop-exit condition, when it finds a NULL pointer. (The code would be easier to follow in an optimized build, but it's `for(int i=0 ; init_functions[i] != NULL ; i++) init_functions[i]();` done with two accesses to the array and redoing sign-extension of `i` each time. https://godbolt.org/z/hb78MYhqK) – Peter Cordes Dec 11 '22 at 17:20
  • Feel free to edit your answer to be correct instead of just saying to see comments, that's why I commented. Note that the first six function args are passed in registers, so no, function args aren't usually passed on the stack. The LEA part is right. – Peter Cordes Dec 11 '22 at 17:29

The call target is the 8-byte function pointer stored in memory at %rax+%rcx*8, whatever that holds in your circumstance. Note that %rax is set by a %rip-relative LEA (the address of init_functions) and %rcx is loaded from a local variable on the stack.

SoronelHaetir