
The MIPS jump register (JR) instruction is frequently found in the binary of C++ code. As a result, what feature in C++ makes use of JR instructions and why does it use these instructions?

infinite_l88p
    That's not really how languages work. You don't have some kind of simplistic mapping of Feature X causes instruction Y to be used. The language defines what should happen, and the compiler is written to generate machine code in accord with those rules. Why any particular instruction gets used at a particular time is usually a complex matter of what is the most efficient way to implement the various code being compiled. And this stuff gets incredibly complex. – Nicol Bolas Oct 20 '19 at 22:54

1 Answer


Branch instructions can only be used when the target address is known at compile time and lies within a limited range of the current instruction. You can't (easily) use them to branch to an address that isn't known statically and must be calculated or loaded at run time, or to jump to a target that is too far away.

So here are some examples where JR or JALR must be used (both work the same way, except that JALR also saves the return address in a register, $ra by default, so the callee can return later):

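  • Jumping to arbitrary addresses: Instructions that encode the target as an immediate can't reach an arbitrary 32-bit or 64-bit address, because the immediate is only 16 bits long (conditional branches, PC-relative) or 26 bits long (J/JAL, confined to the current 256 MB region). You need to load the full address into a register and jump with JR/JALR.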

  • Function pointers: The function being called is only known at run time, so you need some way to call it dynamically.

    int Add(int a, int b);
    int Sub(int a, int b);
    int Mul(int a, int b);
    int Div(int a, int b);
    int (*p[4]) (int x, int y) = { Add, Sub, Mul, Div };
    
    int test_function_pointer(int i, int x, int y) {
        return p[i](x, y);
    }
    

    Functions in shared libraries (*.dll, *.so...) are also unknown to the process before the libraries are loaded, so if you load those libraries manually (with LoadLibrary(), dlopen()...) you'll also get the function addresses as function pointers and call them with JR/JALR. Typically a function is called with JALR, but if the call is the last thing a function does and tail-call optimization is enabled, JR is used instead.

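    A vtable in many OOP languages like C++ is also an example of function pointers: a virtual call typically loads the function's address from the object's vtable and jumps through a register.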

    struct A {
        virtual int getValue() = 0;
    };
    
    int test_vtable(A *a) {
        return a->getValue() + 1;
    }
    

    Demo on Godbolt's Compiler Explorer

  • Jump table (like in a big switch block)

    typedef int (*func)(int);
    
    int doSomething(func f, int x, int y)
    {
        switch(x)
        {
            case 0:
                return f(x + y);
            case 1:
                return f(x + 2*y);
            case 2:
                return f(2*x + y);
            case 3:
                return f(x - y);
            case 4:
                return f(3*x + y);
            case 5:
                return f(x * y);
            case 6:
                return f(x);
            case 7:
                return f(y);
            default:
                return 3;
        }
    }
    

    GCC compiles the above code to

    doSomething(int (*)(int), int, int):
            sltu    $2,$5,8
            beq     $2,$0,$L2 # x >= 8: default case
            move    $25,$4
    
            lui     $2,%hi($L4)
            addiu   $2,$2,%lo($L4)  # load address of $L4 to $2
            sll     $5,$5,2         # effective address = $L4 + x*4
            addu    $5,$2,$5
            lw      $2,0($5)
            nop
            j       $2
            nop
    
    $L4:
            .word   $L11
            .word   $L5
            .word   $L6
            .word   $L7
            .word   $L8
            .word   $L9
            .word   $L10
            .word   $L11
    $L11:
            jr      $25
            move    $4,$6
    
    $L9:
            sll     $4,$6,2
            jr      $25
            addu    $4,$4,$6
    # ... many more cases below
    

    You can see the full output on Compiler Explorer

    $L4 is a jump table containing the addresses of the places you can branch to, which are the case blocks in this snippet. The entry for x is loaded into $2, and jr has to be used to move the program counter to that address. The listing shows j $2 because the GNU assembler accepts j with a register operand as an alias for jr; the hardware J instruction itself only takes an immediate target. Once you're at the correct case, jr is used again to tail-call the function pointer f (held in $25).
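The same pattern the compiler generates for the switch can be written out by hand, which may make the role of the register jump easier to see. The following is only a rough sketch: the names `Handler`, `kTable`, `dispatch` and the three `case_*` functions are made up for illustration and do not come from the code above. It does a bounds check, loads a code address from a table, and transfers control through that address, which on MIPS has to be an indirect jump or call via a register.

```cpp
#include <cstddef>

// Hypothetical case handlers, standing in for the case labels
// ($L5, $L6, ...) that the compiler placed in its jump table.
static int case_add(int x, int y) { return x + y; }
static int case_sub(int x, int y) { return x - y; }
static int case_mul(int x, int y) { return x * y; }

using Handler = int (*)(int, int);

// Plays the role of the $L4 table: one code address per selector value.
static const Handler kTable[] = { case_add, case_sub, case_mul };
static const std::size_t kTableSize = sizeof(kTable) / sizeof(kTable[0]);

int dispatch(unsigned selector, int x, int y) {
    // Bounds check, comparable to the sltu/beq pair that routes
    // out-of-range values to the default case.
    if (selector >= kTableSize) {
        return 3;  // same default value as the switch example
    }
    // Load the target address from the table (the lw in the listing),
    // then call through it. The target is not known statically, so
    // the compiler cannot encode it as an immediate.
    Handler target = kTable[selector];
    return target(x, y);
}
```

With these definitions, `dispatch(0, 6, 3)` yields 9, `dispatch(2, 6, 3)` yields 18, and an out-of-range selector such as `dispatch(9, 6, 3)` falls through to the default value 3. Whether the final call becomes `jalr` or a tail-calling `jr` depends on the compiler and optimization level.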

See also Necessity of J vs. JAL (and JR vs. JALR) in MIPS assembly

phuclv