32

What does the following line mean:

...
401147: ff 24 c5 80 26 40 00    jmpq   *0x402680(,%rax,8)
...

What does the asterisk in front of the memory address mean? Also, what does it mean when the memory access method is missing its first register value?

Usually it's something like ("%register", %rax, 8), but in this case it doesn't have the first register.

Any tips?

Peter Cordes
de1337ed

7 Answers

22

It's AT&T assembly syntax:

  • source comes before destination
  • mnemonic suffixes indicate the size of the operands (q for quad, etc.)
  • registers are prefixed with % and immediate values with $
  • effective addresses are in the form DISP(BASE, INDEX, SCALE) (DISP + BASE + INDEX * SCALE)
  • indirect jump/call operands are indicated with * (as opposed to direct)

So, you have a jmpq that jumps to the absolute address stored in memory at %rax * 8 + 0x402680; that stored address is a quad word (8 bytes) long.
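The address arithmetic in the list above can be sketched in C. This is only an illustration: `effective_address` is a hypothetical helper name, and an omitted base or index register is modeled by passing 0.

```c
#include <stdint.h>

/* Hypothetical helper: the AT&T operand DISP(BASE, INDEX, SCALE) as plain
   arithmetic. An omitted base or index register is modeled as 0. */
uint64_t effective_address(uint64_t disp, uint64_t base,
                           uint64_t index, uint64_t scale)
{
    return disp + base + index * scale;
}
```

For the operand in the question, 0x402680(,%rax,8) with %rax holding 3, `effective_address(0x402680, 0, 3, 8)` gives 0x402698. The `*` then tells the assembler that the jump target is loaded from that address rather than being that address.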


AT&T syntax needed a way to distinguish RIP = foo (jmp foo) from RIP = load from some symbol address (jmp *foo). Remember that movl $1, foo is a store to the absolute address foo.

With other addressing modes, there's no ambiguity about what kind of jump / call you're doing: anything other than a bare label must be indirect. (GAS will infer that and warn about an indirect jump without * if you do jmp %rax or jmp 24(%rax) or anything other than a bare symbol name.)

(In 64-bit mode you'd normally use jmp *foo(%rip) to load a global variable into RIP, rather than a 32-bit absolute address like jmp *foo. But the possibility exists, and before x86-64, when AT&T syntax was designed, it was the normal way to do things.)

Peter Cordes
Michael Foukarakis
  • but does the asterisk in front of it make a difference? I understand that what you said makes sense otherwise. I was thinking that first the asterisk will take the data out of memory location 0x402680. So basically, it will become %rax*8+mem[0x402680] – de1337ed Feb 10 '12 at 07:49
  • 2
    The asterisk only specifies that's an absolute jump. `jmp` will take the data out of the memory location specified regardless. – Michael Foukarakis Feb 10 '12 at 07:58
  • 1
    `*` actually means *indirect*, like the C dereference operator. Consider `jmp foo` (RIP=foo) vs. `jmp *foo` (RIP = contents of memory at foo). Of course normally you use RIP-relative addressing to access the code-pointer variable foo, like `jmp *foo(%rip)`, but the `foo` vs. `*foo` nicely shows why AT&T syntax needed to disambiguate with a `*`. – Peter Cordes May 20 '21 at 21:13
18

Actually, this is a computed jump through a jump table, where 0x402680 is the address of the table and rax is the index of an 8-byte (qword) pointer in it.
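A rough C analogue of such a table, as a sketch only: the handler functions and the `dispatch` name are made up for illustration, and a C function call stands in for the jump (a real `jmp` does not return).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handlers standing in for the code at each jump target. */
static int handle_zero(void) { return 100; }
static int handle_one(void)  { return 200; }
static int handle_two(void)  { return 300; }

/* The table plays the role of the 8-byte pointers stored at 0x402680. */
static int (*const jump_table[])(void) = { handle_zero, handle_one, handle_two };

#define TABLE_LEN (sizeof jump_table / sizeof jump_table[0])

/* Roughly what `jmpq *table(,%rax,8)` does: load a code pointer from
   table + index * sizeof(pointer), then transfer control to it. */
int dispatch(size_t index)
{
    assert(index < TABLE_LEN);  /* an out-of-range index would read past the table */
    return jump_table[index]();
}
```

Here `dispatch(1)` loads the second pointer from the table and calls `handle_one`, whatever the table length; that single indexed load is what makes the dispatch constant-time.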

GJ.
  • 3
    Jump tables are often used in compiled code when the C source contains if-else chains or switch statements. They allow control to be passed in constant time, instead of performing many individual equality checks. – Eagle Sep 21 '12 at 06:07
10

Getting things into Intel syntax always makes stuff clearer:

FF24C5 80264000  JMP QWORD PTR [RAX*8+402680]
Necrolis
  • whoever down voted, care to explain why? Just being anti-intel syntax is *not* a reason... – Necrolis Feb 10 '12 at 10:03
  • 4
    @MichaelFoukarakis: other than repeating the math verbatim in English, there is not much to really add when using intel syntax. – Necrolis Feb 10 '12 at 15:48
  • Doesn't answer the question. – Michael O Apr 20 '21 at 01:21
  • @MichaelO It's actually the most succinct and direct answer you can possibly give. If you look at the highest rated reply here, it's literally a bullet point list of Intel mnemonics read left to right... – Necrolis Apr 20 '21 at 03:19
5

jmpq is just an unconditional jump to a given address. The 'q' means that we're dealing with quad words (64 bits long).

*0x402680(,%rax,8) : This is a way to write an address in x86 assembly. You are correct in saying that usually there is a register before the first comma, but you still follow the same rules if no register is specified.

The format works this way : D(reg1, reg2, scalingFactor) where D stands for displacement. Displacement is basically just an integer. reg1 is the first or base register. reg2 is the second or index register, and scalingFactor is one of 1, 2, 4, or 8. Now, you can obtain your address by simply adding the values in this way: Displacement + (value in reg1) + scalingFactor*(value in reg2). A register that is left out simply contributes nothing to the sum.

I'm not completely sure as to what the asterisk in front of the address is for, but my guess is that it means that the displacement value is stored at that address.

Hope this helps.

vyb
  • `*` means it's an indirect jump, setting RIP=value from memory or register. e.g. `jmp foo` sets RIP=foo. `jmp *foo` sets RIP=result of a load from that memory address. (Although of course in 64-bit code, you normally address static storage with RIP-relative addressing, so `jmp *foo(%rip)` dereferences the global variable foo, assuming it holds a function pointer and this is a tail-call.) – Peter Cordes May 20 '21 at 21:10
5

It's a jump to an address contained in memory. The address is stored in memory at address rax*8+0x402680, where rax is the current rax value (when this instruction executes).

Alexey Frunze
4

As Necrolis wrote, Intel syntax makes it a bit more obvious, but RTN (register transfer notation) is really clearer. The line

jmpq   *0x402680(,%rax,8)

would be described in RTN by:

RIP <- M[0x402680 + (8 * RAX)]

where M is the system memory.

As such, we can write the general form jmpq *c(r1, r2, k), where c is a constant displacement, r1 and r2 are general purpose registers and k is either 1 (default), 2, 4 or 8:

RIP <- M[c + r1 + (k * r2)]
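The RTN form can be mimicked with a toy memory model in C. This is a sketch under loud assumptions: `M` is a small byte array rather than real system memory, addresses are offsets into it, there is no bounds checking, and `store_qword` and `indirect_jump_target` are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Toy memory: a flat byte array standing in for M. Addresses are offsets into it. */
static uint8_t M[64];

/* Store a 64-bit value at addr in the host's byte order. */
void store_qword(uint64_t addr, uint64_t value)
{
    memcpy(&M[addr], &value, sizeof value);
}

/* RIP <- M[c + r1 + (k * r2)]: compute the address, then load 8 bytes from it.
   The returned value is what the jump would place in RIP. */
uint64_t indirect_jump_target(uint64_t c, uint64_t r1, uint64_t r2, uint64_t k)
{
    uint64_t addr = c + r1 + k * r2;
    uint64_t target;
    memcpy(&target, &M[addr], sizeof target);
    return target;
}
```

After `store_qword(32, 0x401234)`, the call `indirect_jump_target(16, 0, 2, 8)` computes the address 16 + 0 + 8 * 2 = 32 and returns 0x401234. The point of the model is that the computed address is only where the target is stored, not the target itself.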
eepp
4

Minimal example

To make things clearer:

.data
    # Store the address of the label in the data section.
    symbol: .int label   
.text   
    # Jumps to label.
    jmp *symbol
    label:


Without the *, it would jump to the address of symbol in the .data section and segfault.

I feel this syntax is a bit inconsistent, because for most instructions:

mov symbol, %eax
mov label, %eax

already move the data at the address symbol, and $symbol is used for the address. Intel syntax is more consistent on this point, as it always uses [] for dereference.

The * is of course a mnemonic for the C dereference operator *ptr.
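The C analogy can be spelled out with a function pointer. This is only a sketch: the names `label` and `symbol` are borrowed from the assembly example above, `direct` and `indirect` are made up, and calls stand in for jumps.

```c
/* Hypothetical target standing in for the code at a label. */
static int label(void) { return 42; }

/* A variable holding a code address, like `symbol` in the data section. */
static int (*symbol)(void) = label;

/* Like `jmp label`: the destination is fixed in the instruction itself. */
int direct(void) { return label(); }

/* Like `jmp *symbol`: the destination is loaded from the variable at run time. */
int indirect(void) { return (*symbol)(); }
```

Both functions end up running the same code, but `indirect` goes through a load first, so reassigning `symbol` would change where it lands while `direct` would stay the same.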

Ciro Santilli OurBigBook.com
  • re: the "inconsistency": jumps "access" the code *at* the label. `jmp foo` causes the CPU to load the bytes at `foo` as code (by setting RIP). vs. `mov foo, %eax` loading the bytes at foo into a register. I'm not sure that a syntax design that used `jmp $foo` for direct relative jumps would be good, because normally the `$` decorator only applies to (absolute) immediates, not relative encodings. `.long foo` emits the address of the symbol. And we really *don't* want `jmp foo` to be the syntax for memory-indirect `RIP = load(foo)`; that's a terrible failure mode. – Peter Cordes Jun 12 '19 at 20:11
  • Plus, `jmp` treats its args different from other instructions anyway; it only supports an addressing mode for memory-indirect jumps, so there's no need to disambiguate immediate vs. anything else without `*`. – Peter Cordes Jun 12 '19 at 20:13
  • See also [Warning: indirect call without `*'](https://stackoverflow.com/a/67171563) - my answer there gives similar reasoning to my comment above about why AT&T requires a `*` to disambiguate indirect – Peter Cordes Apr 20 '21 at 02:46