
I have this assembly:

movzx eax, r8w
add r8d, 0x4
movzx edx, r8w
cmp edx, 0x1f4
movdqu xmm3, xmmword ptr [r9+rax*4+0xfb0]   ; Why "+" instead of ADD asm??
movdqu xmm1, xmmword ptr [r11+rax*4]        ; ??
movdqu xmm4, xmmword ptr [r10+rax*4]        ; ??

Why are "+" and "*" operators shown instead of add and multiply instructions to calculate the addresses? Surely the point of assembly is to break the C++ down completely into individual x86 instructions?

user997112
  • Because this is a single instruction. The x86 has addressing modes where it can perform this calculation as part of executing one single instruction. It can calculate an address that adds a constant and two registers, and it can multiply one of the registers by 2, 4 or 8 as part of that single instruction. This is one reason it's considered a CISC chip and not a RISC chip: single instructions can do fairly complex things. – jcoder May 14 '14 at 16:03
  • @jcoder: Isn't that an *answer*? – T.J. Crowder May 14 '14 at 16:04
  • Also note that rax is not meant to be changed (which is what ADD would do). – icbytes May 14 '14 at 16:11
  • This addressing mode is like the LEA instruction. Have a look [here](http://stackoverflow.com/questions/1658294/whats-the-purpose-of-the-lea-instruction) – phuclv May 14 '14 at 16:38
  • Note that when multiplying by a power of 2, as above, you don't need to multiply; use a shift instead. – phuclv May 14 '14 at 16:39
  • @LưuVĩnhPhúc: You're 100% right. The question should be, why does 80x86 assembly use multiplication and not shift! Of course good assemblers will convert `[eax*5]` into `[eax<<2 + eax]` (the assembly doesn't necessarily have to match the machine code). – Brendan May 14 '14 at 17:44
  • @jcoder's comment should be the accepted answer... – Alex D Mar 16 '15 at 06:45

1 Answer

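The first part of your question (why this is not an ADD instruction) and the second part (the *) have the same answer. The + and * are not separate operations: they are part of the memory operand's addressing mode, in which the processor combines a base register, a scaled index register and a constant offset into an address while executing the one instruction. This gives indexed access to memory and is extremely useful for accessing arrays laid out in memory.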

Two important points. First, the scale factor cannot be an arbitrary value; it has to be one of 1, 2, 4 or 8. This limits direct array indexing to elements of up to 64 bits (qword) in size. Had 16 or 32 been allowed, direct indexing of an array of xmmword or ymmword elements would have been possible. The constant offset is also restricted to at most a 32-bit number, even in native 64-bit code, though that should rarely be a problem.
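To make the address calculation concrete, here is a minimal C++ sketch of what the addressing mode computes. The names `is_valid_scale`, `effective_address` and `element_via_ea` are made up for illustration and are not part of any real API; the model assumes a 64-bit flat address space and ignores segment registers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: the only scale factors the addressing mode can encode.
constexpr bool is_valid_scale(std::uint64_t scale) {
    return scale == 1 || scale == 2 || scale == 4 || scale == 8;
}

// Hypothetical model of [base + index*scale + disp].
// The 32-bit displacement is sign-extended to 64 bits before it is added.
constexpr std::uint64_t effective_address(std::uint64_t base, std::uint64_t index,
                                          std::uint64_t scale, std::int32_t disp) {
    return base + index * scale
         + static_cast<std::uint64_t>(static_cast<std::int64_t>(disp));
}

// Address of element i of a dword array, computed the way [base + i*4] would be.
inline const std::uint32_t* element_via_ea(const std::uint32_t* arr, std::size_t i) {
    const auto base = reinterpret_cast<std::uintptr_t>(arr);
    const auto ea = effective_address(base, i, sizeof(std::uint32_t), 0);
    return reinterpret_cast<const std::uint32_t*>(static_cast<std::uintptr_t>(ea));
}
```

With these definitions, `element_via_ea(a, 5)` yields the same pointer as `&a[5]` for a `std::uint32_t` array `a`, and the operand `[r9+rax*4+0xfb0]` from the question corresponds to `effective_address(r9, rax, 4, 0xfb0)`. A 16-byte element would need a scale of 16, which `is_valid_scale` rejects, so the index has to be pre-scaled by a separate instruction in that case.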

The second point is that this address arithmetic is performed as part of executing the memory access itself, so it is generally cheaper than doing the same arithmetic with separate instructions. An instruction sequence like

shl rbx,1
mov rax, qword ptr [rsi+rbx]

needs an extra instruction and overwrites rbx, so it is generally slower than

mov rax,qword ptr [rsi+2*rbx]
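As a rough illustration of the difference between the two sequences, the following C++ sketch models only the register state and the resulting address, not timing. The `Regs` struct and both function names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register file holding just the two registers used above.
struct Regs {
    std::uint64_t rsi;
    std::uint64_t rbx;
};

// Models: shl rbx,1  followed by  mov rax, qword ptr [rsi+rbx]
inline std::uint64_t address_with_explicit_shift(Regs& r) {
    r.rbx <<= 1;           // the shift overwrites rbx
    return r.rsi + r.rbx;  // address used by the load
}

// Models: mov rax, qword ptr [rsi+2*rbx]
inline std::uint64_t address_with_scaled_index(const Regs& r) {
    return r.rsi + 2 * r.rbx;  // scaling happens inside the address calculation
}
```

Starting from the same `rsi` and `rbx`, both functions return the same address, but only the first leaves `rbx` doubled afterwards. Code that still needs the original index would then have to save or restore it.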
Toby Speight
quasar66