10

I have been playing with Intel MPX and found that it adds certain instructions that I could not understand. For example (in Intel syntax):

movsxd rdx,edx

I found this, which talks about a similar instruction - MOVSX.

From that question, my interpretation of this instruction is that it takes a double byte (hence the d in movsxd), copies it into the rdx register (into the two least significant bytes), and fills the rest with the sign of that double byte.

Is my interpretation correct (I think I'm wrong)? If not, can you please tell me what is going on?

R4444
  • 4
    Yes. It's just a signed conversion from 32 to 64 bits. – Jester Jun 12 '19 at 15:18
  • 3
    BTW, the `d` is not "double byte" of course, it's double **word**. Furthermore I guess you know that `edx` is the low 32 bits of `rdx` so in this case no copying takes place, those bits stay where they are. Only the top 32 bits are filled with the sign. Also see the official intel instruction set reference documents, along with the well known [online conversion](https://www.felixcloutier.com/x86/movsx:movsxd). – Jester Jun 12 '19 at 15:23
  • Yes, it made a perfect sense to me after your first comment and I realized my mistake. Thanks so much for your answer and link. – R4444 Jun 12 '19 at 15:28
  • 1
    Always check the instruction set reference when you encounter an unknown instruction. – fuz Jun 12 '19 at 16:34
  • thanks @fuz, I'll keep that in mind – R4444 Jun 12 '19 at 16:41

1 Answer

17

Your code is 64-bit. If you look at the instruction set architecture (ISA) manual for MOVSXD, the 64-bit variant is defined as:

 MOVSXD r64, r/m32       Move doubleword to quadword with sign-extension.

This is the instruction in 64-bit code that takes a 32-bit register or a 32-bit memory operand and moves it, sign-extended, into a 64-bit register. Sign extension means taking the value of the topmost bit (the sign bit) of the source and using it to fill in all the upper bits of the destination.

movsxd rdx,edx looks at bit 31 (the topmost bit) of EDX, keeps the lower 32 bits as they are, and sets each of the upper 32 bits of RDX to that bit's value. If the sign bit is set in EDX, the upper 32 bits of RDX will all be 1. If the sign bit is clear, the upper 32 bits of RDX will all be 0.
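The bit-level behaviour described above can be modelled in a few lines of Python. This is only a sketch of the semantics, not how the CPU implements it; the helper name `movsxd` and the constant `MASK32` are made up for illustration, and registers are represented as plain unsigned integers.

```python
MASK32 = 0xFFFFFFFF  # hypothetical constant: selects the low 32 bits


def movsxd(src32: int) -> int:
    """Model MOVSXD r64, r/m32 on unsigned bit patterns (illustrative helper)."""
    low = src32 & MASK32            # the lower 32 bits are copied unchanged
    sign = (low >> 31) & 1          # bit 31 is the sign bit of the source
    upper = MASK32 if sign else 0   # replicate the sign bit into bits 63..32
    return (upper << 32) | low


if __name__ == "__main__":
    for edx in (0x80000000, 0x7FFFFFFF):
        print(f"EDX = {edx:#010x} -> RDX = {movsxd(edx):#018x}")
```

Calling `movsxd(0x80000000)` in this model yields the 64-bit pattern with all upper bits set, while `movsxd(0x7FFFFFFF)` leaves the upper bits clear.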

As an example, assume EDX has the value 0x80000000. Bit 31 is 1. As a signed 32-bit number that is -2147483648. If you do movsxd RDX, EDX the value in RDX will be 0xFFFFFFFF80000000. As a signed 64-bit value that still represents -2147483648.

If EDX had been 0x7fffffff (signed value +2147483647) with bit 31 being 0, the value in RDX would have been 0x000000007fffffff, which still represents the signed number +2147483647. As you can see, sign extension replicates the sign bit across the upper bits of the wider register so that the signed value is preserved.
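To check the two worked examples from the value side rather than the bit side, one can reinterpret the patterns with Python's standard `struct` module. This is a rough sketch; the helper names `as_signed32` and `as_unsigned64` are assumptions made for this illustration. It reads the 32-bit pattern as a signed integer, then shows the 64-bit two's-complement pattern for that same value.

```python
import struct


def as_signed32(bits: int) -> int:
    """Reinterpret an unsigned 32-bit pattern as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", bits))[0]


def as_unsigned64(value: int) -> int:
    """Return the 64-bit two's-complement pattern of a signed integer."""
    return struct.unpack("<Q", struct.pack("<q", value))[0]


if __name__ == "__main__":
    for edx in (0x80000000, 0x7FFFFFFF):
        value = as_signed32(edx)      # what EDX means as a signed number
        rdx = as_unsigned64(value)    # the same number stored in 64 bits
        print(f"{edx:#010x} = {value:+d} -> {rdx:#018x}")
```

The numeric value is the same before and after widening; only the number of bits used to store it changes.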

Michael Petch
  • 3
    Even more interesting is the existence of `movsxd r16, r/m16` aka `63 /r` without a `REX.W` (and the 32-bit version). A new encoding was needed, but who knows why they changed the instruction name; after all, by prefixing an operand-size override before `movsx eax, ax` one can create `movsx ax, ax`, so it's not like anything new. – Margaret Bloom Jun 12 '19 at 15:42
  • @MargaretBloom: I wonder if AMD created a separate mnemonic just for documentation purposes, so they could list it separately from 386+ `movsx`? Or as part of a list of new instructions / opcode changes for AMD64? But yeah, these days it's a pointless inconvenience that some assemblers don't accept `movsx rax, edx` or `movsx rax, dword [rdi]` and force you to write out the `movsxd` mnemonic for movsx with dword operand-size. Having a choice of 2 instead of 3 opcodes for different source sizes doesn't seem like a problem. The only excuse now is that it's shorter than `movsx` (1 byte vs 2) – Peter Cordes Jun 12 '19 at 20:41
  • @MargaretBloom: or maybe AMD wanted you to be able to write `movsxd rax, [rdi]` without an operand-size specifier. (Fun fact: YASM always requires `dword`: [yasm movsx, movsxd invalid size for operand 2](//stackoverflow.com/q/47350568) according to an answer I wrote a couple years ago.) – Peter Cordes Jun 12 '19 at 20:48
  • @PeterCordes Yeah, probably AMD has something to do with this :) NASM seems a little behind on this (at least my version which is a bit old), for example it doesn't support `movsxd eax, eax`, not even when disassembling. – Margaret Bloom Jun 13 '19 at 07:51