
I have several questions about memory and registers in x86 assembly:

  1. I have a string "abcdefgh", and register %eax holds a pointer to the string. Now I use movl (%eax), %edx to grab the first four bytes of the string into %edx. How are they stored in the register? Is the character d in the %dl register, or is it the character a?

  2. When using movb %eax, %dl for example, which of %eax's bytes does it actually move? The one in %al or the opposite one? Is it even possible to do this? Or should I use a pointer like this - movb (%eax), %dh - to take the first byte the pointer points to?

rkhb
Sunspawn
  • If you want to copy the low byte of `%eax`, use `mov %al, %dl`. To move the 2nd byte, use `mov %ah, %dl`. To get one of the other two bytes, copy/shift/mask, or use BMI1 `bextr`. To extend, use `movsx` / `movzx` to sign/zero extend, like `movsx %al, %edx`. Or `movsxbl (%eax), %edx` to sign-extend a byte from memory, pointed to by `%eax`. http://stackoverflow.com/questions/34058101/referencing-the-contents-of-a-memory-location-x86-addressing-modes is related to this, but uses Intel syntax. – Peter Cordes Dec 18 '15 at 00:41
  • Wait, so assuming EAX points to string `"abcd"`, does using `movb (%eax), %cl` move `a` or `d` to `%cl`? – Sunspawn Dec 18 '15 at 09:43
  • m0skit0's answer is correct: a byte load from a pointer to the start of the string will load the first byte: `'a'`. More interesting is the fact that a 32b load from the same address into `%ecx` will still load `'a'` into `%cl`, opposite of what you'd get on a big-endian machine. My comment was in response to the register-register move part of your question, where you copy a byte of `%eax`, rather than dereferencing it. – Peter Cordes Dec 18 '15 at 10:21
  • One final question: can I use `cmpb %bl, (%eax)` to compare the byte in `%bl` and the first byte that `%eax` points to? – Sunspawn Dec 18 '15 at 14:19
  • Yup, of course. `cmp` can have a memory operand as either the first or 2nd operand. Also note that you only need the size suffix in cases where it's ambiguous (e.g. immediate and memory operand, like `cmpb $0xab, (%eax)` vs. `cmpl $0xab, (%eax)`). – Peter Cordes Dec 19 '15 at 08:59

1 Answer


Assuming you're using AT&T syntax, as GAS does by default (source operand first, destination second), rather than Intel syntax:

How are they stored in the register? Is the character d in the %dl register, or is it the character a?

Since you're accessing the string as if it were a 32-bit number, endianness applies. x86 is little-endian, so the byte at the lowest address becomes the least-significant byte of the register: DL will hold 'a' (0x61), and the whole of EDX will be 0x64636261.
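To make the byte order concrete, here is a small Python sketch that models the 32-bit load rather than running real assembly. It assumes the string is plain ASCII; the names `data`, `edx`, `dl` and `dh` are only illustrative stand-ins for the memory and registers in the question.

```python
import struct

data = b"abcdefgh"  # the string from the question, as bytes in memory

# Model `movl (%eax), %edx`: a 32-bit little-endian load of the first four bytes.
(edx,) = struct.unpack_from("<I", data, 0)

dl = edx & 0xFF         # lowest byte of EDX, i.e. what %dl would hold
dh = (edx >> 8) & 0xFF  # second-lowest byte, i.e. what %dh would hold

# For contrast: the same four bytes read as a big-endian 32-bit number.
(big,) = struct.unpack_from(">I", data, 0)

print(hex(edx))          # 0x64636261
print(chr(dl), chr(dh))  # a b
print(hex(big))          # 0x61626364
```

The first character ends up in the lowest byte of the register, which is why DL holds 'a' and not 'd'.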

When using movb %eax, %dl for example, which of %eax's bytes does it actually move? The one in %al or the opposite one? Is it even possible to do this?

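That would give an assembler error because the operands are of different sizes: you can't move a 32-bit register into an 8-bit one. To copy the low byte of EAX, name the byte register explicitly, as in movb %al, %dl.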
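As a rough illustration of how the byte registers relate to the full 32-bit ones, the following Python sketch models AL and AH as the two lowest bytes of EAX, and a write to DL as replacing only the lowest byte of EDX. The helper names (`al`, `ah`, `write_dl`) and the sample register values are made up for this example.

```python
def al(eax):
    """Lowest byte of EAX (bits 0-7)."""
    return eax & 0xFF

def ah(eax):
    """Second-lowest byte of EAX (bits 8-15)."""
    return (eax >> 8) & 0xFF

def write_dl(edx, byte):
    """Model a byte write to %dl: only the low 8 bits of EDX change."""
    return (edx & 0xFFFFFF00) | (byte & 0xFF)

eax = 0x64636261  # "abcd" loaded little-endian
edx = 0x11223344  # arbitrary previous contents

print(hex(write_dl(edx, al(eax))))  # models movb %al, %dl -> 0x11223361
print(hex(write_dl(edx, ah(eax))))  # models movb %ah, %dl -> 0x11223362

# The two upper bytes have no register name of their own; shift and mask instead.
third = (eax >> 16) & 0xFF
print(chr(third))  # c
```

The upper three bytes of EDX are left untouched by the byte move, so any old contents stay there unless you clear or extend the register.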

Or should I use a pointer like this - movb (%eax), %dh - to take the first byte the pointer points to?

If you want to access the data pointed to by EAX rather than the value of EAX itself, then yes, that works: it loads the first byte at that address ('a' for your string) into DH.

m0skit0