
I'm just getting into learning assembly code and I have a couple of questions concerning this snippet. Assume %r8 is an array of chars, %r9 is an array of integers and %r10 is an array of long integers.

test:
 mov $128, %axh
 movzx $128, %bx
 movq   $64, 4(%r8)
 movd  $255, 4(%r9)
 movq  $255, 8(%r9)
 movq  $0xFFFFFFFF, 8(%r10)
 mov   %ah, (%r8)
 movq  $55, (%r8) 
ret

First of all, does the register %axh exist or is this a typo for %ah?

Secondly, they do a movq of 64 into index 4 of array %r8. I learned earlier that a char takes at most 8 bits, but a movq moves a 64-bit value (correct?). The same happens for index 8 of array %r9. This is confusing to me, because the number of bits stored exceeds what these data types need.

Finally, I wonder why ret is called since there is no 'return' register at play here. Or does an assembly process always return, even if it returns nothing?

Peter Cordes
Daniel
  • If there was no `ret` it would try to continue executing instructions following the last `movq`, which sounds like the wrong behavior – UnholySheep Dec 22 '21 at 19:46
  • The CPU doesn't know about the data type of the array. If you move a 64-bit integer with value 64 to a memory location, it will store this in little-endian order. If you then interpret this as an array of chars, you have effectively written 8 chars. The first has value 64, the remaining ones are set to zero. – Homer512 Dec 22 '21 at 19:55
  • There is no register named `axh`. Also, `movzx $128, %bx` is invalid. There is no `movzx` instruction in AT&T syntax and the one in Intel syntax does not take immediates. – fuz Dec 22 '21 at 19:55
  • The `ret` instruction pops the address to return to off the stack. The processor has no notion of a return value. Such a thing is merely a convention established by an ABI. – fuz Dec 22 '21 at 19:56
  • So if I store a 16-bit 255 (binary: 0000 0000 1111 1111) into an array of chars (8bit) at index 4, this would mean that the first 8 bits are stored at index 4 and the remaining 8 bits are stored at index 5? @Homer512 – Daniel Dec 22 '21 at 20:12
  • @Daniel Yes. Using larger register sizes to process multiple chars (or short ints) in one operation is a common technique to improve performance. – Homer512 Dec 22 '21 at 20:17
  • `movd` is also invalid, if this is supposed to be AT&T syntax. 32-bit operand-size uses an `l` suffix, like `movl`. Where did you get this invalid nonsense from? Possibly from the "global" (non-US) version of CS:APP 3rd edition? See [CS:APP example uses idivq with two operands?](https://stackoverflow.com/q/57998998) about the practice problems in that version of the book being full of garbage introduced by the publisher, not the authors. – Peter Cordes Dec 23 '21 at 04:31
  • Re: `ret`: if the CPU doesn't `pop` a return address into RIP, execution isn't going to go back to the caller. [What if there is no return statement in a CALLed block of code in assembly programs](https://stackoverflow.com/q/41205054) This is apparently supposed to be a function, or the end of a function. Although it doesn't follow any standard calling convention: `%rbx` is call-preserved in x86-64 System V and Windows x64, and R10 is a "static chain pointer" for languages that let you take function-pointers to nested functions. Otherwise it's not arg-passing. – Peter Cordes Dec 23 '21 at 04:34
  • See also other links in https://stackoverflow.com/tags/x86/info – Peter Cordes Dec 23 '21 at 04:35
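
The point made in the comments about wide stores into a char array can be sketched in C. This is only an illustration, not code from the question: `store_le` is a hypothetical helper that writes bytes in little-endian order, which is the order x86-64 uses for stores, and the 16-byte array and the `0xAA` filler are assumptions chosen so that untouched bytes stay visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write the low `nbytes` bytes of `value` to `dst`,
   least significant byte first (little-endian, as on x86-64). */
static void store_le(unsigned char *dst, uint64_t value, size_t nbytes)
{
    assert(nbytes <= 8);
    for (size_t i = 0; i < nbytes; i++) {
        dst[i] = (unsigned char)(value >> (8 * i));
    }
}

/* Models what `movq $64, 4(%r8)` does to memory: an 8-byte store
   of the value 64 starting at byte offset 4 of the char array. */
static void demo_qword_store(unsigned char chars[16])
{
    memset(chars, 0xAA, 16);      /* filler so untouched bytes stand out */
    store_le(chars + 4, 64, 8);   /* overwrites chars[4] .. chars[11] */
}

/* Models a 16-bit store of 255 at byte offset 4 (the follow-up question). */
static void demo_word_store(unsigned char chars[16])
{
    memset(chars, 0xAA, 16);
    store_le(chars + 4, 255, 2);  /* overwrites chars[4] and chars[5] */
}
```

After `demo_qword_store`, `chars[4]` holds 64 and `chars[5]` through `chars[11]` hold 0, so eight chars were overwritten by one store. After `demo_word_store`, `chars[4]` holds 0xFF (the low byte) and `chars[5]` holds 0 (the high byte), while `chars[6]` still has the filler value.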

0 Answers