I am having trouble deciphering this given Assembly code:

1  unknown5:
2    pxor   %xmm0, %xmm0
3    movl   $0, %eax
4    jmp    .L13
5  .L14:
6    movss  (%rdx,%rax,4), %xmm1
7    mulss  (%rsi,%rax,4), %xmm1
8    addss  %xmm1, %xmm0
9    addq   $1, %rax
10 .L13:
11   cmpq   %rdi, %rax
12   jb     .L14
13   rep ret

What I think is going on:

  • 1: function named unknown5()
  • 2: sets the floating point return register to 0
  • 3: sets the 32-bit return register to 0
  • 4: jumps to 10:
  • 11: if %rdi > %rax, go to 5:, else go to 13:
  • 6: %xmm1 = %rdx+%rax+4
  • 7: %xmm1 *= %rsi+%rax+4
  • 8: %xmm0 += %xmm1
  • 9: %rax += 1
  • 13: return %eax, %rax, or %xmm0??

What I have no clue of is what the possible parameters are of the function, nor what each register is representing. I know %rdi, %rsi, and %rdx are first, second, and third parameters (that might be floating point pointers due to single precision operations?), and %xmm0, %xmm1 are first and second floating point parameters (OR local variables?). I also know %rax is the 64-bit return register, but am unsure why it is used here as such. Could someone please elaborate and correct my comprehension? Thank you.

Pharros

  • Assembly doesn't have "local variables", you're thinking of "registers". – tadman Apr 10 '21 at 02:34
  • I know, I should have clarified that I need to convert the above Assembly into C, so the registers would either be representing the function parameters or local variables depending – Pharros Apr 10 '21 at 02:39
  • With a little experience you will come to recognize this as a typical `while` loop. – Some programmer dude Apr 10 '21 at 02:50
  • Fun fact: this appears to have been compiled by `gcc -O1` (`mov $0, %eax` instead of `xor %eax,%eax`, and the jump into the loop for the first iteration instead of hoisting the first check), with some GCC version older than GCC8 or so (`rep ret` because its `-mtune=generic` default still cares about AMD K8/K10 (e.g. Phenom II and older)). Also actual `gcc`, not `g++`, because `g++ -O1 / -O0` tends to make dumb "while" loops with a conditional branch at the top ([Why are loops always compiled into "do...while" style (tail jump)?](https://stackoverflow.com/q/47783926)) – Peter Cordes Apr 10 '21 at 04:00

1 Answer

A few hints:

  • %xmm0, %xmm1 are used to pass floating-point parameters, if the function actually takes any. But are the values they had on entry to the function actually used in any way?

  • Don't think about %rax as inherently being "a return register". It's just a register. If the function is meant to return an integer or pointer then yes, %rax (or one of its sub-registers) is where that return value will go, but in the meantime it's just a register that the code can use as it likes, like a local variable as you say. Same goes for %xmm0.

  • Your guess about %rdx and %rsi being pointers is a good one, given how they are used in the movss/mulss memory operands. Where exactly are these instructions loading values from? On the other hand, %rdi does not appear in such a context; how is it used instead?

    Note movss (%rdx,%rax,4), %xmm1 doesn't set %xmm1 equal to the value gotten by multiplying %rax by 4 and adding %rdx. Rather, it uses %rdx+%rax*4 as an address, fetches a single-precision float from that address in memory, and loads it into %xmm1. That's what the ( ) signify - an indirect memory reference. If you like, it's the rough equivalent of the unary * dereference operator in C.

    Likewise, the value being multiplied in the following mulss is also fetched from memory.

  • There's no certain way to tell from the code alone whether it's intended to return an int (return value in %eax) or long (%rax) or float/double (%xmm0). But you can take a very good guess. Think about the values that each of those registers has when the function returns. Which seems more plausible as a value for the function to be computing for its caller?

    Notice there are no instructions in this function that write to memory, so it has no "side effects"; its sole purpose must be to compute and return some value.

  • This function appears to be performing a familiar mathematical operation. Can you tell what it is?
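
Once you have worked through the hints, one plausible C reconstruction is sketched below. It is only a guess that is consistent with the register usage: the parameter names `n`, `a`, and `b` and the local names `sum` and `i` are invented here, and the original source may well have been written differently (for example as a `while` loop, or with `unsigned long` instead of `size_t`). The unsigned type is suggested by `jb`, which is an unsigned comparison.

```c
#include <stddef.h>

/* Hypothetical reconstruction. Only the register roles come from the
   assembly; every identifier other than unknown5 is made up. */
float unknown5(size_t n, const float *a, const float *b)
{
    /* n in %rdi, a in %rsi, b in %rdx (first three integer/pointer args) */
    float sum = 0.0f;                 /* pxor %xmm0, %xmm0                 */
    for (size_t i = 0; i < n; i++) {  /* i in %rax; cmpq %rdi, %rax ; jb   */
        sum += b[i] * a[i];           /* movss (%rdx,%rax,4), %xmm1
                                         mulss (%rsi,%rax,4), %xmm1
                                         addss %xmm1, %xmm0                */
    }
    return sum;                       /* float return value in %xmm0       */
}
```

Called as `unknown5(3, a, b)` with `a = {1, 2, 3}` and `b = {4, 5, 6}`, this sums the element-wise products of the two arrays. Whatever value is left in `%rax` at the `ret` is just the final loop counter, which is a far less plausible return value than the accumulated sum in `%xmm0`.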

Nate Eldredge
  • 1) What I'm confused about is can't they also be used for local variables in the C code (e.g. float foo = 3.0;)? And depending on how Assembly utilizes the registers is how one determines the differences? 2) Ahh this clears up a lot of confusion with %rax, thanks! 3) So %rdi may be a passed-in integer, while the others may be float pointers? And am completely lost on where they may be loading their values from. 4) So I guess it would return %xmm0 since they are single precision floating points? 5) Not until I figure more out haha. – Pharros Apr 10 '21 at 02:59
  • Sure, a register is nothing more than a place to keep a bunch of bits, and from which the CPU can perform various operations on those bits. Whether you think of those bits as representing a function parameter, or the value of a local variable, or some intermediate value in a computation, or some other piece of useful/useless data, is entirely up to your own imagination and the calling conventions. – Nate Eldredge Apr 10 '21 at 03:02
  • @Pharros: Ah, I think you're confused about 6 and 7. `movss (%rdx,%rax,4), %xmm1` doesn't set `%xmm1` equal to the value gotten by multiplying `%rax` by 4 and adding `%rdx`. Rather, it uses `%rdx+%rax*4` as an *address*, fetches a single-precision float from that address in memory, and loads it into `%xmm1`. That's what the `( )` signify - an indirect memory reference. If you like, it's the rough equivalent of the unary `*` dereference operator in C. – Nate Eldredge Apr 10 '21 at 03:05
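
To make the addressing mode in that last comment concrete, here is a small C sketch of what a load such as `movss (%rdx,%rax,4), %xmm1` does: compute the byte address `base + index*4`, then fetch four bytes from it as a `float`. The helper name `load_scaled` is hypothetical, and the sketch assumes `sizeof(float) == 4`, as on x86-64.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mimicking a load with a (base,index,4) operand.
   Assumes float is 4 bytes wide. */
float load_scaled(const float *base, size_t index)
{
    /* Effective address: base + index * 4, computed in bytes */
    const unsigned char *addr = (const unsigned char *)base + index * 4;

    float value;
    memcpy(&value, addr, sizeof value);  /* fetch the float stored there */
    return value;                        /* same result as base[index]   */
}
```

In ordinary C you would simply write `base[index]` or `*(base + index)`; the compiler scales the index by the element size for you, which is exactly where the `4` in the assembly operand comes from.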