Understanding/Writing C code for Assembly with single precision floating point numbers

Question

I am having trouble deciphering this given Assembly code:

1 unknown5:
2    pxor   %xmm0, %xmm0
3    movl   $0, %eax
4    jmp    .L13
5  .L14:
6    movss  (%rdx,%rax,4), %xmm1
7    mulss  (%rsi,%rax,4), %xmm1
8    addss  %xmm1, %xmm0
9    addq   $1, %rax
10 .L13:
11   cmpq   %rdi, %rax
12   jb .L14
13   rep ret

What I think is going on:

1: function named unknown5()
2: sets the floating point return register to 0
3: sets the 32-bit return register to 0
4: jumps to 10:
11: if %rdi > %rax, go to 5:, else go to 13:
6: %xmm1 = %rdx+%rax+4
7: %xmm1 *= %rsi+%rax+4
8: %xmm0 += %xmm1
9: %rax += 1
13: return %eax, %rax, or %xmm0??

What I can't work out is what the function's parameters might be, or what each register represents. I know %rdi, %rsi, and %rdx hold the first, second, and third integer/pointer parameters (which might be float pointers, given the single-precision operations?), and that %xmm0 and %xmm1 hold the first and second floating-point parameters (or are they local variables here?). I also know %rax is the 64-bit return register, but I'm unsure why it is used the way it is here. Could someone please elaborate and correct my understanding? Thank you.

Assembly doesn't have "local variables", you're thinking of "registers". — tadman, Apr 10 '21 at 02:34
I know, I should have clarified that I need to convert the above Assembly into C, so the registers would either be representing the function parameters or local variables depending — Pharros, Apr 10 '21 at 02:39
With a little experience you will come to recognize this as a typical `while` loop. — Some programmer dude, Apr 10 '21 at 02:50
Fun fact: this appears to have been compiled by `gcc -O1` (`mov $0, %eax` instead of `xor %eax,%eax`, and the jump into the loop for the first iteration instead of hoisting the first check), with some GCC version older than GCC8 or so (`rep ret` because its `-mtune=generic` default still cares about AMD K8/K10 (e.g. Phenom II and older)). Also actual `gcc`, not `g++`, because `g++ -O1 / -O0` tends to make dumb "while" loops with a conditional branch at the top ([Why are loops always compiled into "do...while" style (tail jump)?](https://stackoverflow.com/q/47783926)) — Peter Cordes, Apr 10 '21 at 04:00

Nate Eldredge · Answer 1 · 2021-04-10T03:06:35.613

A few hints:

1. `%xmm0`, `%xmm1` are used to pass floating-point parameters, if the function actually takes any. But are the values they had on entry to the function actually used in any way?

2. Don't think about `%rax` as inherently being "a return register". It's just a register. If the function is meant to return an integer or pointer then yes, `%rax` (or one of its sub-registers) is where that return value will go, but in the meantime it's just a register that the code can use as it likes, like a local variable as you say. Same goes for `%xmm0`.

3. Your guess about `%rdx` and `%rsi` being pointers is a good one, given how they are used in the `movss`/`mulss` memory operands. Where exactly are these instructions loading values from? On the other hand, `%rdi` does not appear in such a context; how is it used instead?

   Note `movss (%rdx,%rax,4), %xmm1` doesn't set `%xmm1` equal to the value gotten by multiplying `%rax` by 4 and adding `%rdx`. Rather, it uses `%rdx+%rax*4` as an *address*, fetches a single-precision float from that address in memory, and loads it into `%xmm1`. That's what the `( )` signify - an indirect memory reference. If you like, it's the rough equivalent of the unary `*` dereference operator in C.

   Likewise, the value being multiplied in the following `mulss` is also fetched from memory.

4. There's no certain way to tell from the code alone whether it's intended to return an `int` (return value in `%eax`) or `long` (`%rax`) or `float`/`double` (`%xmm0`). But you can take a very good guess. Think about the values that each of those registers has when the function returns. Which seems more plausible as a value for the function to be computing for its caller?

   Notice there are no instructions in this function that write to memory, so it has no "side effects"; its sole purpose must be to compute and return some value.

5. This function appears to be performing a familiar mathematical operation. Can you tell what it is?
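
Putting the hints above together, one plausible C reconstruction is sketched below. It is a guess, not the original source: the parameter names `n`, `a` and `b` are made up (they stand in for `%rdi`, `%rsi` and `%rdx`), the `float` return type is inferred from `%xmm0` holding the only interesting value at `ret`, and `size_t` is assumed because `jb` is an unsigned comparison on 64-bit registers. The helper `load_indexed` is also hypothetical and only spells out the address arithmetic behind `(%rdx,%rax,4)`, assuming a 4-byte `float`.

```c
#include <stddef.h>

/* Hypothetical reconstruction: n, a, b stand in for %rdi, %rsi, %rdx. */
float unknown5(size_t n, const float *a, const float *b)
{
    float sum = 0.0f;                 /* pxor  %xmm0, %xmm0           */
    for (size_t i = 0; i < n; i++) {  /* movl $0, %eax / cmpq + jb    */
        float t = b[i];               /* movss (%rdx,%rax,4), %xmm1   */
        t *= a[i];                    /* mulss (%rsi,%rax,4), %xmm1   */
        sum += t;                     /* addss %xmm1, %xmm0           */
    }                                 /* addq  $1, %rax (the i++)     */
    return sum;                       /* result is left in %xmm0      */
}

/* Hypothetical helper: what the memory operand (base,index,4) computes.
 * Assumes sizeof(float) == 4, as the scale factor in the assembly implies. */
float load_indexed(const float *base, size_t index)
{
    /* address = base + index * 4 bytes */
    const char *addr = (const char *)base + index * sizeof(float);
    /* dereference that address; equivalent to base[index] */
    return *(const float *)addr;
}
```

Called as `unknown5(3, a, b)` with `a = {1, 2, 3}` and `b = {4, 5, 6}`, this sketch returns 32, which is the dot product of the two arrays.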

1) What I'm confused about is whether they can also be used for local variables in the C code (e.g. float foo = 3.0;)? And is the way the Assembly uses the registers how one tells the difference? 2) Ahh this clears up a lot of confusion with %rax, thanks! 3) So %rdi may be a passed-in integer, while the others may be float pointers? And I am completely lost on where they may be loading their values from. 4) So I guess it would return %xmm0 since they are single precision floating points? 5) Not until I figure more out haha. — Pharros, Apr 10 '21 at 02:59
Sure, a register is nothing more than a place to keep a bunch of bits, and from which the CPU can perform various operations on those bits. Whether you think of those bits as representing a function parameter, or the value of a local variable, or some intermediate value in a computation, or some other piece of useful/useless data, is entirely up to your own imagination and the calling conventions. — Nate Eldredge, Apr 10 '21 at 03:02
@Pharros: Ah, I think you're confused about 6 and 7. `movss (%rdx,%rax,4), %xmm1` doesn't set `%xmm1` equal to the value gotten by multiplying `%rax` by 4 and adding `%rdx`. Rather, it uses `%rdx+%rax*4` as an *address*, fetches a single-precision float from that address in memory, and loads it into `%xmm1`. That's what the `( )` signify - an indirect memory reference. If you like, it's the rough equivalent of the unary `*` dereference operator in C. — Nate Eldredge, Apr 10 '21 at 03:05
