When I compile the code below (x86-64 GCC 11.2 with -O2):


double foo(unsigned int x) {
    return x/2.5;
}

double zoo(int x) {
    return x/2.5;
}

int main(){
    return 0;
}

I got the assembly like this:

foo(unsigned int):
        mov     edi, edi
        pxor    xmm0, xmm0
        cvtsi2sd        xmm0, rdi
        divsd   xmm0, QWORD PTR .LC0[rip]
        ret
zoo(int):
        pxor    xmm0, xmm0
        cvtsi2sd        xmm0, edi
        divsd   xmm0, QWORD PTR .LC0[rip]
        ret
main:
        xor     eax, eax
        ret
.LC0:
        .long   0
        .long   1074003968

Why does GCC insert mov edi, edi for the function doing floating-point division on an unsigned int but not for the signed one?

Code link: https://gcc.godbolt.org/z/f1hKGrW6T

Peter Cordes
konchy
  • Note: that's not the only difference, look at `cvtsi2sd`. – Weather Vane Sep 18 '21 at 06:54
  • Isn't the `mov edi, edi` just used to clear the upper 32 bits of `rdi`? That way, `rdi` is guaranteed to contain an unsigned 64-bit value when the `cvtsi2sd` gets executed. – Michael Sep 18 '21 at 07:20
  • It's zero-extending so it can use 64-bit signed -> FP conversion to implement u32 -> FP. It would be more efficient (mov-elimination) if it picked a destination register other than `edi`, but it's still better than doing i32->FP conversion and fixing that up. – Peter Cordes Sep 18 '21 at 07:36
  • That `/` isn't doing integer division; one of the operands is a `double` so both sides get implicitly converted to `double`. Hence `divsd` rather than multiply by 2 and integer divide by 5. – Peter Cordes Sep 18 '21 at 07:39
  • Added an answer on [Are there unsigned equivalents of the x87 FILD and SSE CVTSI2SD instructions?](https://stackoverflow.com/a/69233028) which covers exactly why it's doing this. That Q&A had a lot of focus on 32-bit methods until I answered it. – Peter Cordes Sep 18 '21 at 08:39

1 Answer

I am not well-versed in this field, but this interesting question sent me googling, and it turned into a fascinating rabbit hole. Let me share my findings.

  • The purpose of `mov edi, edi` is to zero the top 32 bits of the `rdi` register. (`edi` refers to the lower 32 bits of `rdi`.)
  • Why it works: on x86-64, an instruction with 32-bit operands generates a 32-bit result, which is zero-extended to a 64-bit result in the destination general-purpose register. (http://x86asm.net/articles/x86-64-tour-of-intel-manuals/)
  • Why this behaviour was introduced: to avoid partial-register stalls.

A partial-register stall occurs when we write to part of a register (for example, `al` within `eax`) and later read the whole register or a larger part of it.
These stalls carry a performance penalty (Why doesn't GCC use partial registers?)

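  • Why is this relevant to our question? I do not 100% understand the details, but here is what I found, combined with what the comments under the question explain:

An n-bit bitstring can be interpreted as either an unsigned or a signed integer. (https://cs.stackexchange.com/questions/81320/signed-and-unsigned-loads-in-a-32-bit-registers) `cvtsi2sd` treats its source as signed, so feeding it `edi` directly would turn any `unsigned int` with the top bit set into a negative `double`. Zero-extending `edi` into `rdi` and using the 64-bit form (`cvtsi2sd xmm0, rdi`) avoids that, because every 32-bit unsigned value is a non-negative 64-bit signed value. `zoo` takes an `int`, so the 32-bit form (`cvtsi2sd xmm0, edi`) is already correct and no `mov` is needed.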
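
To make the signed/unsigned point concrete, here is a minimal C sketch of the two conversion strategies. The function names (`u32_to_double_via_i64`, `u32_to_double_via_i32`, `foo_by_hand`, `check_conversions`) are hypothetical names made up for this illustration; they only mimic what the two `cvtsi2sd` forms do and are not what GCC literally emits. It also assumes the usual GCC behaviour where converting an out-of-range `uint32_t` to `int32_t` wraps (this is implementation-defined in C).

```c
#include <assert.h>
#include <stdint.h>

/* Mimics foo(): zero-extend to 64 bits (the job of `mov edi, edi`),
   then do a 64-bit signed -> double conversion (`cvtsi2sd xmm0, rdi`). */
static double u32_to_double_via_i64(uint32_t x) {
    int64_t wide = (int64_t)(uint64_t)x; /* always non-negative */
    return (double)wide;
}

/* Mimics what would happen if the 32-bit signed conversion
   (`cvtsi2sd xmm0, edi`) were used on an unsigned value.
   Assumption: the uint32_t -> int32_t cast wraps, as it does on GCC. */
static double u32_to_double_via_i32(uint32_t x) {
    return (double)(int32_t)x;
}

/* Hypothetical hand-written equivalent of foo() from the question. */
static double foo_by_hand(uint32_t x) {
    return u32_to_double_via_i64(x) / 2.5;
}

/* Self-check: both strategies agree below 2^31 and diverge from there on. */
static void check_conversions(void) {
    assert(u32_to_double_via_i64(2147483647u) == 2147483647.0);
    assert(u32_to_double_via_i32(2147483647u) == 2147483647.0);

    assert(u32_to_double_via_i64(2147483648u) == 2147483648.0);
    assert(u32_to_double_via_i32(2147483648u) == -2147483648.0); /* wrong sign */

    assert(u32_to_double_via_i64(4294967295u) == 4294967295.0);
    assert(u32_to_double_via_i32(4294967295u) == -1.0);          /* wrong sign */

    assert(foo_by_hand(5u) == (double)5u / 2.5);
}
```

Calling `check_conversions()` from a `main` should pass silently on GCC: the 64-bit path matches a plain `(double)x` for every `unsigned int`, while the 32-bit path goes negative once the top bit is set.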

I plan to read more into these and update the answer once I gain a better understanding.

ecm
Niket