
I have some C++ code that is being compiled to the following assembly using MSVC compiler v14.24:

00007FF798252D4C  vmulsd      xmm1,xmm1,xmm7  
00007FF798252D50  vcvttsd2si  rcx,xmm1  
00007FF798252D55  vmulsd      xmm1,xmm7,mmword ptr [rbx+28h]  
00007FF798252D5A  mov         ecx,ecx  
00007FF798252D5C  imul        rdx,rcx,0BB8h  
00007FF798252D63  vcvttsd2si  rcx,xmm1  
00007FF798252D68  mov         ecx,ecx  
00007FF798252D6A  add         rdx,rcx  
00007FF798252D6D  add         rdx,rdx  
00007FF798252D70  cmp         byte ptr [r14+rdx*8+8],0  
00007FF798252D76  je          applyActionMovements+15Dh (07FF798252D8Dh)

As you can see, the compiler added two

mov         ecx,ecx

instructions that don't make any sense to me, because they move data from and to the same register.

Is there something that I'm missing?


Here is a small Godbolt reproducer: https://godbolt.org/z/UFo2qe

int arr[4000][3000];
inline int foo(double a, double b) {
    return arr[static_cast<unsigned int>(a * 100)][static_cast<unsigned int>(b * 100)];
}

int bar(double a, double b) {
    if (foo(a, b)) {
        return 0;
    }
    return 1;
}
Peter Cordes
Acuion
  • That's an inefficient way to zero-extend ECX into RCX. More efficient would be `mov` into a different register [so mov-elimination could work](https://stackoverflow.com/questions/44169342/can-x86s-mov-really-be-free-why-cant-i-reproduce-this-at-all). – Peter Cordes Dec 25 '19 at 21:56

1 Answer


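`mov ecx,ecx` is not a no-op: writing a 32-bit register implicitly zeroes the upper 32 bits of the full 64-bit register, so this instruction zero-extends ECX into RCX. It is an inefficient way to do that, though. A mov into a different register would be more efficient, because mov-elimination could then work.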

Duplicates of:

But your specific test-case needs zero-extension for a slightly non-obvious reason:

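Before AVX-512, x86 only has conversions between FP and signed integers (AVX-512 adds unsigned conversions such as `vcvttsd2usi`). FP -> unsigned int is still efficiently possible on x86-64 by doing FP -> int64_t and then taking the low 32 bits as the unsigned int, since every value that fits in an unsigned int also fits in int64_t.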

This is what this sequence is doing:

vcvttsd2si  rcx,xmm1    ; double -> int64_t, unsigned int result in ECX
mov         ecx,ecx     ; zero-extend to promote unsigned to ptrdiff_t for indexing
add         rdx,rcx     ; 64-bit integer math on the zero-extended result
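
The same three steps can be written out in C++ as a rough sketch. The function name `index_from_double` is hypothetical (it is not part of the original code), and the sketch assumes a 64-bit target where `std::size_t` is 64 bits wide. It only mirrors what the instructions do for in-range inputs; it is not necessarily how MSVC models the conversion internally.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: mirrors the instruction sequence above step by step.
// Assumes x is non-negative and fits in 32 bits after truncation.
std::size_t index_from_double(double x) {
    // vcvttsd2si rcx, xmm1: truncating double -> signed 64-bit conversion
    std::int64_t wide = static_cast<std::int64_t>(x);

    // The unsigned int result is the low 32 bits of that value (ECX)
    std::uint32_t low = static_cast<std::uint32_t>(wide);

    // mov ecx, ecx: zero-extend the 32-bit value to 64 bits so it can be
    // used in 64-bit address arithmetic
    return static_cast<std::size_t>(low);
}
```

For in-range inputs this gives the same value as `static_cast<unsigned int>(x)` widened to 64 bits, which is why the compiler needs the explicit zero-extension before using the result as an array index.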
Peter Cordes