
I am trying to understand why Visual Studio 2012 (x64) won't vectorize a loop that converts from short to float. Does anybody know the reason, or a way around it?

// A, B, C, D are unsigned short* __restrict
for (int j = 0; j < H*W; j++)
{
    float Gs = D[j]-B[j];
    float Gc = A[j]-C[j];
    in[j]=atan2f(Gs,Gc);
}

info C5002: loop not vectorized due to reason '1101'

RESOLUTION

Runtime using shorts and not vectorizing is about 800ms

Runtime converting to all ints and auto vectorizing is about 140ms (!!!)

Mikhail
  • One way is to use SSE4.1 to convert `short` -> `int`. Then use the `int` -> `float` conversion intrinsic. – Mysticial Mar 22 '13 at 04:21

1 Answer


From this page, it appears that reason 1101 means "Loop contains a non-vectorizable conversion operation (may be implicit)". Have you tried first converting to a type that is the same width as a float (such as int)?

For a more concrete reason, see here. Apparently, SSE has no direct way to convert a register holding a vector of shorts into a vector of floats; however, there is an instruction that converts 32-bit integers to floats.
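To make the two-step conversion concrete, here is a minimal sketch using intrinsics, not a drop-in fix. The helper name `u16_to_float` and its signature are assumptions made for illustration. It sticks to SSE2, zero-extending the unsigned shorts by interleaving them with zeros; on SSE4.1 hardware `_mm_cvtepu16_epi32` could do the widening instead. The 32-bit integers are then converted with `_mm_cvtepi32_ps`.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cassert>
#include <cstddef>

// Hypothetical helper (name and signature are assumptions, not from the question):
// converts n unsigned shorts to floats by widening to 32-bit ints first.
void u16_to_float(const unsigned short* src, float* dst, std::size_t n)
{
    assert(src != nullptr && dst != nullptr);

    const __m128i zero = _mm_setzero_si128();
    std::size_t i = 0;

    // Vector part: eight shorts per iteration.
    for (; i + 8 <= n; i += 8)
    {
        // Unaligned load of eight 16-bit values.
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));

        // Interleaving with zeros zero-extends each 16-bit value to 32 bits.
        __m128i lo = _mm_unpacklo_epi16(v, zero);  // elements 0..3
        __m128i hi = _mm_unpackhi_epi16(v, zero);  // elements 4..7

        // 32-bit int -> float conversion, four lanes at a time.
        _mm_storeu_ps(dst + i,     _mm_cvtepi32_ps(lo));
        _mm_storeu_ps(dst + i + 4, _mm_cvtepi32_ps(hi));
    }

    // Scalar tail for the remaining elements.
    for (; i < n; ++i)
        dst[i] = static_cast<float>(src[i]);
}
```

The zero-extension is only right for unsigned data; signed shorts would need sign extension instead.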

John Colanduoni
  • Using an intermediate such as `int foo = D[j]-B[j]` doesn't seem to help it along. I guess I may need to change all my shorts to floats. – Mikhail Mar 22 '13 at 04:17
  • You should cast to ints before the subtraction, not after. That way the subtraction itself can be written as a vector operation on ints, and then be vectorized. Also, you shouldn't need to change all your shorts to floats; just change them to ints. – John Colanduoni Mar 22 '13 at 04:19
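The "cast before the subtraction" suggestion might look roughly like the sketch below. The function name `compute_phase`, its signature, and the `float*` type of `in` are assumptions for illustration. Whether Visual Studio 2012 actually vectorizes this form is not confirmed here; the speedup reported in the question came from converting the data itself to ints.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical wrapper around the loop from the question
// (name, signature, and the float* output type are assumptions).
void compute_phase(const unsigned short* __restrict A,
                   const unsigned short* __restrict B,
                   const unsigned short* __restrict C,
                   const unsigned short* __restrict D,
                   float* __restrict in, int n)
{
    assert(n >= 0);

    for (int j = 0; j < n; j++)
    {
        // Widen each operand to int first, so the subtraction is an int operation.
        int gs = static_cast<int>(D[j]) - static_cast<int>(B[j]);
        int gc = static_cast<int>(A[j]) - static_cast<int>(C[j]);

        // The only remaining conversion is int -> float.
        in[j] = std::atan2(static_cast<float>(gs), static_cast<float>(gc));
    }
}
```

The results are the same as in the original loop, because the operands of `D[j]-B[j]` are already promoted to int before the subtraction; the casts only make the widening explicit.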