
What is the proper way to convert an __int64 value to an __m64 value for use with SSE?

Paul R
user541686

1 Answer


With gcc you can just use _mm_set_pi64x:

#include <mmintrin.h>

__int64 i = 0x123456LL; 
__m64 v = _mm_set_pi64x(i);

Note that not all compilers have _mm_set_pi64x defined in mmintrin.h. For gcc it's defined like this:

extern __inline __m64  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_set_pi64x (long long __i)
{
  return (__m64) __i;
}

which suggests that you could probably just use a cast if you prefer, e.g.

__int64 i = 0x123456LL; 
__m64 v = (__m64)i;

Failing that, if you're stuck with an overly picky compiler such as Visual C/C++, as a last resort you can just use a union and implement your own intrinsic:

#ifdef _MSC_VER // if Visual C/C++
__inline __m64 _mm_set_pi64x (const __int64 i) {
    union {
        __int64 i;
        __m64 v;
    } u;

    u.i = i;
    return u.v;
}
#endif

Note that strictly speaking this is UB in C++, since we are writing to one member of a union and reading from another (C99 and later do allow this kind of type punning), but it should work in this instance.
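If the union read is a concern, one alternative is to copy the bits with memcpy, which is well defined in both C and C++. The following is only a minimal sketch: the helper names `m64_from_i64` and `i64_from_m64` are hypothetical (they are not part of any header), and it assumes an x86 target where `mmintrin.h` is available and `__m64` is 8 bytes wide.

```c
#include <assert.h>
#include <mmintrin.h>
#include <string.h>

/* Hypothetical helper (not a real intrinsic): copy the 64 bits of an
   integer into an __m64 without a cast or a union read. */
static inline __m64 m64_from_i64(long long i)
{
    __m64 v;
    assert(sizeof v == sizeof i);  /* assumption: both are 8 bytes */
    memcpy(&v, &i, sizeof v);
    return v;
}

/* Hypothetical inverse, useful for checking the round trip. */
static inline long long i64_from_m64(__m64 v)
{
    long long i;
    assert(sizeof i == sizeof v);
    memcpy(&i, &v, sizeof i);
    return i;
}
```

Compilers typically optimise a fixed-size memcpy like this into a plain register move, so it should not cost more than the union version, though that is worth checking in the generated code for your compiler.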

Paul R
  • Where did you get this version of mmintrin.h from? What compiler are you using? For current versions of gcc (4.x) `_mm_set_pi64x` is defined in mmintrin.h. – Paul R Jan 30 '12 at 09:15
  • I'm using Visual Studio 2010... I got it from `C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\mmintrin.h`. Kinda confused... – user541686 Jan 30 '12 at 09:18
  • You should probably tag your question appropriately if you're using a non-standard compiler. See updated answer above for alternative suggestion. – Paul R Jan 30 '12 at 09:21
  • "Nonstandard" is in the eye of the beholder. It's pretty standard for Windows, which is also a pretty standard development environment... and it follows the C++ standard well enough for me (GCC isn't fantastic either). :) Anyway, using your second example, I get `error C2440: 'type cast' : cannot convert from '__int64' to '__m64'`. – user541686 Jan 30 '12 at 09:27
  • Visual C/C++ is just about the worst compiler for SSE work - stuff that just works with gcc, ICC and other standard compilers often doesn't work with Microsoft compilers - you end up coding to the "lowest common denominator". I suggest that if you're stuck with Windows then you should at least switch to Intel's ICC compiler, which is a lot better in every regard (including performance of generated code). – Paul R Jan 30 '12 at 09:42
  • If you have $1,899 to spare and buy me ICC, I'd gladly switch to it. ;) – user541686 Jan 30 '12 at 09:45
  • Well if you're developing a commercial product then $1,899 is a very small investment which will more than pay for itself. If this is just for a personal project or free software though then you can use the union implementation above. – Paul R Jan 30 '12 at 09:53
  • My experience with ICC has been mixed. Although it "generally" compiles faster code, I've seen numerous cases of it going brain-dead and getting beaten out by MSVC (by large margins). – Mysticial Jan 30 '12 at 09:56
  • @PaulR: It's indeed for a personal project. Btw, are you *sure* the union method works? I've tried doing something similar before and gotten myself into trouble with access violations and such... – user541686 Jan 30 '12 at 09:59
  • @Mehrdad: so long as you don't have any alignment issues then the union method should work - I've had to use this method for similar workarounds with Visual C in the past. – Paul R Jan 30 '12 at 10:07
  • @Mysticial: mostly I use gcc as a baseline when benchmarking SIMD code and ICC generally beats gcc, or at least matches it. I have to maintain compatibility with MSVC though so I do a little benchmarking from time to time, and MSVC-generated *SIMD* code is usually much slower - occasionally though MSVC will excel at some *scalar* code optimisations, I have to admit. – Paul R Jan 30 '12 at 10:09
  • @PaulR From my experience: Prior to VS2010, ICC consistently beats MSVC on nearly all SSE code I write. Starting from VS2010, I have to admit that MSVC beats ICC in more than half the cases I've done. A notorious example of ICC optimization fail is on my [answer here](http://stackoverflow.com/a/8391601/922184). MSVC (and GCC with the right options) gets peak performance. ICC fails to get even 70%. – Mysticial Jan 30 '12 at 10:15
  • @Mysticial: thanks - that's useful information - due to product cycles etc. I still don't use anything newer than VS 2008, but it sounds like it might be worth re-evaluating some of our SIMD benchmarks with VS 2010. I doubt they have fixed any of the other annoyances though (still no C99 support after > 10 years, unnecessary ABI restrictions, etc.) – Paul R Jan 30 '12 at 10:18