What is the proper way to convert an `__int64` value to an `__m64` value for use with SSE?
Viewed 3,642 times
8

Paul R

user541686
-
For the googlers, can someone explain `__int64` vs `__m64`? :-) – Ciro Santilli OurBigBook.com May 31 '19 at 08:47
1 Answer
9
With gcc you can just use `_mm_set_pi64x`:
#include <mmintrin.h>
__int64 i = 0x123456LL;
__m64 v = _mm_set_pi64x(i);
Note that not all compilers have `_mm_set_pi64x` defined in `mmintrin.h`. For gcc it's defined like this:
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_set_pi64x (long long __i)
{
    return (__m64) __i;
}
which suggests that you could probably just use a cast if you prefer, e.g.
__int64 i = 0x123456LL;
__m64 v = (__m64)i;
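As a quick sanity check of the cast, here is a minimal sketch that converts in both directions and relies only on plain casts. It assumes GCC or Clang on x86, where `__m64` is an 8-byte vector type; the helper names `cast_to_m64` and `cast_from_m64` are hypothetical and `long long` stands in for `__int64`.

```c
#include <assert.h>     /* static_assert (C11) */
#include <mmintrin.h>   /* __m64 */

/* The cast is only a reinterpretation if both types have the same size. */
static_assert(sizeof(__m64) == sizeof(long long), "__m64 must be 64 bits wide");

/* Hypothetical helper: integer -> __m64 via the GNU vector-extension cast. */
static inline __m64 cast_to_m64(long long i)
{
    return (__m64)i;
}

/* Hypothetical helper: __m64 -> integer, again with a plain cast. */
static inline long long cast_from_m64(__m64 v)
{
    return (long long)v;
}
```

A round trip such as `cast_from_m64(cast_to_m64(x))` should give back `x` unchanged, since the cast reinterprets the 64 bits rather than converting the value.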
Failing that, if you're stuck with an overly picky compiler such as Visual C/C++, as a last resort you can just use a union and implement your own intrinsic:
#ifdef _MSC_VER // if Visual C/C++
__inline __m64 _mm_set_pi64x (const __int64 i) {
    union {
        __int64 i;
        __m64 v;
    } u;
    u.i = i;
    return u.v;
}
#endif
Note that strictly speaking this is UB in C++, since we are writing to one member of a union and reading from another (C99 and later define this kind of type punning), but it should work in this instance.
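If the formal status of the union is a concern, one alternative is to copy the bytes with `memcpy`, which is well defined in both C and C++. The following is a minimal sketch rather than a tested drop-in replacement: it assumes an x86 target where `<mmintrin.h>` is available and a C11 compiler, the helper names `int64_to_m64` and `m64_to_int64` are hypothetical, and `int64_t` stands in for `__int64`.

```c
#include <assert.h>     /* static_assert (C11) */
#include <stdint.h>     /* int64_t */
#include <string.h>     /* memcpy */
#include <mmintrin.h>   /* __m64 */

/* A byte-wise copy is only meaningful if both types are the same size. */
static_assert(sizeof(__m64) == sizeof(int64_t), "__m64 must be 64 bits wide");

/* Hypothetical helper: copy the 8 bytes of an integer into an __m64. */
static inline __m64 int64_to_m64(int64_t i)
{
    __m64 v;
    memcpy(&v, &i, sizeof v);   /* no union and no cast involved */
    return v;
}

/* Hypothetical helper: copy the 8 bytes of an __m64 back into an integer. */
static inline int64_t m64_to_int64(__m64 v)
{
    int64_t i;
    memcpy(&i, &v, sizeof i);
    return i;
}
```

Optimizing compilers typically reduce a fixed 8-byte `memcpy` like this to a single move, so it should cost no more than the union version.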

Paul R
-
Where did you get this version of mmintrin.h from? What compiler are you using? For current versions of gcc (4.x) `_mm_set_pi64x` is defined in mmintrin.h. – Paul R Jan 30 '12 at 09:15
-
I'm using Visual Studio 2010... I got it from `C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\mmintrin.h`. Kinda confused... – user541686 Jan 30 '12 at 09:18
-
You should probably tag your question appropriately if you're using a non-standard compiler. See updated answer above for alternative suggestion. – Paul R Jan 30 '12 at 09:21
-
"Nonstandard" is in the eye of the beholder. It's pretty standard for Windows, which is also a pretty standard development environment... and it follows the C++ standard well enough for me (GCC isn't fantastic either). :) Anyway, using your second example, I get `error C2440: 'type cast' : cannot convert from '__int64' to '__m64'`. – user541686 Jan 30 '12 at 09:27
-
Visual C/C++ is just about the worst compiler for SSE work - stuff that just works with gcc, ICC and other standard compilers often doesn't work with Microsoft compilers - you end up coding to the "lowest common denominator". I suggest that if you're stuck with Windows then you should at least switch to Intel's ICC compiler, which is a lot better in every regard (including performance of generated code). – Paul R Jan 30 '12 at 09:42
-
If you have $1,899 to spare and buy me ICC, I'd gladly switch to it. ;) – user541686 Jan 30 '12 at 09:45
-
Well if you're developing a commercial product then $1,899 is a very small investment which will more than pay for itself. If this is just for a personal project or free software though then you can use the union implementation above. – Paul R Jan 30 '12 at 09:53
-
My experience with ICC has been mixed. Although it "generally" compiles faster code, I've seen numerous cases of it going brain-dead and getting beaten out by MSVC (by large margins). – Mysticial Jan 30 '12 at 09:56
-
@PaulR: It's indeed for a personal project. Btw, are you *sure* the union method works? I've tried doing something similar before and gotten myself into trouble with access violations and such... – user541686 Jan 30 '12 at 09:59
-
@Mehrdad: so long as you don't have any alignment issues then the union method should work - I've had to use this method for similar workarounds with Visual C in the past. – Paul R Jan 30 '12 at 10:07
-
@Mysticial: mostly I use gcc as a baseline when benchmarking SIMD code and ICC generally beats gcc, or at least matches it. I have to maintain compatibility with MSVC though so I do a little benchmarking from time to time, and MSVC-generated *SIMD* code is usually much slower - occasionally though MSVC will excel at some *scalar* code optimisations, I have to admit. – Paul R Jan 30 '12 at 10:09
-
@PaulR From my experience: Prior to VS2010, ICC consistently beat MSVC on nearly all SSE code I wrote. Starting from VS2010, I have to admit that MSVC beats ICC in more than half the cases I've tried. A notorious example of ICC optimization fail is on my [answer here](http://stackoverflow.com/a/8391601/922184). MSVC (and GCC with the right options) gets peak performance. ICC fails to get even 70%. – Mysticial Jan 30 '12 at 10:15
-
@Mysticial: thanks - that's useful information - due to product cycles etc I still don't use anything newer than VS 2008, but it sounds like it might be worth re-evaluating some of our SIMD benchmarks with VS 2010. I doubt they have fixed any of the other annoyances though (still no C99 support after > 10 years, unnecessary ABI restrictions, etc) – Paul R Jan 30 '12 at 10:18