SSE intrinsics - _mm_and_ps odd behaviour

Question

The following piece of code:

__m128 a   = _mm_setr_ps( 1, 2, 3, 4 );
__m128 b   = _mm_set1_ps( 2 );
__m128 res = _mm_and_ps( a, b );
cout << a[0] << " " << a[1] << " " << a[2] << " " << a[3] << endl;
cout << b[0] << " " << b[1] << " " << b[2] << " " << b[3] << endl;
cout << res[0] << " " << res[1] << " " << res[2] << " " << res[3] << endl;
cout<<endl;
cout << ( 1 & 2 ) << " " << ( 2 & 2 ) << " " << ( 3 & 2 ) << " " << ( 4 & 2 ) << endl;

results in:

Shouldn't the result of the SSE operation be 0 2 2 0, because 2 = 010 and 4 = 100, so 2 & 4 = 0?
According to the documentation:

__m128 _mm_and_ps(__m128 a, __m128 b)

Computes the bitwise AND of the four SP FP values of a and b.

R0         R1         R2         R3
a0 & b0    a1 & b1    a2 & b2    a3 & b3

Why don't you show the intermediate values of `a` and `b`? – Jonathon Reinhart, Dec 27 '17 at 11:49

Martin Bonner supports Monica · Accepted Answer · 2017-12-28T09:42:46.880

The documentation I found says:

Computes the bitwise AND of the four **single-precision, floating-point** values of a and b.

(my emphasis)

2 and 4 will have the same mantissa (0, plus an implied leading 1 bit), and biased exponents of 128 and 129 respectively. The bitwise AND of those is a zero mantissa and a biased exponent of 128 (== 2.0), because 128 & 129 == 128.
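To see this on the raw bits, here is a minimal sketch that mimics what `_mm_and_ps` does to a single lane. It assumes `float` is a 32-bit IEEE 754 value; the helper names `float_bits`, `bits_to_float` and `and_as_bits` are made up for illustration and are not part of any SSE header.

```cpp
#include <cstdint>
#include <cstring>

// Copy the 32 bits of a float into an integer (no numeric conversion).
std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Copy 32 bits back into a float.
float bits_to_float(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Scalar model of one lane of _mm_and_ps: AND the bit patterns, not the values.
float and_as_bits(float x, float y) {
    return bits_to_float(float_bits(x) & float_bits(y));
}
```

With these helpers, `float_bits(2.0f)` is `0x40000000` and `float_bits(4.0f)` is `0x40800000`, so `and_as_bits(4.0f, 2.0f)` gives `2.0f` rather than `0`.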

Edit

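If you want to do a bitwise AND of non-negative integers, you can add an offset. If you use an offset of 8388608 (== 1<<23), then you can do bitwise operations on 0..8388607 as you would expect: every float from 8388608 up to 16777215 has the same exponent, and its 23 mantissa bits hold the added integer directly.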

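const float offset = 8388608;               // 1 << 23
__m128 mm_offset = _mm_set1_ps( offset );
__m128 a   = _mm_setr_ps( 1, 2, 3, 4 );
a = _mm_add_ps( mm_offset, a );             // a = 1+offset, 2+offset, 3+offset, 4+offset
__m128 b   = _mm_set1_ps( 2 + offset );
__m128 res = _mm_and_ps( a, b );            // exponents match, so only the integer bits differ
res = _mm_sub_ps( res, mm_offset );         // res = 0 2 2 0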
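The same trick can be wrapped in a function so that it is easy to check. This is only a sketch: the name `and_small_ints` is hypothetical, and it is only valid when every input lane is a whole number in 0..8388607.

```cpp
#include <xmmintrin.h>  // SSE

// AND four small non-negative integers stored as floats, lane by lane.
// Precondition (not checked): every lane is an integer in 0..8388607.
void and_small_ints(const float a[4], const float b[4], float out[4]) {
    const __m128 offset = _mm_set1_ps(8388608.0f);        // 2^23
    const __m128 va = _mm_add_ps(_mm_loadu_ps(a), offset);
    const __m128 vb = _mm_add_ps(_mm_loadu_ps(b), offset);
    const __m128 anded = _mm_and_ps(va, vb);               // sign and exponent bits are identical
    _mm_storeu_ps(out, _mm_sub_ps(anded, offset));         // drop the offset again
}
```

Calling it with `{1, 2, 3, 4}` and `{2, 2, 2, 2}` writes `0 2 2 0` to `out`, matching the integer expression in the question.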

Or you could use SSE1 `andps` on integer data if you want. ([There are minor performance implications](https://stackoverflow.com/questions/26942952/difference-between-the-avx-instructions-vxorpd-and-vpxor)). It's still an AND, it doesn't care about the meaning of the bits until you use an actual FP instruction. If you never do that, and just store the result, then you're fine. — Peter Cordes, Dec 28 '17 at 16:27
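A sketch of what that comment describes, under some assumptions: the data is kept as 32-bit integers throughout, and the SSE2 header is used for the integer loads, stores and the `_mm_castsi128_ps` / `_mm_castps_si128` reinterpretations (these casts generate no instructions). The function name `and_int32_lanes` is made up for illustration; with SSE2 available, `_mm_and_si128` would be the more usual choice.

```cpp
#include <emmintrin.h>  // SSE2 (also pulls in the SSE1 intrinsics)
#include <cstdint>

// AND four 32-bit integers using the floating-point AND instruction.
// No FP arithmetic ever touches the data, so the bits pass through unchanged.
void and_int32_lanes(const std::int32_t a[4], const std::int32_t b[4], std::int32_t out[4]) {
    const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    const __m128 r = _mm_and_ps(_mm_castsi128_ps(va), _mm_castsi128_ps(vb));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_castps_si128(r));
}
```

Here the inputs `{1, 2, 3, 4}` and `{2, 2, 2, 2}` produce `{0, 2, 2, 0}`, since the lanes are integer bit patterns from start to finish.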
