
Is there a way to check if all bits/bytes/words etc. in a __m128i variable are 0?
In my app I have to check whether all integers packed in a __m128i variable are zero. Do I have to extract them and compare each one separately?


Edit:

What I am doing now is:

int next = 0;
do{
    //some code

    next = idata.m128i_i32[0] + idata.m128i_i32[1] + idata.m128i_i32[2] + idata.m128i_i32[3];
}while(next > 0);

What I need is to check if idata is all zeroes without having to access each individual element, and quit the loop if they are...


Based on Harold's comment this is the solution:

__m128i idata = _mm_setr_epi32(i,j,k,l);
do{
    //some code
}while( !_mm_testz_si128(idata, idata) );

This exits the loop once every bit of idata is 0 (testing the value against itself covers all 128 bits, not just the low bits of each DW). Thanks, harold!
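To illustrate why the choice of mask matters, here is a minimal sketch rather than a definitive implementation. `_mm_testz_si128(a, b)` returns 1 when `a & b` has no bit set, so a mask of `0xFFFF` per lane only inspects the low 16 bits of each DW, while testing the value against itself inspects everything. The helper names `testz`, `lowHalvesAreZero` and `allBitsAreZero` are hypothetical, and the `__SSE4_1__` check assumes a GCC/Clang-style compiler (MSVC does not define that macro, so it would take the SSE2 path).

```cpp
#include <emmintrin.h>  // SSE2
#if defined(__SSE4_1__)
#include <smmintrin.h>  // SSE4.1: _mm_testz_si128
#endif

// Hypothetical helper: true when (a & b) has no bit set,
// which is the condition PTEST reports through ZF.
inline bool testz(__m128i a, __m128i b) {
#if defined(__SSE4_1__)
    return _mm_testz_si128(a, b) != 0;
#else
    // SSE2 fallback: AND the operands, then check that every byte is zero.
    const __m128i masked = _mm_and_si128(a, b);
    return _mm_movemask_epi8(_mm_cmpeq_epi8(masked, _mm_setzero_si128())) == 0xFFFF;
#endif
}

// Inspects only the low 16 bits of each 32-bit lane.
inline bool lowHalvesAreZero(__m128i v) {
    return testz(v, _mm_set1_epi32(0xFFFF));
}

// Inspects all 128 bits by testing the value against itself.
inline bool allBitsAreZero(__m128i v) {
    return testz(v, v);
}
```

For a vector whose first lane is 0x10000 and whose other lanes are 0, `lowHalvesAreZero` reports true while `allBitsAreZero` reports false, which is why the self-test is the one to use as the loop condition.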

phuclv
Daniel Gruszczyk
  • Can't you use, say, `PCMPEQD` to compare without extraction? – Sergey Kalinichenko Apr 16 '12 at 14:14
  • Do XMM registers have a flag register attached to them? If yes, there must be a zero flag among these bits. – user703016 Apr 16 '12 at 14:15
  • See `PTEST` if SSE4.1 is available; otherwise it takes slightly more effort. – harold Apr 16 '12 at 14:20
  • You don't need to initialise a dummy argument for the second parameter of `PTEST`, i.e. instead of `_mm_testz_si128(idata, _mm_set1_epi32(0xFFFF))` you can just do `_mm_testz_si128(idata, idata)`. – Paul R Apr 16 '12 at 15:49
  • See also https://stackoverflow.com/q/27905677/ which has some interesting commentary and alternate (possibly faster) answers – Nemo Nov 03 '18 at 19:31
  • Possible duplicate of [Is an \_\_m128i variable zero?](https://stackoverflow.com/questions/7989897/is-an-m128i-variable-zero) – Antonio Apr 17 '19 at 00:55

2 Answers


_mm_testz_si128 requires SSE4.1, which some CPUs do not support (e.g. older Intel Atom, AMD Phenom).

Here is an SSE2-compatible variant:

inline bool isAllZeros(__m128i xmm) {
    return _mm_movemask_epi8(_mm_cmpeq_epi8(xmm, _mm_setzero_si128())) == 0xFFFF;
}
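As a rough sketch of how this check could drive the loop from the question, the example below repeats `isAllZeros` and adds a made-up loop body. The function `countdownIterations` and its decrement step are hypothetical stand-ins for "some code", and the sketch assumes all four inputs are non-negative (a negative lane would never reach zero). It uses a `while` loop instead of the question's `do`/`while` so that an all-zero input runs zero iterations.

```cpp
#include <emmintrin.h>  // SSE2 only

// SSE2 check from the answer: every byte must compare equal to zero.
inline bool isAllZeros(__m128i xmm) {
    return _mm_movemask_epi8(_mm_cmpeq_epi8(xmm, _mm_setzero_si128())) == 0xFFFF;
}

// Hypothetical loop body: decrement each positive lane by one per
// iteration and count iterations until all four lanes are zero.
// Assumes i, j, k, l are non-negative.
inline int countdownIterations(int i, int j, int k, int l) {
    __m128i idata = _mm_setr_epi32(i, j, k, l);
    const __m128i zero = _mm_setzero_si128();
    int iterations = 0;
    while (!isAllZeros(idata)) {
        // Lanes greater than zero compare to all-ones (-1);
        // adding that mask subtracts 1 from exactly those lanes.
        const __m128i positive = _mm_cmpgt_epi32(idata, zero);
        idata = _mm_add_epi32(idata, positive);
        ++iterations;
    }
    return iterations;
}
```

With these assumptions the loop runs as many times as the largest lane value, and no element is extracted from the vector at any point.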
Marat Dukhan

Like Paul R commented to my original post:

"You don't need to initialise a dummy argument for the second parameter of PTEST, i.e. instead of _mm_testz_si128(idata, _mm_set1_epi32(0xFFFF)) you can just test a value against itself."

ptest does the entire job with one instruction.

This helped.

Peter Cordes
Daniel Gruszczyk