
I am trying to understand the _mm256_testc_pd, _mm256_testz_pd, and _mm256_testnzc_pd intrinsics, and I have a hard time understanding them.

To analyze _mm256_testc_pd, I have identified the following cases (a is the first, b the second __m256d argument):

  • If all packed doubles in b are > 0, then ZF=1, CF=1, except:
  • If one packed double in each a and b are mutually < 0, then ZF=0.
  • If one packed double in each !a and b are mutually < 0, then CF=0.

In other words, a value of ZF=1 tells me that either a) b is entirely positive, or b) that for those doubles in b that are not positive, there is a matching double in a that is negative. A value of CF=1 tells me that either c) b is entirely positive, or d) that for those doubles in b that are not positive, there is a matching double in !a that is negative.

Have I understood this correctly? I am a bit confused by this. What's the point of this check? What would I use these intrinsics for?

Paul R
mSSM
  • The most common use case is testing the result of a compare operation, where the result elements are either all 1 or all 0 (so testing the sign bit is sufficient) - this enables you to implement predicates such as "all equal" or "any greater than" etc, using the `testz` intrinsic. (I've never found a use for any of the other variants). – Paul R Nov 21 '18 at 11:40
  • @PaulR: Might as well move that to an answer. – Jason R Nov 21 '18 at 13:01
  • @JasonR: yes, you're right - I got carried away with what was originally just going to be a short comment. ;-) – Paul R Nov 21 '18 at 13:07
  • Usually you use `vptest` or `vtestpd` with one operand being a constant mask, not two variables. e.g. to check for any element being negative (having its sign bit set). Related: [Can PTEST be used to test if two registers are both zero or some other condition?](https://stackoverflow.com/a/43712244). – Peter Cordes Nov 21 '18 at 16:53

1 Answer


The most common use case is testing the result of a compare operation, where each result element from the comparison is either all 1 bits or all 0 bits (so testing the sign bit is sufficient) - this enables you to implement predicates such as "all equal" or "any greater than" etc, using the _mm*_testz_p* intrinsic.
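
As a rough illustration of such predicates, here is a minimal sketch. The function names `all_equal` and `any_greater` and the `AVX_FN` macro are made up for this example, and the `target("avx")` attribute assumes GCC or Clang (with other compilers you would enable AVX through the build flags instead). Each function works on four doubles loaded from memory.

```c
#include <immintrin.h>

/* Assumption: GCC/Clang. Lets these functions use AVX without -mavx. */
#define AVX_FN __attribute__((target("avx")))

/* Returns 1 if every lane of x equals the matching lane of y, else 0. */
AVX_FN int all_equal(const double *x, const double *y)
{
    __m256d a = _mm256_loadu_pd(x);
    __m256d b = _mm256_loadu_pd(y);
    /* Lanes that differ (or involve a NaN) become all-ones, others all-zeros. */
    __m256d ne = _mm256_cmp_pd(a, b, _CMP_NEQ_UQ);
    /* testz yields 1 when no lane of (ne AND ne) has its sign bit set. */
    return _mm256_testz_pd(ne, ne);
}

/* Returns 1 if at least one lane of x is greater than the matching lane of y. */
AVX_FN int any_greater(const double *x, const double *y)
{
    __m256d a = _mm256_loadu_pd(x);
    __m256d b = _mm256_loadu_pd(y);
    /* Lanes where a > b become all-ones, others all-zeros. */
    __m256d gt = _mm256_cmp_pd(a, b, _CMP_GT_OQ);
    /* "Any" is the negation of "no lane has its sign bit set". */
    return !_mm256_testz_pd(gt, gt);
}
```

Passing the compare result as both operands is just one way to do it; a constant all-ones mask as the second operand would give the same answer here.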

I've never found a use for any of the other variants.
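
For reference, the three variants differ only in which flag they hand back. The scalar model below is a sketch of my reading of the documented `VTESTPD` behaviour, not code from any library; the `model_*` names and the `test_flags` struct are hypothetical. Only the sign bit of each double takes part, so a lane holding -0.0 counts as "set" and a lane holding +0.0 does not.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the two flags VTESTPD produces. */
typedef struct { int zf; int cf; } test_flags;

/* Extract the sign bit (bit 63) of a double without aliasing problems. */
static int sign_bit(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return (int)(bits >> 63);
}

/* Scalar model of the flag computation for four packed doubles. */
test_flags model_vtestpd(const double a[4], const double b[4])
{
    test_flags f = {1, 1};
    for (int i = 0; i < 4; ++i) {
        int sa = sign_bit(a[i]);
        int sb = sign_bit(b[i]);
        if (sa && sb)  f.zf = 0;  /* sign bit set in (a AND b)     */
        if (!sa && sb) f.cf = 0;  /* sign bit set in (NOT a AND b) */
    }
    return f;
}

/* testz returns ZF, testc returns CF, testnzc returns 1 only if both are 0. */
int model_testz(const double a[4], const double b[4])
{
    return model_vtestpd(a, b).zf;
}

int model_testc(const double a[4], const double b[4])
{
    return model_vtestpd(a, b).cf;
}

int model_testnzc(const double a[4], const double b[4])
{
    test_flags f = model_vtestpd(a, b);
    return !f.zf && !f.cf;
}
```

Read this way, `testz` asks "do a and b share no set sign bits?", `testc` asks "is every sign bit set in b also set in a?", and `testnzc` is true only when both answers are no.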

Paul R