
It seems there is no intrinsic for bitwise NOT/complement in AVX2. Did I miss it, or are we supposed to do something like `_mm256_xor_si256(a, _mm256_set1_epi64x(-1LL))`? If the latter, is it optimal? Is there no vector NOT instruction in assembly either?

Serge Rogatch
  • What is C++ about this? – Yunnosch Sep 18 '17 at 05:48
  • you mean bitwise not? because logical not is achieved by cmp instructions – phuclv Sep 18 '17 at 05:48
  • @LưuVĩnhPhúc, I've changed the wording to "bitwise", if it's more clear. Though Intel's manual refers to this group of intrinsics as "logical": https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=5212,1794,594,533,533,3146,5223,5218,5213,5226,5224,4642,3170,3170,3170,2670,4708,5068,5661,740,740,5459,5495,568,594,3889,390,3567,3576,3923,5553,2381,4929,201,406,694,4602,5003,5098,555,5089,4648,4642,5105,4893,555,5159,5340,5340,5340,307,1481,3625,415,1481,5720&techs=AVX,AVX2,FMA,Other&cats=Logical – Serge Rogatch Sep 18 '17 at 05:52
  • @Yunnosch, the question is about C/C++ intrinsics. – Serge Rogatch Sep 18 '17 at 05:52

1 Answer


Yes, the only SIMD bitwise NOT is `PXOR`/`XORPS` with all-ones, in MMX, SSE*, and AVX1/2.

AVX512F can avoid the need for a separate vector constant using `vpternlogd same,same,same` with the immediate `0x55`. (See my answer on the duplicate, "Is NOT missing from SSE, AVX?", for more details about it vs. `vpxord`.)
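
As a rough sketch of the `vpternlogd` approach with intrinsics, assuming GCC or Clang on x86-64: the helper names `not512` and `not512_demo` are hypothetical, and the `target` attribute and `__builtin_cpu_supports` are compiler extensions used here only so the snippet builds without `-mavx512f` and skips itself on CPUs that lack it.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: bitwise NOT of eight 64-bit lanes with AVX512F.
   With all three inputs equal, only truth-table entries 0 and 7 are used;
   0x55 has bit 0 set and bit 7 clear, so 0 -> 1 and 1 -> 0. */
__attribute__((target("avx512f")))
static void not512(const uint64_t in[8], uint64_t out[8])
{
    __m512i v = _mm512_loadu_si512(in);
    _mm512_storeu_si512(out, _mm512_ternarylogic_epi32(v, v, v, 0x55));
}

/* Compares against scalar ~x; returns 1 without checking when the CPU
   has no AVX512F (assumption: skipping is acceptable for a demo). */
static int not512_demo(void)
{
    if (!__builtin_cpu_supports("avx512f"))
        return 1;
    uint64_t in[8] = {0, ~0ULL, 1, 2, 0x0123456789ABCDEFULL, 0xFF, 0xF0F0, 7};
    uint64_t out[8];
    not512(in, out);
    for (int i = 0; i < 8; i++)
        assert(out[i] == ~in[i]);
    return 1;
}
```

No all-ones constant appears in the source here; whether that is a win over `vpxord` with a constant depends on the surrounding code.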


Ideally you can arrange your algorithm to avoid actually needing to NOT something. For example, using `PANDN` instead of `PAND`. Or invert later as part of something else. But if you do end up needing to invert, that's how.
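
To illustrate folding the inversion into the AND, here is a minimal sketch assuming GCC or Clang on x86-64. The helper names `clear_masked_bits` and `clear_masked_bits_demo` are hypothetical, and the `target` attribute and `__builtin_cpu_supports` are compiler extensions used only so the snippet builds without `-mavx2` and skips itself on CPUs without AVX2.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: out = x & ~mask for four 64-bit lanes,
   done with a single VPANDN instead of a NOT followed by an AND. */
__attribute__((target("avx2")))
static void clear_masked_bits(const uint64_t x[4], const uint64_t mask[4],
                              uint64_t out[4])
{
    __m256i vx = _mm256_loadu_si256((const __m256i *)x);
    __m256i vm = _mm256_loadu_si256((const __m256i *)mask);
    /* _mm256_andnot_si256(a, b) computes (~a) & b: the FIRST operand is inverted. */
    _mm256_storeu_si256((__m256i *)out, _mm256_andnot_si256(vm, vx));
}

/* Compares against the scalar expression; returns 1 without checking
   when the CPU has no AVX2 (assumption: skipping is acceptable for a demo). */
static int clear_masked_bits_demo(void)
{
    if (!__builtin_cpu_supports("avx2"))
        return 1;
    uint64_t x[4]    = {~0ULL, 0x00FF00FF00FF00FFULL, 0x1234, 0};
    uint64_t mask[4] = {0xFFULL, ~0ULL, 0x0030, 0x1};
    uint64_t out[4];
    clear_masked_bits(x, mask, out);
    for (int i = 0; i < 4; i++)
        assert(out[i] == (x[i] & ~mask[i]));
    return 1;
}
```

The operand order is the usual pitfall: the value to be inverted goes first.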

The all-ones constant can be generated with `vpcmpeqd same,same,same`. With intrinsics, let the compiler do this for you by writing `_mm256_set1_epi32(-1)`. (Element size is obviously irrelevant for `set1(-1)`, use whatever makes semantic sense for your algorithm.)
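
Put together, the AVX2 NOT might look like the sketch below, assuming GCC or Clang on x86-64. The helper names `not256` and `not256_demo` are hypothetical, and the `target` attribute and `__builtin_cpu_supports` are compiler extensions used only so the snippet builds without `-mavx2` and skips itself on CPUs without AVX2.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: bitwise NOT of four 64-bit lanes via XOR with all-ones. */
__attribute__((target("avx2")))
static void not256(const uint64_t in[4], uint64_t out[4])
{
    __m256i v    = _mm256_loadu_si256((const __m256i *)in);
    /* All-ones constant; how it is materialized is left to the compiler. */
    __m256i ones = _mm256_set1_epi32(-1);
    _mm256_storeu_si256((__m256i *)out, _mm256_xor_si256(v, ones));
}

/* Compares against scalar ~x; returns 1 without checking when the CPU
   has no AVX2 (assumption: skipping is acceptable for a demo). */
static int not256_demo(void)
{
    if (!__builtin_cpu_supports("avx2"))
        return 1;
    uint64_t in[4] = {0, ~0ULL, 0x0123456789ABCDEFULL, 1};
    uint64_t out[4];
    not256(in, out);
    for (int i = 0; i < 4; i++)
        assert(out[i] == ~in[i]);
    return 1;
}
```

In a loop, a compiler would normally be expected to hoist the all-ones constant out, so the per-iteration cost is a single XOR.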

BeeOnRope
Peter Cordes
  • Yes, I end up with a need for `NOT` because there is no symmetric intrinsic for `_mm256_cmpgt_epi64()`, and then I need to `OR` the result, so no way for `AND_NOT` – Serge Rogatch Sep 18 '17 at 06:03
  • @SergeRogatch: Sometimes you can rearrange things to negate an input before a compare, but yeah you can't just reverse the operands if the `==` case is special. – Peter Cordes Sep 18 '17 at 06:31
  • Posted more about AVX512F `vpternlogd` on the duplicate: https://stackoverflow.com/questions/42613821/is-not-missing-from-sse-avx/46273536#46273536. I'd delete this but I can't because it's the accepted answer. – Peter Cordes Sep 18 '17 at 08:07