I'm looking at some assembly code and I'm seeing tzcntl. A search for that instruction redirects to lzcnt. Are these the same instructions? Is it possible to use lzcnt with gcc?

I've seen this example: Intrinsic __lzcnt64 returns different values with different compile options

However, I'm unsure whether I need to use __lzcnt64 or whether there is also a 32-bit version.

So in summary:

  1. What's the difference between tzcntl and lzcnt, if any?
  2. How do I properly use lzcnt with gcc (code, includes, and compiler flags)?
  3. Can I select a 32-bit or 64-bit version?
Jimbo

1 Answer

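tzcnt counts trailing zeros, while lzcnt counts leading zeros, so they are different instructions. The trailing l in tzcntl is the AT&T-syntax operand-size suffix and marks a 32-bit operand.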

The x86 compiler built-ins provide access to lzcnt instructions for various register widths:

unsigned short __builtin_ia32_lzcnt_u16(unsigned short);
unsigned int __builtin_ia32_lzcnt_u32(unsigned int);
unsigned long long __builtin_ia32_lzcnt_u64 (unsigned long long);

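These are only available with -mlzcnt, and they give wrong results on a CPU without LZCNT support. lzcnt is encoded as rep bsr, and such a CPU ignores the prefix and executes plain bsr, which returns the index of the highest set bit instead of the number of leading zeros.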

Alternatively, you can use the generic bit-counting built-ins, which do not depend on a particular instruction being available. From the GCC documentation:

Built-in Function: int __builtin_clzll (unsigned long long)

Similar to __builtin_clz, except the argument type is unsigned long long.
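
As a minimal sketch of the generic route, the wrappers below use `__builtin_clz`, `__builtin_clzll`, and `__builtin_ctz`. The wrapper names (`clz_u32`, `clz_u64`, `ctz_u32`) are made up for this illustration. The generic built-ins have undefined results for an argument of 0, so the sketch handles 0 explicitly and returns the operand width, which is what the lzcnt and tzcnt instructions report for 0. It assumes GCC or a compatible compiler such as Clang.

```c
#include <limits.h>   /* CHAR_BIT */

/* Hypothetical helper: leading zeros of an unsigned int.
   __builtin_clz(0) is undefined, so 0 is handled separately and
   yields the full bit width (32 where unsigned int is 32 bits). */
static inline int clz_u32(unsigned int x)
{
    return x ? __builtin_clz(x) : (int)(sizeof x * CHAR_BIT);
}

/* Hypothetical helper: same idea for unsigned long long (usually 64 bits). */
static inline int clz_u64(unsigned long long x)
{
    return x ? __builtin_clzll(x) : (int)(sizeof x * CHAR_BIT);
}

/* Hypothetical helper: trailing zeros, the operation tzcnt performs. */
static inline int ctz_u32(unsigned int x)
{
    return x ? __builtin_ctz(x) : (int)(sizeof x * CHAR_BIT);
}
```

On a platform with a 32-bit `unsigned int`, `clz_u32(1)` is 31 and `ctz_u32(8)` is 3. No special flags are needed to compile this. Whether the compiler emits lzcnt, bsr, or something else depends on the target options in effect.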

Florian Weimer
  • I don't think we should advertise `__builtin_ia32_*` too much, they are implementation details for intrinsics like `__lzcnt32` (include `x86intrin.h`). Indeed the generic builtins are better, unless you require a specific behavior when called on `0`. – Marc Glisse Nov 12 '17 at 09:10
  • @MarcGlisse As linked in the answer I was able to find something like __builtin_ia32_lzcnt_u32. From the other SO question linked in mine above, I saw __lzcnt64. I probably should have tried __lzcnt32 to see if that worked. How would I know about __lzcnt32 (and __lzcnt64) in general? – Jimbo Nov 12 '17 at 12:55
  • The `__lzcnt32` intrinsics are documented in the Intel manuals, but the Intel intrinsics are not really useful with GCC because of the compiler flags required for them. – Florian Weimer Nov 12 '17 at 13:03
  • What do you mean about the required compiler flags? You can put `__attribute__((target("lzcnt")))` on the function directly using the intrinsic if you don't want to pass it on the command line. If you want `__builtin_clz` to expand to this instruction, you'll need something similar. – Marc Glisse Nov 12 '17 at 13:24
  • On older compiler versions, the Intel intrinsic headers [do not provide definitions if the `-m` flags are missing](https://gcc.gnu.org/ml/gcc-patches/2013-04/msg00740.html). – Florian Weimer Nov 12 '17 at 13:52
  • Yeah, but that's been fixed a while ago, I don't think that's a reason to discourage use in new code. – Marc Glisse Nov 12 '17 at 21:08
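
The approach discussed in the comments, the `__lzcnt32`/`__lzcnt64` intrinsics from `x86intrin.h` combined with `__attribute__((target("lzcnt")))`, might look roughly like the sketch below. The function names `lzcnt32_hw` and `lzcnt64_hw` are invented for this example. It assumes a reasonably recent GCC or Clang on x86-64 and a CPU that actually supports LZCNT. On a CPU without it the instruction runs as bsr and the results are wrong, as the answer explains. On other targets the sketch falls back to the generic built-ins.

```c
#include <stdint.h>

#if defined(__x86_64__)
#include <x86intrin.h>   /* declares __lzcnt32 and __lzcnt64 */

/* The target attribute enables LZCNT for these functions only,
   so -mlzcnt is not needed on the command line. */
static __attribute__((target("lzcnt"))) uint32_t lzcnt32_hw(uint32_t x)
{
    return __lzcnt32(x);            /* 32-bit form of the instruction */
}

static __attribute__((target("lzcnt"))) uint64_t lzcnt64_hw(uint64_t x)
{
    return __lzcnt64(x);            /* 64-bit form of the instruction */
}
#else
/* Fallback for non-x86-64 targets: generic built-ins, with 0 handled
   explicitly because __builtin_clz(0) is undefined. */
static uint32_t lzcnt32_hw(uint32_t x)
{
    return x ? (uint32_t)__builtin_clz(x) : 32u;
}

static uint64_t lzcnt64_hw(uint64_t x)
{
    return x ? (uint64_t)__builtin_clzll(x) : 64u;
}
#endif
```

Here the intrinsic chosen selects the operand width: `__lzcnt32` for 32 bits and `__lzcnt64` for 64 bits. Unlike the generic built-ins, the instruction has a defined result for 0, namely the operand width.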