3

I'm writing an x86 assembly program and I want to check a register (it is not 0), to see if more than one bit is on.

Is there a simple way of doing it or should I just loop and shift until I see a 2nd set bit or get to the end?


I don't need the total number of set bits, that would be

Is there something faster than doing one of those and checking for popcnt(x) > 1?

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
Quenyrous
  • 31
  • 3
  • Related question: https://stackoverflow.com/questions/27050583/hamming-weight-number-of-1-in-a-number-mixing-c-with-assembly – Michael Jun 16 '22 at 14:35
  • 1
    Check out `popcnt`. – Seva Alekseyev Jun 16 '22 at 14:38
  • 2
    All register values with just a single bit set are powers of two. IIRC you can check `(reg - 1) & reg == 0` to find out whether the value is a power of two. – ecm Jun 16 '22 at 15:14
  • 1
    `(reg - 1) & reg` is available as `blsr` on modern CPUs (check the zero flag afterwards) – harold Jun 16 '22 at 16:34

1 Answers1

5

As almost commented by ecm, one can check if a value is a power of two by checking

a) value != 0 &&
b) ((value - 1) & value) == 0

However, if one needs to check if a value has more than 1 bit, it is enough to test that

c) ((value - 1) & value) != 0

lea  ebx, [eax - 1]
test eax, ebx ;  // Z flag is cleared on EAX having 2 or more bits

That is: if a value is non-zero after some bit (that was originally set) has been cleared, it must have had more than 1 bit set.

BMI1 extension has a single instruction: clear least significant bit blsr, which also sets the ZF accordingly.

Sep Roland
  • 33,889
  • 7
  • 43
  • 76
Aki Suihkonen
  • 19,144
  • 1
  • 36
  • 57
  • 1
    AMD CPUs (up to Zen3 at least) run `blsr` as 2 uops (https://uops.info/), so if you're going to branch on it, `lea`/`test` is actually *more* efficient there. The `test` can fuse with a `jnz`, so it's 2 uops total. (Same thing on Intel: `blsr` / `jnz` would be 2 uops total, same as `lea` / `test+jnz`.) It does save code-size, and if you're going to `setnz` or `cmovnz` then it's break-even on AMD, and a win for `blsr` on Intel. – Peter Cordes Jun 16 '22 at 23:36
  • And apparently clang knows that, adding 1 is as good as subtracting 1 in that expression (gcc doesn't seem to know, though). https://godbolt.org/z/aKreaaars – Erik Eidt Jun 18 '22 at 00:41
  • 2
    @ErikEidt: Apparently the formulas didn't have enough parenthesis: the purpose was not to write `(x-1) & (x !=0)`, but `((x-1) & x) != 0`. – Aki Suihkonen Jun 18 '22 at 03:13