11

This question is not a duplicate of Count the number of set bits in a 32-bit integer. See comment by Daniel S. below.

--

Let's say there is a variable int x;. Its size is 4 bytes, i.e. 32 bits.

Then I assign a value to this variable, x = 4567 (in binary 10001 11010111), so in memory it looks like this:

00000000 00000000 00010001 11010111

Is there a way to get the length of the bits which matter. In my example, the length of bits is 13 (I marked them with bold). If I use sizeof(x) it returns 4, i.e. 4 bytes, which is the size of the whole int. How do I get the minimum number of bits required to represent the integer without the leading 0s?

Daniel S.
  • 6,458
  • 4
  • 35
  • 78
qazerty23
  • 411
  • 2
  • 6
  • 16
  • 1
    That's a popular question. Have you ever tried google search? – Maxim Egorushkin Apr 01 '15 at 10:44
  • 1
    relevant to http://stackoverflow.com/questions/109023/how-to-count-the-number-of-set-bits-in-a-32-bit-integer – ha9u63a7 Apr 01 '15 at 10:45
  • 4
    The logarithm (base 2)? – CompuChip Apr 01 '15 at 10:46
  • 1
    Are you trying to [count leading zeroes](http://stackoverflow.com/questions/2589096/find-most-significant-bit-left-most-that-is-set-in-a-bit-array)? – Dmitry Grigoryev Apr 01 '15 at 10:47
  • Sorry, tried to google it, but couldn't find the answer. Thank you for the link ha9u63ar, I will read there! – qazerty23 Apr 01 '15 at 10:50
  • Note that counting the number of set bits and the highest set bit are two different things. The latter is much easier. – CompuChip Apr 01 '15 at 10:51
  • Whats the length of `0` ? It's always important to consider the _edge cases_ of your problems.. – MSalters Apr 01 '15 at 14:02
  • 1
    @LightnessRacesInOrbit - This may have been a premature close. This question is C++, and it could include template meta-programming techniques. The cited dup is also a slightly different question. – jww Sep 24 '17 at 01:51
  • @jww: There are C++ answers, including those that provide template meta-programming techniques. No reason to ask again. – Lightness Races in Orbit Sep 25 '17 at 21:55
  • 1
    This question is not a duplicate of the question "How to count the number of set bits in a 32-bit integer?". User jww has mentioned this before. See his comment. This question here asks for the bitlength, i.e. the index of the most significant set bit + 1, while the linked question asks for the popcount, i.e. the Hamming Weight. These are different things. The link is wrong and confusing. – Daniel S. Aug 17 '22 at 12:35
  • If this question really is a duplicate of some other question, then it should remain closed, but get linked to the correct predating duplicate or better quality duplicate. If there is no duplicate or no higher quality duplicate, then this question should be reopened and remain open. – Daniel S. Aug 17 '22 at 12:36
  • This question is related to counting the number of leading zero bits and in turn to the often available intrinsic `clz` as in `uint32_t count_of_leading_0_bits(const uint32_t &x) { return (x == 0) ? 32 : __builtin_clz(x); }` and then `uint32_t bitlen(const uint32_t &x) { return 32 - count_of_leading_0_bits(x); }` – Daniel S. Aug 17 '22 at 12:36
  • Found at least one [candidate](https://stackoverflow.com/q/47725337/212858) but it's newer, and I have no strong feeling about which should be closed as a dupe of the other. – Useless Nov 09 '22 at 12:34

5 Answers5

9
unsigned bits, var = (x < 0) ? -x : x;
for(bits = 0; var != 0; ++bits) var >>= 1;

This should do it for you.

CompuChip
  • 9,143
  • 4
  • 24
  • 48
grimble
  • 91
  • 3
  • Excellent ! Short, simple, and fully portable ! Just a small issue in case of a negative number because of the sign bit – Christophe Apr 01 '15 at 11:50
  • ... which can be fixed by writing `var = (x < 0) ? -x : x`. – CompuChip Apr 01 '15 at 13:15
  • 1
    I'd expect `-1` to have length 32 (or `sizeof(int)*CHAR_BIT)`), not 1. – MSalters Apr 01 '15 at 14:01
  • The result will be 32, if `-1` is converted to `unsigned`. (Assuming two's complement representation of signed numbers) – grimble Apr 01 '15 at 14:16
  • Explain? (the original solution was without the ternary operator, if that matters) – grimble Apr 01 '15 at 14:23
  • The interesting corner case is `var = 0`, in which case `bits = 0`. That is technically correct, though few humans would write zero as `''` instead of `'0'`. – knia Dec 14 '22 at 19:09
7

Warning: math ahead. If you are squeamish, skip ahead to the TL;DR.

What you are really looking for is the highest bit that is set. Let's write out what the binary number 10001 11010111 actually means:

x = 1 * 2^(12) + 0 * 2^(11) + 0 * 2^(10) + ... + 1 * 2^1 + 1 * 2^0

where * denotes multiplication and ^ is exponentiation.

You can write this as

2^12 * (1 + a)

where 0 < a < 1 (to be precise, a = 0/2 + 0/2^2 + ... + 1/2^11 + 1/2^12).

If you take the logarithm (base 2), let's denote it by log2, of this number you get

log2(2^12 * (1 + a)) = log2(2^12) + log2(1 + a) = 12 + b.

Since a < 1 we can conclude that 1 + a < 2 and therefore b < 1.

In other words, if you take the log2(x) and round it down you will get the most significant power of 2 (in this case, 12). Since the powers start counting at 0, the number of bits is one more than this power, namely 13. So:

TL;DR:

The minimum number of bits needed to represent the number x is given by

numberOfBits = floor(log2(x)) + 1
CompuChip
  • 9,143
  • 4
  • 24
  • 48
2

The portable modern way since C++20 should probably use std::countl_zero, like

#include <bit>

int bit_length(unsigned x)
{
    return (8*sizeof x) - std::countl_zero(x);
}

Both gcc and clang emit a single bsr instruction on x86 for this code (with a branch on zero), so it should be pretty much optimal.

Note that std::countl_zero only accepts unsigned arguments though, so deciding how to handle your original int parameter is left as an exercise for the reader.

Useless
  • 64,155
  • 6
  • 88
  • 132
1

You're looking for the most significant bit that's set in the number. Let's ignore negative numbers for a second. How can we find it? Well, let's see how many bits we need to set to zero before the whole number is zero.

00000000 00000000 00010001 11010111
00000000 00000000 00010001 11010110
                                  ^
00000000 00000000 00010001 11010100
                                 ^
00000000 00000000 00010001 11010000
                                ^
00000000 00000000 00010001 11010000
                               ^
00000000 00000000 00010001 11000000
                              ^
00000000 00000000 00010001 11000000
                             ^
00000000 00000000 00010001 10000000
                            ^
...
                       ^
00000000 00000000 00010000 00000000
                      ^
00000000 00000000 00000000 00000000
                     ^

Done! After 13 bits, we've cleared them all. Now how do we do this? Well, the expression 1<< pos is the 1 bit shifted over pos positions. So we can check if (x & (1<<pos)) and if true, remove it: x -= (1<<pos). We can also do this in one operation: x &= ~(1<<pos). ~ gets us the complement: all ones with the pos bit set to zero instead of the other way around. x &= y copies the zero bits of y into x.

Now how do we deal with signed numbers? The easiest is to just ignore it: unsigned xu = x;

MSalters
  • 173,980
  • 10
  • 155
  • 350
1

Many processors provide an instruction for calculating the number of leading zero bits directly (e.g. x86 has lzcnt / bsr and ARM has clz). Usually C++ compilers provide an intrinsic for accessing one of these instructions. The number of leading zeros can then be used to calculate the bit length.

In GCC, the intrinsic is called __builtin_clz. It counts the number of leading zeros for a 32 bit integer.
However, there is one caveat about __builtin_clz. When the input is 0, then the result is undefined. Therefor we need to take care of this special case. This is done in the following function with (x == 0) ? 32 : ..., which gives the result 32 when x is 0:

uint32_t count_of_leading_0_bits(const uint32_t &x) {
    return (x == 0) ? 32 : __builtin_clz(x);
}

The bit length can then be calculated from the number of leading zeros:

uint32_t bitlen(const uint32_t &x) {
    return 32 - count_of_leading_0_bits(x);
}

Note that other C++ compilers have different intrinsics for counting the number of leading zero bits, but you can find them quickly with a search on the internet. Here is How to use MSVC intrinsics to get the equivalent of this GCC code? for an equivalent with MSVC.

Daniel S.
  • 6,458
  • 4
  • 35
  • 78
  • cf. https://stackoverflow.com/questions/25683690/confusion-about-bsr-and-lzcnt -- Maybe the solution with `x==0` to take care of the special case can be replaced with something faster. Results from compiler explorer would also be nice. Feel free to add to my answer. – Daniel S. Nov 09 '22 at 09:16
  • 1
    As of C++20 there's a `countl_zero` function that translates directly to `bsr` (plus handling zero), so no need for GCC builtins. – Useless Nov 09 '22 at 12:26