9

I'm trying to optimize some bit packing and unpacking routines. In order to do the packing I need to calculate the number of bits needed to store integer values. Here is the current code.

if (n == -1) return 32;
if (n == 0) return 1;
int r = 0;
while (n)
{
    ++r;
    n >>= 1;
}
return r;
hippietrail
Matt Wamboldt
  • I hope you're not trying to compress data and prefixing each value with the count of the number of bits in the value. – Skizz Apr 27 '10 at 13:01
  • When you write "32", I presume you mean sizeof( int ) * CHAR_BIT. Do you want your code to work where sizeof( int ) == 8? – William Pursell Apr 27 '10 at 13:13
  • If you're using >>, you really want n to be unsigned. – William Pursell Apr 27 '10 at 13:13
  • @Skizz: bitpacking is a very fast and effective technique... I don't know what exactly in his question makes you think he may be prefixing every single value with the bit count. – SyntaxT3rr0r Apr 27 '10 at 13:20
  • @WizardOfOdds: Well, I can't think of any reason why you'd want to calculate the number of bits required to hold one integer. And even if there was a good reason to do this, you'd still need to record somewhere the number of bits used (how else could you decode it?). Perhaps it's to reduce bandwidth over a slow link and the bit count is an upper limit for a block of data. But then there are far better ways to do this. I don't know, just trying to guess what the bigger picture is because there's a chance the OP is barking up the wrong tree. – Skizz Apr 27 '10 at 13:52
  • @Skizz: This is used as part of a header on a table. We use this with the min and max value to determine the number of bits for packing, and it is calculated using the min and max saved in the header for unpacking. – Matt Wamboldt Apr 27 '10 at 14:03
  • @Matt: So you're storing a sequence of values as: `(value - min) & (number_of_bits (max-min))` and storing min and bits per value in a header? So you're trying to compress an array of integers? There are many algorithms out there that will do a better job of compressing than this. – Skizz Apr 27 '10 at 14:11
  • @Skizz: This is part of a large database system. It stores more than just integers, it also stores floats and strings. Multiple tables, and it has to squeeze into as little memory as is feasible. I'm trying to optimize the system, and replacing it isn't doable. – Matt Wamboldt Apr 27 '10 at 16:15
  • @Matt: It doesn't matter if they're ints, floats, strings or whatever, they're just bytes. Create a structure containing all the data you want to compress (a row of a table in the database say), then use something like zlib (http://www.zlib.net/) to compress an instance of the structure: `deflate (&instance, sizeof instance)` where instance is the structure containing all the data. You'll get far better compression using something like zlib, and it's been tested a lot. – Skizz Apr 27 '10 at 22:12
  • @Skizz: This isn't about compression, it's about calculating the number of bits to store a number. What if I want to generate code for bit packed structs, or as has been mentioned, calculate the log 2 of a number? – Matt Wamboldt Apr 28 '10 at 12:27
  • @Matt: Yes, the question is about counting bits, which is why these are comments rather than answers. Sometimes one gets obsessive over implementation details when it's the implementation that's at fault. Here, I wondered if the goal was to do compression, in which case counting bits is very inefficient (poor compression ratio). It is unusual for algorithms to need to know bit counts. C++ does not allow dynamic bit field declarations; bit packed structures would need additional metadata to describe the layout, adding complexity. Counting bits would only give you an approximation of the log2. – Skizz Apr 28 '10 at 13:00
  • @Skizz: This is a large system. We're a few months away from shipping. Adding something like zlib would not only conflict with the way the database system stores and retrieves data, it would also create the need for massive retesting. I still think this is a viable way to store data, as all we need is a few bit shifts to get the data, and we can keep the whole thing in memory. In short, I ain't changing what ain't broke. Going back and forth on this is causing no small amount of stress over something that shouldn't. – Matt Wamboldt Apr 28 '10 at 15:56
  • @Matt: If your implementation meets your requirements then you have nothing to worry about. I didn't mean to cause stress! – Skizz Apr 28 '10 at 19:08
  • See also http://stackoverflow.com/questions/680002/find-out-number-of-bits-needed-to-represent-a-positive-integer-in-binary – hippietrail Apr 17 '11 at 09:54

6 Answers

10

Non-portably, use the bit-scan-reverse opcode available on most modern architectures. It's exposed as an intrinsic in Visual C++.
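
For illustration, a minimal sketch of the intrinsic route (bits_needed is an illustrative name; _BitScanReverse is the Visual C++ intrinsic, __builtin_clz the GCC/Clang equivalent, and both assume n is non-zero):

#if defined(_MSC_VER)
#include <intrin.h>

int bits_needed(unsigned long n)          // requires n != 0
{
    unsigned long index;
    _BitScanReverse(&index, n);           // index = position of the highest set bit
    return (int)index + 1;
}
#elif defined(__GNUC__)
int bits_needed(unsigned int n)           // requires n != 0
{
    return 32 - __builtin_clz(n);         // clz counts leading zero bits
}
#endif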

Portably, the code in the question doesn't need the edge-case handling. Why do you require one bit for storing 0? In any case, I'll ignore the edges of the problem. The guts can be done efficiently thus:

if (n >> 16) { r += 16; n >>= 16; }
if (n >>  8) { r +=  8; n >>=  8; }
if (n >>  4) { r +=  4; n >>=  4; }
if (n >>  2) { r +=  2; n >>=  2; }
if (n - 1) ++r;  // n is now 1, 2 or 3; 2 and 3 need one extra bit
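
Put together as a full routine (a sketch; bits_needed is an illustrative name, and note that r must start at 1 and n must be non-zero for the fragment above to match the loop in the question):

int bits_needed(unsigned int n)
{
    if (n == 0) return 1;                 // keep the question's convention for 0
    int r = 1;                            // n != 0, so at least one bit is needed
    if (n >> 16) { r += 16; n >>= 16; }
    if (n >>  8) { r +=  8; n >>=  8; }
    if (n >>  4) { r +=  4; n >>=  4; }
    if (n >>  2) { r +=  2; n >>=  2; }
    if (n - 1) ++r;
    return r;
}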
Marcelo Cantos
6

You're looking to determine the integer log base 2 of a number (the position of the highest set bit). Sean Anderson's "Bit Twiddling Hacks" page (https://graphics.stanford.edu/~seander/bithacks.html) has several methods, ranging from the obvious counting-bits-in-a-loop approach to versions that use table lookup. Note that most of the methods demonstrated will need to be modified a bit to work with 64-bit ints, if that kind of portability is important to you.

Just make sure that any shifting you use to work out the highest set bit is done on an unsigned version of the number, since a compiler implementation might or might not sign-extend the >> operation on a signed value.
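
For example, a minimal sketch of the loop over an unsigned copy (bits_needed is an illustrative name; the n == 0 edge case is left to taste):

int bits_needed(int n)
{
    unsigned int u = (unsigned int)n;     // shift an unsigned copy so >> cannot sign-extend
    int r = 0;
    while (u)
    {
        ++r;
        u >>= 1;
    }
    return r;
}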

Michael Burr
  • Didn't even realize what I was trying to do was calculate the log base 2. Kinda skipped logarithms in Math class. That site had some helpful stuff. Thanks – Matt Wamboldt Apr 27 '10 at 17:31
5

What you are trying to do is find the most significant bit. Some architectures have a special instruction just for this purpose. For those that don't, use a table lookup method.

Create a table of 256 entries, wherein each element identifies the uppermost set bit of its index.

Either loop through each byte in the number, or use a few if-statements to find the highest-order non-zero byte.

I'll let you take the rest from here.
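
For concreteness, a minimal sketch of the idea (all names are illustrative; the table is filled once before use):

static unsigned char high_bit[256];       // high_bit[i] = bits needed to store i

static void init_high_bit(void)
{
    high_bit[0] = 0;
    for (int i = 1; i < 256; ++i)
        high_bit[i] = high_bit[i / 2] + 1;
}

int bits_needed(unsigned int n)           // call init_high_bit() once first
{
    if (n >> 24) return 24 + high_bit[n >> 24];
    if (n >> 16) return 16 + high_bit[n >> 16];
    if (n >>  8) return  8 + high_bit[n >>  8];
    return high_bit[n];
}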

Sparky
  • On a fast CPU with slow RAM (a PC!) this could be slower than just doing a for loop. The for loop will be consistently in the region of a hundred or so clock cycles, whereas a memory lookup could take tens of milliseconds if the data needs to be paged from disk (and then you're upsetting the cache, so there's a knock-on effect too). – Skizz Apr 27 '10 at 13:56
3

You would have to check the execution time to figure out the right granularity, but my guess is that doing 4 bits at a time, then reverting to one bit at a time, would make it faster. Log operations would probably be slower than logical/bit operations.

if (n < 0) return 32;
int r = 0;
while (n & 0x7FFFFFF0)  // bitwise &, not &&: loop while any bit above the low 4 is set
{
    r += 4;
    n >>= 4;
}
while (n)
{
    r++;
    n >>= 1;
}
return r;
Ron
3

Do a binary search instead of a linear search.

int r = 0;

if ((n >> 16) != 0)
{
    r += 16;
    n >>= 16;
}

if ((n >> 8) != 0)
{
    r += 8;
    n >>= 8;        
}

if ((n >> 4) != 0)
{
    r += 4;
    n >>= 4;        
}

if ((n >> 2) != 0)
{
    r += 2;
    n >>= 2;
}

if ((n >> 1) != 0)
{
    r += 1;
    n >>= 1;
}

r += n;  // n is now 0 or 1; r holds the bit count

If your hardware has bit-scan-reverse, an even faster approach would be to write your routine in assembly language. To keep your code portable, you could do

#ifdef ARCHITECTURE_WITH_BSR
   asm // ...
#else
   // Use the approach shown above
#endif
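
For instance, a sketch of that pattern using GCC-style inline assembly for the x86 bsr instruction (the macro test and bits_needed are illustrative; bsr leaves its destination undefined for an input of 0, so n must be non-zero):

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
int bits_needed(unsigned int n)           // requires n != 0
{
    unsigned int index;
    __asm__ ("bsr %1, %0" : "=r" (index) : "r" (n));
    return (int)index + 1;                // bsr yields the index of the highest set bit
}
#else
// Use the portable binary search shown above
#endif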
dan04
2
number_of_bits = floor(log2(integer_number)) + 1

i.e. the log base 2 rounded down, plus one. (Rounding up instead fails for exact powers of two: log2(8) = 3, but 8 needs 4 bits.)
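
A minimal sketch of this approach (bits_needed is an illustrative name; log2 is C99's <math.h> function, and floating-point rounding deserves a careful look near large powers of two):

#include <math.h>

int bits_needed(unsigned int n)           // requires n > 0
{
    return (int)floor(log2((double)n)) + 1;
}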

Kjuly
Yevhen
  • @Matteo, of course he wants to optimize the code, but these nit-picky optimizations will lead to messed-up code. Taking log2 is an elegant, clean, readable and maintainable approach. – Mahes May 18 '11 at 18:55
  • @Mahes: the question states specifically "what is the fastest way" for bit packing/unpacking routines, that probably are called in inner loops, where performance *is* critical. Instead of a handful of assembly shift/bitwise instructions, this solution involves converting an integer to floating point (which is slow), some kind of table lookup (which may trigger a cache miss)/truncated series calculation (for the log) and a conversion back to int (again, quite slow). Malus points if it's an embedded platform without hardware FP, so every FP operation has to be emulated in software. – Matteo Italia May 18 '11 at 19:12