-2

So I've written this function to count the number of bits in a long, which for my purposes includes zeros to the right of the MSB and excludes zeros to its left:

int bitCount(unsigned long bits)
{
    int len = 64;
    unsigned long mask = 0x8000000000000000;
    while ((bits & mask) == 0 && len > 0){
        mask >>= 1;
        --len;
    }
    return len;
}

This function works fine for me as far as returning a correct answer, but is there a better (faster or otherwise) way to go about doing this?

Ian R
  • 39
  • 4
  • some architectures have an intrinsic `__popcnt`. – Eugene Sh. Apr 07 '16 at 14:31
  • 3
    A `long` is not guaranteed to have 64 bits! – too honest for this site Apr 07 '16 at 14:31
  • 1
    Did you search? This is no consulting site. – too honest for this site Apr 07 '16 at 14:32
  • Why would you ever count the number of bits in a 64 bit number? The only situation where I ever worried about numbers of bits set was in a _huge_ bit array. And counting the number of bits in a 64 bit number is a very inefficient building block for that. – gnasher729 Apr 07 '16 at 14:33
  • On a normal computer with 8-bit bytes, the number of bits of any primitive integer type is `sizeof(type) * 8`. – Some programmer dude Apr 07 '16 at 14:38
  • As for @Olaf's comment about `long` not always being 64 bits, I suggest you *don't* try running that code on a 32-bit system, *or* in a program built using the Visual C++ compiler (even on a 64-bit system). – Some programmer dude Apr 07 '16 at 14:41
  • Possible duplicate of [Count number of bits in a 64-bit (long, big) integer?](http://stackoverflow.com/questions/2709430/count-number-of-bits-in-a-64-bit-long-big-integer) – phuclv Apr 07 '16 at 14:45
  • 2
    More generally, the number of bits in any type is `sizeof(type)*CHAR_BIT`, where `CHAR_BIT` is defined in `` and is the (implementation-defined) number of bits in a `char`. – Peter Apr 07 '16 at 14:45
  • Your title doesn't match your code... what is it you want to do? It seems you try to find the highest numbered bit being `1` – Support Ukraine Apr 07 '16 at 14:46
  • just some modifications to the algorithm for 32 bit is needed. This was askedd so many times [How to count the number of set bits in a 32-bit integer?](http://stackoverflow.com/q/109023/995714). And what you mean is "counting number of **set** bits", not counting number of bits which is always `sizeof(long)*CHAR_BIT` – phuclv Apr 07 '16 at 14:47
  • That would be any implementation using the Windows-API, not only MSVC (the question is tagged C). And what is the problem using the correct type or at least calculating the width correctly: `sizeof(unsigned long) * CHAR_BIT`. Ignoring padding bits would be an acceptable compromise. – too honest for this site Apr 07 '16 at 14:54
  • 2
    @LưuVĩnhPhúc: To be pedantic: That formula gives the total number of bits reserved for the type, but there might be padding bits (see my comment above, too). – too honest for this site Apr 07 '16 at 14:57

4 Answers4

0

If you want to count the number of bits in a long type, I suggest you use ULONG_MAX from the <limits.h> header file, and use the right shift operator >> to count the number of one-bits. This way you don't have to actually know the number of bits beforehand.

Something like

unsigned long value = ULONG_MAX;
unsigned count = 1;

while (value >>= 1)
    ++count;

This works because the right shift fills up with zeroes.

Some programmer dude
  • 400,186
  • 35
  • 402
  • 621
0

The general answer for the number of bits in any type is CHAR_BIT*sizeof(type). CHAR_BIT, defined in <limits.h> is the (implementation-defined) number of bits in a char. sizeof(type) is specified in a way that yields the number of chars used to represent the type (i.e. sizeof(char) is 1).

Peter
  • 35,646
  • 4
  • 32
  • 74
0

The solutions the other guys proposed are very nice and probably the shortest to write and remain understandable. Another straight forward approach would be something like this

int bitCountLinear(long int n) {
    int len = sizeof(long int)*8;
    for (int i = 0; i < len; ++i) 
        if ((1UL<<i) > (unsigned long int)n) 
            return i;
    return len; 
}

The rest might get a bit extreme but I gave it a try so I'll share it. I suspected that there might be arguably faster methods of doing this. eg Using binary search (even though a length of 64bits is extremely small). So I gave it a quick try for you and for the fun of it.

union long_ing_family {
    unsigned long int uli;
    long int li;
};

int bitCountLogN(long int num) {
    union long_ing_family lif;
    lif.li = num;
    unsigned long int n = lif.uli;
    int res;
    int len = sizeof(long int)*8-1;
    int max = len;
    int min = 0;

    if (n == 0) return 0;
    do {
        res = (min + max) / 2;
        if (n < 1UL<<res)
            max = res - 1;
        else if (n >= (1UL<<(res+1)))
            min = res + 1;
        else
            return res+1;
    } while (min < max);

    return min+1;   // or max+1
}

I then timed both to see if they have any interesting differences...

#include <stdio.h>

#define REPS 10000000

int bitCountLinear(long int n);
int bitCountLogN(long int num);
unsigned long int timestamp_start(void);
unsigned long int timestamp_stop(void);
union long_ing_family;

int main(void) {

    long int n;
    long int begin, end;
    long int begin_Lin, end_Lin;
    long int begin_Log, end_Log;

    begin_Lin = 0;
    end_Lin = 0;
    begin_Log = 0;
    end_Log = 0;

    for (int i = 0; i < REPS; ++i) {
        begin_Lin += timestamp_start();
        bitCountLinear(i);
        end_Lin += timestamp_stop();
    }
    printf("Linear: %lu\n", (end_Lin-begin_Lin)/REPS);

    for (int i = 0; i < REPS; ++i) {
        begin_Log += timestamp_start();
        bitCountLogN(i);
        end_Log += timestamp_stop();
    }
    printf("Log(n): %lu\n", (end_Log-begin_Log)/REPS);

}

unsigned long int timestamp_start(void) {
    unsigned int cycles_low;
    unsigned int cycles_high;
    asm volatile ("CPUID\n\t"
        "RDTSCP\n\t"
        "mov %%edx, %0\n\t"
        "mov %%eax, %1\n\t": "=r" (cycles_high), "=r" (cycles_low)::"%rax", "%rbx", "%rcx", "%rdx");
    return ((unsigned long int)cycles_high << 32) | cycles_low;
}

unsigned long int timestamp_stop(void) {
    unsigned int cycles_low;
    unsigned int cycles_high;
    asm volatile ("RDTSCP\n\t"
        "mov %%edx, %0\n\t"
        "mov %%eax, %1\n\t"
        "CPUID\n\t": "=r" (cycles_high), "=r" (cycles_low)::"%rax", "%rbx", "%rcx", "%rdx");
    return ((unsigned long int)cycles_high << 32) | cycles_low;
}

...and not surprisingly they didn't. On my machine I'll get numbers like Linear: 228 Log(n): 224 Which are not considered to be different assuming a lot of background noise.

Edit: I realized that I only tested the fastest solutions for the Linear approach so changing the function inputs to

bitCountLinear(0xFFFFFFFFFFFFFFFF-i);

and

bitCountLogN(0xFFFFFFFFFFFFFFFF-i);

On my machine I'll get numbers like Linear: 415 Log(n): 269 Which is clearly a win for the Log(n) method. I didn't expect to see a difference here.

alex10791
  • 434
  • 4
  • 11
0

You can count the number of bit 1 first:

int bitCount(unsigned long n)
{
    unsigned long tmp;
    tmp = n - ((n >> 1) & 0x7777777777777777)
            - ((n >> 2) & 0x3333333333333333)
            - ((n >> 3) & 0x1111111111111111);
    tmp = (tmp + (tmp >> 4)) & 0x0F0F0F0F0F0F0F0F;
    return 64 - tmp % 255;  // temp % 255 is number of bit 1
}

Take a look at the MIT HAKMEM Count.

iclosed
  • 1
  • 3