I have a bitset of length n, say `0100010010`. How do I count all its 1's without reading all the 0's (i.e., faster than O(n))?

- What language? Or do you just want a generic algorithm? – nurdyguy May 06 '16 at 16:56
- Create a pre-calculated table for all possible variants; then you could have either avg O(1) if it is a hash map, or O(log N) for binary search. – Iłya Bursov May 06 '16 at 16:57
- I think your question has already been solved [here](http://stackoverflow.com/questions/3815165/how-to-implement-bitcount-using-only-bitwise-operators). – gr1ph00n May 06 '16 at 17:02
- @Lashane That makes no sense; you are still going to have to read the input, aren't you? – amit May 06 '16 at 17:02
- @amit Reading the input is usually not included in big-O analysis, because it is not part of the algorithm. – Iłya Bursov May 06 '16 at 17:03
- @Lashane That makes no sense, but let's assume you don't need to read it. How long does it take to find the hash value of `n` bits, where a change in any bit can change the result? – amit May 06 '16 at 17:05
- @amit Finding the hash value depends on the hash function, and there is no requirement for hash functions to generate different results when a single bit changes. The simplest hash function could return the length of the passed value as the hash, and it would still be a correct hash function. – Iłya Bursov May 06 '16 at 17:06
- True. Still, you're talking nonsense. – Karoly Horvath May 06 '16 at 17:07
- @Lashane A good hash function does. If you assume you will get a good hash function (with low collisions) by reading only a constant number of bits and using them to find the result, you are wrong. – amit May 06 '16 at 17:07
- @amit I'm not saying that it is a good function with low collisions; I'm saying that it is a correct hash function. – Iłya Bursov May 06 '16 at 17:08
- @Lashane You said it will take O(1) time to query; how will it take O(1) time if there are a lot of collisions? And if you have even a single collision, you are going to read the entire bitset to determine which one you are looking for. – amit May 06 '16 at 17:10
- Forget about "read". You have to *process* each bit. Period. That's O(n). – Karoly Horvath May 06 '16 at 17:10
- @amit I said avg O(1). – Iłya Bursov May 06 '16 at 17:10
- @Lashane Do you think you are going to have an average number of `O(1/n)` collisions when using only a constant number of bits for hashing? Please show it to me. – amit May 06 '16 at 17:13
- @Lashane Using a constant number of bits -> a constant number of possible output values for your hash function -> the number of buckets is constant -> `lim_(n->infinity) n/#buckets = infinity`. – amit May 06 '16 at 17:16
- @amit It depends on how much memory I have, how many bits I'll use for hashing, and what the maximal value of N is (it cannot be infinite in the real world), so I'd use 1024 petabytes of RAM and probably 1024 bits for the hash... – Iłya Bursov May 06 '16 at 17:17
- @KarolyHorvath I am not sure if I am being trolled :( – amit May 06 '16 at 17:18
- @Lashane That is not how asymptotic complexity works. If the size of the input is bounded by a constant, a linear scan over it is also constant time, and that follows the same logic as your suggested solution. – amit May 06 '16 at 17:21
- @amit OK, let's try another way: what is the complexity of bubble sort? – Iłya Bursov May 06 '16 at 17:46
- Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/111250/discussion-between-lashane-and-amit). – Iłya Bursov May 06 '16 at 17:47
- Just ignore him. @Lashane: please don't litter this question with your nonsense. If you still don't get it, please post a separate question. – Karoly Horvath May 06 '16 at 18:08
- @amit Anyway, what I want to say is: we already have algorithms which can count the bits set in O(1) time (via a lookup table, for example) for N (the number of bits) = const, which means we can extend these algorithms to N*2 and then to N*4 and so on, and they will still be O(1). Of course the actual time will increase (and at some point will be longer than a pure O(N) counting solution), but the time will not depend on N (and this is the definition of O(1)). Also, the memory requirements for these algorithms could be impractically large. Still, these algorithms will have time complexity O(1). – Iłya Bursov May 06 '16 at 18:10
3 Answers
No, there's no sub-linear method for counting. But there is considerable difference in the constant factors of the O(n) counting methods. A bitset should be implemented internally using the native word size of the host architecture, and, ideally, utilize a native bit-counting instruction if available.
This is an efficient, software-based population count function from Hacker's Delight 5-2, for a 32-bit number. You can extend the technique to larger integer types, and for an arbitrary-precision numeric type, you can apply this method to the components of the number and add them together.
int count(unsigned i) {
    i = i - ((i >> 1) & 0x55555555);                 /* sums of adjacent bit pairs */
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333);  /* sums within 4-bit nibbles */
    i = (i + (i >> 4)) & 0x0f0f0f0f;                 /* sums within bytes */
    i = i + (i >> 8);                                /* sums within 16-bit halves */
    i = i + (i >> 16);                               /* total accumulates in the low bits */
    return i & 0x3f;                                 /* result is at most 32, fits in 6 bits */
}
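For a bitset wider than one machine word, a minimal sketch of the "apply per component and add" idea (the `count_all` helper and its `words`/`n_words` parameters are illustrative, not from the answer) sums the per-word counts; on GCC or Clang the `__builtin_popcount` intrinsic could stand in for the software `count` above:

#include <stddef.h>

/* Sum per-word population counts across the bitset's words.
   Still O(n) overall, but processes 32 bits per iteration. */
size_t count_all(const unsigned *words, size_t n_words) {
    size_t total = 0;
    for (size_t i = 0; i < n_words; i++)
        total += count(words[i]);   /* or __builtin_popcount(words[i]) */
    return total;
}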

- 265,237
- 58
- 395
- 493
- @KarolyHorvath In this particular case the complexity is O(1), because N is constant here (32 or 64). – Iłya Bursov May 06 '16 at 17:12
- The question explicitly says the bitset is of size `n` (which is not constant), though. – amit May 06 '16 at 17:14
- @Lashane: yeah, that's O(n). O(32) is O(1). No contradiction here. – Karoly Horvath May 06 '16 at 17:14
- @amit The question is about a variable length, but this answer provides a solution for constant N; this is why this particular solution may not be useful for the OP's problem, but it has complexity O(1). – Iłya Bursov May 06 '16 at 17:15
- Yes, for an arbitrary-precision number or a bitset, the complexity is still O(n). I don't think there's a way around that. – erickson May 06 '16 at 17:20
- @erickson There isn't, but this answer, as nice as it is for a constant number of bits, does not answer the question. – amit May 06 '16 at 17:22
- I think it would be faster to handle the bitset as a `BYTE` array, create a `LUT[256]` of bitcounts, and simply sum the bitcounts for each `BYTE`, like `for (sum=0,i=0;i<n>>3;i++) sum+=LUT[data[i]];`; but your point stands, this is just a constant-factor reduction (sketched below). – Spektre May 07 '16 at 07:01
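A sketch of the lookup-table idea from the last comment (the `init_lut`/`count_bytes` names and the exact table-filling recurrence are mine, not Spektre's):

#include <stddef.h>

static unsigned char LUT[256];   /* LUT[b] = number of 1 bits in byte b */

void init_lut(void) {
    for (int b = 0; b < 256; b++)
        LUT[b] = (unsigned char)((b & 1) + LUT[b >> 1]);   /* lowest bit plus count of b/2 */
}

/* Sum the precomputed bitcounts of each whole byte of the bitset. */
size_t count_bytes(const unsigned char *data, size_t n_bits) {
    size_t sum = 0;
    for (size_t i = 0; i < (n_bits >> 3); i++)   /* n_bits/8 whole bytes */
        sum += LUT[data[i]];
    return sum;
}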
On machines that have a `popcnt` instruction, that is the most efficient way. There are a variety of tricks used for this in other cases, and the most efficient one may depend on your particular machine. See e.g. https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan and other sections near there.
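As one example from that page, Kernighan's method clears the lowest set bit on every pass, so the loop runs once per 1 bit rather than once per bit position. A sketch for a single 32-bit word (the function name is mine):

/* Kernighan's trick: v & (v - 1) clears the least significant set bit,
   so the loop iterates only as many times as there are set bits. */
unsigned count_kernighan(unsigned v) {
    unsigned c;
    for (c = 0; v; c++)
        v &= v - 1;   /* clear the lowest set bit */
    return c;
}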

Unless you have some other knowledge, or are building this bitset incrementally, sublinear time is obviously impossible. The input itself is of size O(n), and you will have to read every bit of it in order to produce the correct answer, giving an Omega(n) lower bound.
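If you do control how the bitset is built, the "incrementally" escape hatch above can be made concrete: maintain a running count alongside the bits, paying O(1) extra per update to make the count query O(1). A minimal sketch, with a hypothetical `Bitset` type that is not part of the original answer:

#include <stddef.h>

typedef struct {
    unsigned *words;   /* the bits, stored as 32-bit words */
    size_t count;      /* number of bits currently set */
} Bitset;

void bitset_set(Bitset *b, size_t i) {
    unsigned mask = 1u << (i % 32);
    if (!(b->words[i / 32] & mask)) {   /* only a 0 -> 1 change affects the count */
        b->words[i / 32] |= mask;
        b->count++;
    }
}

size_t bitset_count(const Bitset *b) {
    return b->count;   /* O(1): no scan of the bits */
}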
