7

I just want to find some fastest set bits count function in the php.

For example, 0010101 => 3, 00011110 => 4

I saw there is good Algorithm that can be implemented in c++. How to count the number of set bits in a 32-bit integer?

Is there any php built-in function or fastest user-defined function?

Community
  • 1
  • 1
Danil Chernokalov
  • 755
  • 1
  • 10
  • 31

6 Answers6

9

You can try to apply a mask with a binary AND, and use shift to test bit one by one, using a loop that will iterate 32 times.

function getBitCount($value) {

    $count = 0;
    while($value)
    {
        $count += ($value & 1);
        $value = $value >> 1;
    }

    return $count;
}

You can also easily put your function into PHP style

function NumberOfSetBits($v)
{
    $c = $v - (($v >> 1) & 0x55555555);
    $c = (($c >> 2) & 0x33333333) + ($c & 0x33333333);
    $c = (($c >> 4) + $c) & 0x0F0F0F0F;
    $c = (($c >> 8) + $c) & 0x00FF00FF;
    $c = (($c >> 16) + $c) & 0x0000FFFF;
    return $c;
}
MatRt
  • 3,494
  • 1
  • 19
  • 14
  • Indeed, little mistake, it is a BINARY AND – MatRt May 31 '13 at 02:54
  • NumberOfSetBits() function is faster ten times that getBitCount(), but if $i = 1025, then it returns 258, can you fix this problem? My Input limit is 2000. – Danil Chernokalov May 31 '13 at 03:12
  • Indeed, it seems to be an error in this algo, check my update. – MatRt May 31 '13 at 03:20
  • Be careful with the first approach. Any number higher than 0x7FFFFFFF (negative values) might cause an infinite loop (at least on 32-bit PHP). – user966939 May 06 '15 at 21:45
  • getBitCount doesn't work with negative numbers it is infinite loop (using 32 bit integers). As you don't mention this in the answer I should vote you down. But I will not do it coz you bring the next approach. – John Boe Mar 29 '16 at 18:19
  • You cant look at little faster solution than NumberOfSetBits here https://stackoverflow.com/a/58087748/12078462 – Anton Mikhalev May 26 '22 at 18:50
3

I could figure out a few ways to but not sure which one would be the fastest :

  • use substr_count()
  • replace all none '1' characters by '' and then use strlen()
  • use preg_match_all()

PS : if you start with a integer these examples would involve using decbin() first.

Daniel P
  • 471
  • 3
  • 9
3

There are a number of other ways; but for a decimal 32 bit integer, NumberOfSetBits is definitely the fastest.

I recently stumbled over Brian Kernighan´s algorithm, which has O(log(n)) instead of most of the others having O(n). I don´t know why it´s not appearing that fast here; but it still has a measurable advantage over all other non-specialized functions.

Of course, nothing can beat NumberOfSetBits with O(1).

my benchmarks:

function getBitCount($value) { $count = 0; while($value) { $count += ($value & 1); $value = $value >> 1; } return $count; }

function getBitCount2($value) { $count = 0; while($value) { if ($value & 1)$count++; $value >>= 1; } return $count; }
// if() instead of +=; >>=1 instead of assignment: sometimes slower, sometimes faster
function getBitCount2a($value) { for($count = 0;$value;$value >>= 1) if($value & 1)$count ++; return $count; }
// for instead of while: sometimes slower, sometimes faster

function getBitCount3($value) { for($i=1,$count=0;$i;$i<<=1) if($value&$i)$count++; return $count; }
// shifting the mask: incredibly slow (always shifts all bits)
function getBitCount3a($value) { for($i=1,$count=0;$i;$i<<=1) !($value&$i) ?: $count++; return $count; }
// with ternary instead of if: even slower

function NumberOfSetBits($v) {
// longest (in source code bytes), but fastest
    $c = $v - (($v >> 1) & 0x55555555); $c = (($c >> 2) & 0x33333333) + ($c & 0x33333333);
    $c = (($c >> 4) + $c) & 0x0F0F0F0F; $c = (($c >> 8) + $c) & 0x00FF00FF;
    $c = (($c >> 16) + $c) & 0x0000FFFF;    return $c;
}

function bitsByPregReplace($n) { return strlen(preg_replace('_0_','',decbin($n))); }
function bitsByNegPregReplace($n) { return strlen(preg_replace('/[^1]/','',decbin($n))); }
function bitsByPregMatchAll($n) { return preg_match_all('/1/',decbin($n)); }

function bitsBySubstr($i) { return substr_count(decbin($i), '1'); }
function bitsBySubstrInt($i) { return substr_count(decbin($i), 1); }
// shortest (in source code bytes)

function bitsByCountChars($n){ return count_chars(decbin($n))[49]; }
// slowest by far
function bitsByCountChars1($n) { return count_chars(decbin($n),1)[49]; }
// throws a notice for $n=0

function Kernighan($n) { for(;$n;$c++)$n&=$n-1;return$c; }
// Brian Kernighan’s Algorithm

function benchmark($function)
{
    gc_collect_cycles();
    $t0=microtime();
    for($i=1e6;$i--;) $function($i);
    $t1=microtime();
    $t0=explode(' ', $t0); $t1=explode(' ', $t1);
    echo ($t1[0]-$t0[0])+($t1[1]-$t0[1]), " s\t$function\n";
}

benchmark('getBitCount');
benchmark('getBitCount2');
benchmark('getBitCount2a');
benchmark('getBitCount3');
benchmark('getBitCount3a');
benchmark('NumberOfSetBits');
benchmark('bitsBySubstr');
benchmark('bitsBySubstrInt');
benchmark('bitsByPregReplace');
benchmark('bitsByPregMatchAll');
benchmark('bitsByCountChars');
benchmark('bitsByCountChars1');
benchmark('decbin');

banchmark results (sorted)

> php count-bits.php
 2.286831 s     decbin

 1.364934 s     NumberOfSetBits
 3.241821 s     Kernighan

 3.498779 s     bitsBySubstr*
 3.582412 s     getBitCount2a
 3.614841 s     getBitCount2
 3.751102 s     getBitCount
 3.769621 s     bitsBySubstrInt*

 5.806785 s     bitsByPregMatchAll*
 5.748319 s     bitsByCountChars1*
 6.350801 s     bitsByNegPregReplace*
 6.615289 s     bitsByPregReplace*

13.863838 s     getBitCount3
16.39626 s      getBitCount3a
19.304038 s     bitsByCountChars*

Those are the numbers from one of my runs (with PHP 7.0.22); others showed different order within the 3.5 seconds group. I can say that - on my machine - four of those five are pretty equal, and bitsBySubstrInt is always a little slower due to the typecasts.

Most other ways require a decbin (which mostly takes longer than the actual counting; I marked them with a * in the benchmark results); only BitsBySubstr would get close to the winner without that gammy leg.

I find it noticeable that you can make count_chars 3 times faster by limiting it to only existing chars. Seems like array indexing needs quite some time.


edit:

  • added another preg_replace version
  • fixed preg_match_all version
  • added Kernighan´s algorithm (fastest algorithm for arbitrary size integers)
  • added garbage collection to benchmarking function
  • reran benchmarks
Titus
  • 452
  • 8
  • 19
2

My benchmarking code

start_benchmark();
for ($i = 0; $i < 1000000; $i++) {
    getBitCount($i);
}
end_benchmark();

start_benchmark();
for ($i = 0; $i < 1000000; $i++) {
    NumberOfSetBits($i);
}
end_benchmark();
start_benchmark();
for ($i = 0; $i < 1000000; $i++) {
    substr_count(decbin($i), '1');
}
end_benchmark();

Benchmarking result:

benchmark (NumberOfSetBits()) : 1.429042 milleseconds

benchmark (substr_count()) : 1.672635 milleseconds

benchmark (getBitCount()): 10.464981 milleseconds

I think NumberOfSetBits() and substr_count() are best. Thanks.

Danil Chernokalov
  • 755
  • 1
  • 10
  • 31
  • PHP has always been 'weirdly' fast with strings, if you limit your benchmark to smaller numbers you will see that `substr_count(decbin($i), '1');` will become faster then `NumberOfSetBits()`. – Daniel P May 31 '13 at 07:18
2

This option is a little faster than NumberOfSetBits($v)

function bitsCount(int $integer)
{
    $count = $integer - (($integer >> 1) & 0x55555555);
    $count = (($count >> 2) & 0x33333333) + ($count & 0x33333333);
    $count = ((((($count >> 4) + $count) & 0x0F0F0F0F) * 0x01010101) >> 24) & 0xFF;

    return $count;
}

Benckmark (PHP8)

1.78 s  bitsBySubstr
1.42 s  NumberOfSetBits
1.11 s  bitsCount
Anton Mikhalev
  • 303
  • 3
  • 8
0

Here is another solution. Maybe not the fastet but therefor the shortest solution. It also works for negative numbers:

function countBits($num)
{
   return substr_count(decbin($num), "1");
}
zomega
  • 1,538
  • 8
  • 26