1

Numerics.BitOperations.PopCount in .NET is the fastest way I am aware of counting the number of bits set that compose an int. However, I haven't seen anyone using it to compute the number of bits set in a System.Collections.BitArray object (example: Counting bits set in a .Net BitArray Class).

Using the following C++ code as example:

#include <iostream>
#include <bitset>

using namespace std;

int main() {
  bitset<500000000> visited;
  long limit = 500000000;
  for (long x=0; x < limit; x++) {
    visited[x] = true;
  }
  cout << visited.count() << endl;
}

Running time: 0.36 real 0.34 user 0.02 sys

Here is my best attempt in F#:

open System
open System.Numerics

let limit = 500000000
let visited = Collections.BitArray(limit)
for x in 0..limit-1 do
    visited.[x] <- true
let ints = Array.create ((visited.Count >>> 5) + 1) 0u
visited.CopyTo(ints, 0)
ints
|> Array.map BitOperations.PopCount
|> Array.reduce Operators.(+) 
|> printfn "%d"

Running time: 1.53 real 1.47 user 0.06 sys

Is it possible to improve the .NET version to match better the C++ implementation?

UPDATE

Comparing only to the count part and changing high level functions with a for loop increases the gap from C++ being 4x faster to be 5x faster.

open System
open System.Numerics
let limit = 500000000
let visited = Collections.BitArray(limit)
visited.[9999999] <- true

let ints = Array.create ((visited.Count >>> 5) + 1) 0u
visited.CopyTo(ints, 0)
let mutable total = 0
for i in ints do
    total <- Operators.(+) total (BitOperations.PopCount(i)) 
printfn "%d" total

Running time: 0.21 real 0.15 user 0.06 sys

#include <iostream>
#include <bitset>

using namespace std;

int main() {
  bitset<500000000> visited;
  visited[9999999] = true;
  cout << visited.count() << endl;
}

Running time: 0.04 real 0.02 user 0.02 sys

UPDATE 2

Replacing BitArray with an uint32 array improves the performance from 5x to 4x slower than C++ bitset:

open System.Numerics
let limit = 500000000 >>> 5
let visited = Array.create (limit + 1) 0u
visited.[9999999] <- uint 0b00100000
let mutable total = 0
for i in visited do
    total <- Operators.(+) total (BitOperations.PopCount(i)) 
printfn "%d" total

Running time: 0.15 real 0.12 user 0.03 sys

Kali User
  • 105
  • 6
  • `System.Collections.BitArray` is **extremely old**. It's from [.Net 1.1](https://learn.microsoft.com/en-us/dotnet/api/system.collections.bitarray?view=netframework-1.1). It doesn't even implement `IEnumerable`. But what times are you comparing exactly? The performance of `visited.count()` vs `ints |> Array.map BitOperations.PopCount |> Array.reduce Operators.(+) `? Or the performance of the entire program? – dbc Apr 24 '21 at 18:02
  • Take a look at https://ericlippert.com/2013/05/14/benchmarking-mistakes-part-one/, https://ericlippert.com/2013/05/21/benchmarking-mistakes-part-two/ and https://ericlippert.com/2013/07/09/benchmarking-mistakes-part-three/ to make sure you're comparing apples to apples. – dbc Apr 24 '21 at 18:02
  • And I hate to say it, but you really should compare the time of `visited.count()` to the time taken by a simple `for` loop without any delegates or lambdas, to get an idea where the time is being taken. – dbc Apr 24 '21 at 18:04
  • Thank you @dbc, I have updated the question to include your recommendations to focus only on `visited.count()` vs `BitOperations.PopCount` but now the gap is even bigger between C++ and .NET. Do you think that could be related to `System.Collections.BitArray` being so old? I tried with an array of bools but that was even slower, so I am not sure what could be a replacement – Kali User Apr 24 '21 at 22:40
  • 1
    Does it have to be specifically a `System.Collections.BitArray` (which you need to copy the data out of, or use Reflection to cheat, both of which have significant overhead), or any kind of bit array (eg `ulong[]` with some functions to treat it as a bit array) – harold Apr 24 '21 at 22:42
  • @nicomp It doesn't have to be a `BitArray`, `ulong[]` sounds like a good option to try – Kali User Apr 24 '21 at 22:44
  • @nicomp I have updated the question with your idea. This is the fastest version so far so feel free to create an answer and I will be happy to accept it – Kali User Apr 24 '21 at 23:03

1 Answers1

2

Under the restriction of having to use a BitArray, BitOperations.PopCount seems to be the best alternative:

open System
open System.Numerics

let limit = 500000000
let bitArray = Collections.BitArray(limit)
bitArray.[9999999] <- true

let ints = Array.create ((bitArray.Count >>> 5) + 1) 0u
bitArray.CopyTo(ints, 0)
ints
|> Array.sumBy BitOperations.PopCount
|> printfn "%d"

Running time: 0.20 real 0.15 user 0.05 sys

From there, the next step is to replace the BitArray with uint[] as @nicomp indicated.

Kali User
  • 105
  • 6