0

I wanted to test which one is faster for my problem: boolean[] or BitSet. I need to update a seperate matrix for every pair of indices which are true. e.g. [true, true, false, true, false] results in [0][1]++, [0][3]++ and [1][3]++.

Using boolean[]:

for (int i = 0; i < (booleanArray.length - 1); i++) {
  if (booleanArray[i]) {
    for (int j = i + 1; j < booleanArray.length; j++) {
      if (booleanArray[j]) {
        matrix[i][j]++;
      }
    }
  }
}

Using BitSet:

for (int i = bitSet.nextSetBit(0); i != -1; i = bitSet.nextSetBit(i + 1)) { 
  for (int j = bitSet.nextSetBit(i + 1); j != -1; j = bitSet.nextSetBit(j + 1)) {
    matrix[i][j]++;
  }
}

Iterating through both showed that boolean[] seems to be faster (1.5 vs 3 sec), but updating the matrix needs so much more time (20 sec vs. 6 sec!). How is this possible? How can I improve the performance using boolean[]?

Olivier Grégoire
  • 33,839
  • 23
  • 96
  • 137
  • 1
    "1.5 vs 3 sec" is insignificant imho. See e.g. https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java on how to get proper test results. – luk2302 Feb 28 '18 at 09:00
  • `BitSet` will be *insignificantly* slower compared to `boolean[]` since you have some overhead calculating word index and growing the `BitSet`. I seriously doubt your measurements, it is not so easy to write a correct micro-benchmark (see [the link](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java) given in another comment). – lexicore Feb 28 '18 at 09:05
  • Thanks for the link :) I guess the problem isn't the iterating. I am just wondering why the updating of the matrix takes so much time compared while using boolean[] :( – Isabell W. Feb 28 '18 at 09:11
  • the actual matrix updating itself takes the same time in both cases since you perform the same amount of `matrix[i][j]++` operations - or your measurements are off. Your code might even perform VERY differently depending on the sparseness of the bitSet / booleanArray, see https://stackoverflow.com/questions/11227809/why-is-it-faster-to-process-a-sorted-array-than-an-unsorted-array . Measuring the time your code takes and *why* it behaves a given way simply is not as easy as it sounds or looks. – luk2302 Feb 28 '18 at 09:18
  • @IsabellW. I am absolutely not persuaded that it's updating the matrix which takes long. – lexicore Feb 28 '18 at 09:27
  • @luk2302 Yes, that is why I am so confused: it should not make a difference when doing the same amount of the same code but it does. And 20 sec to 6 sec is such a huge difference that you will notice even without any meassurings. And yes maybe it is just not that easy to know why the code behaves like this... – Isabell W. Feb 28 '18 at 09:31
  • @luk2302 "you perform the same amount of matrix[i][j]++ operations" - the difference is that with the array the `if` block is always executed (`(n^2-1)/2` times, and with `BitSet` the `matrix[i][j]++` is executed exactly as many times as there are pairs (which might be a much smaller number). Not sure if this is significant, though, just wanted to point that the code is not exactly equivalent. – lexicore Feb 28 '18 at 09:32
  • 2
    By the way, I disagree that this is the duplicate of https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java. Yes, the benchmark must be done correctly, but the question is about `boolean[]` vs. `BitSet`. Correct benchmark will not explain differences in performance. – lexicore Feb 28 '18 at 09:35
  • @lexicore It is a "huge" 17700x17700 matrix :) If i comment the line of the updating out and do some other stuff in the loops (like counting the iterations for example or do some other calculations) it rans very fast (1.5/3 sec). Normally I do the update of the matrix in both directions [i][j] and [j][i] and this needs so much more time. That is why I am guessing its the bottleneck of this part. – Isabell W. Feb 28 '18 at 09:40
  • As an aside: updating `matrix[i][j]` will probably be fast due to [cache locality](https://stackoverflow.com/questions/12065774/why-does-cache-locality-matter-for-array-performance) but conversely, updating `matrix[j][i]` will be slow. – Klitos Kyriacou Feb 28 '18 at 09:54

0 Answers0