
I am in need of an explanation for something I have discovered by experience. I have a very large flat array of type char, 500x500x500 = 125E+6 bytes in total. Each cell holds a number between 0 and 255, but luckily, when traversing the array, I am only interested in cells having a non-zero value!

Now here goes the question. I found out by experimenting that doing even the smallest operation on every cell, zero and non-zero alike, takes a huge amount of time, whereas if I use a condition similar to the one below,

while( index < 125000000 )
{
    if( array[ index ] > 0 )
    {
        // Do some stuff
    }

    index++;
}

the execution time is drastically shorter. In fact, I can go through the whole array and perform my operations on the non-zero cells in a few seconds, rather than the half-hour execution time of the approach with no condition.

What I need is an explanation of why this works! I need to explain this phenomenon in my thesis report, and it would be best if I could relate it to a scientific paper or similar.

Thank you in advance!

Best Regards, Omid Ariyan

  • If you only have a few non-zero values in the array, then there are very few operations the computer needs to do with your check. If you don't do the check then the computer has to do your (possibly expensive) operations on all values. – Some programmer dude Aug 28 '13 at 11:04
  • What is there to explain? If there is a relatively small number of nonzero elements then the code with the condition does only a small fraction of the work. – Jon Aug 28 '13 at 11:04
  • What exactly does the code that takes long look like? – jrok Aug 28 '13 at 11:04
  • What fraction of elements have values `<=0`? – juanchopanza Aug 28 '13 at 11:04
  • Can you provide a little bit more code? E.g. what data type is index, what is in the array, what is "do some stuff"? How many elements are non-zero? – ogni42 Aug 28 '13 at 11:05
  • 1
    i believe you need to read this - http://stackoverflow.com/questions/11227809/why-is-processing-a-sorted-array-faster-than-an-unsorted-array – No Idea For Name Aug 28 '13 at 11:05
  • 2
    @NoIdeaForName Is that relevant here? The issue here seems much simpler - doing a fraction of the work takes a fraction of the time. – BoBTFish Aug 28 '13 at 11:07
  • Could it be that you have a signed char type, so more numbers than you expect are less than or equal to `0`? – juanchopanza Aug 28 '13 at 11:07
  • Without knowing what "Do some stuff" actually does, it's hard to say if this is realistic or not. Also, is `array` a normal array, or a vector? – Mats Petersson Aug 28 '13 at 11:09

1 Answer


It could be that you expect your char to be unsigned, and hence capable of holding values in the range [0, 255], but it is in fact signed, holding values in the range [-128, 127] (assuming two's complement). In that case the number of cells for which array[ index ] > 0 is much smaller than you expect, because every element assigned a value larger than 127 ends up negative.

Note that you claim to check for non-zero values, but you are really checking for positive ones.

You can check the range of char on your platform:

#include <limits>
#include <iostream>

int main()
{
  std::cout << static_cast<int>(std::numeric_limits<char>::min()) << std::endl;
  std::cout << static_cast<int>(std::numeric_limits<char>::max()) << std::endl;

  char c = 234;
  std::cout << static_cast<int>(c) << std::endl; // 234 if unsigned, -22 if signed
}
juanchopanza
  • Forgive me, I meant my array is of type "unsigned char". I've been using unsigned char exclusively for so long that in my mind "char" has lost its glory. – Omid Ariyan Aug 28 '13 at 11:36