Note: the alleged duplicate question is, I think, mostly related to "<" and ">" comparison, but not "==" comparison and hence does not answer my question about performance of the "==" operator.
For a long time, I have believed that "processing" a sorted array should be faster than an unsorted array. At first, I thought using "==" in a sorted array should be faster than in unsorted array because - I guess - of how branch prediction works:
UNSORTEDARRAY:
5 == 100 F
43 == 100 F
100 == 100 T
250 == 100 F
6 == 100 F
(other elements to check)
SORTEDARRAY:
5 == 100 F
6 == 100 F
43 == 100 F
100 == 100 T
(no need to check other elements, so all are F)
so I guess SORTEDARRAY should be faster than UNSORTEDARRAY, but today I used the code to generate 2 arrays in a header to test, and branch prediction seemed not to work as I thought.
I generated an unsorted array and a sorted array to test:
srand(time(NULL));
int UNSORTEDARRAY[524288];
int SORTEDARRAY[sizeof(UNSORTEDARRAY)/sizeof(int)];
for(int i=0;i<sizeof(SORTEDARRAY)/sizeof(int);i++){
SORTEDARRAY[i]=UNSORTEDARRAY[i]=rand();
}
sort(SORTEDARRAY,SORTEDARRAY+sizeof(SORTEDARRAY)/sizeof(int));
string u="const int UNSORTEDARRAY[]={";
string s="const int SORTEDARRAY[]={";
for(int i=0;i<sizeof(UNSORTEDARRAY)/sizeof(int);i++){
u+=to_string(UNSORTEDARRAY[i])+",";
s+=to_string(SORTEDARRAY[i])+",";
}
u.erase(u.end()-1);
s.erase(s.end()-1);
u+="};\n";
s+="};\n";
ofstream out("number.h");
string code=u+s;
out << code;
out.close();
so to test, just count if the value is == RAND_MAX/2 like this:
#include "number.h"
int main(){
int count;
clock_t start = clock();
for(int i=0;i<sizeof(SORTEDARRAY)/sizeof(int);i++){
if(SORTEDARRAY[i]==RAND_MAX/2){
count++;
}
}
printf("%f\n",(float)(clock()-start)/CLOCKS_PER_SEC);
}
run 3 times:
UNSORTEDARRAY
0.005376
0.005239
0.005220
SORTEDARRAY
0.005334
0.005120
0.005223
it seems a small performance difference, so I didn't believe it and then tried to change "SORTEDARRAY[i]==RAND_MAX/2" to "SORTEDARRAY[i]>RAND_MAX/2" to see if it made a difference:
UNSORTEDARRAY
0.008407
0.008363
0.008606
SORTEDARRAY
0.005306
0.005227
0.005146
this time there is a great difference.
Is "==" in the sorted array not faster than an unsorted array? If yes, why is ">" in sorted array faster than an unsorted array but "==" is not?