I have written a piece of code that processes data in 4 KB chunks:
unsigned char buf[4096]; // data in chunks of size 4k
unsigned counter[256];
I am adding up the input data for every 3 contiguous bytes and storing the result, e.g. temp[4096]; temp[0] = buf[0] + buf[1] + buf[2]; and so on up to 4096.
Then a histogram is generated from the results in temp using the code:
for(i = 0; i < 4096; i++)
counter[temp[i]]++;
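For reference, the two loops above can be sketched as a single function. This is my reconstruction, not the original code: I assume temp is an unsigned char array (so each 3-byte sum wraps mod 256; a wider type would index past counter[256]), and I clamp the last two sums at the end of the chunk rather than read past buf:

```c
#include <string.h>

#define CHUNK 4096
#define BINS  256

/* Sliding 3-byte sum followed by a histogram.
 * Assumptions (not from the original post): temp is unsigned char, so
 * sums up to 765 wrap mod 256 and stay inside counter[BINS]; the last
 * two positions clamp missing bytes to 0 instead of reading past buf. */
static void histogram3(const unsigned char *buf, unsigned counter[BINS])
{
    unsigned char temp[CHUNK];
    unsigned i;

    memset(counter, 0, BINS * sizeof(counter[0]));
    for (i = 0; i < CHUNK; i++) {
        unsigned b1 = (i + 1 < CHUNK) ? buf[i + 1] : 0;
        unsigned b2 = (i + 2 < CHUNK) ? buf[i + 2] : 0;
        temp[i] = (unsigned char)(buf[i] + b1 + b2);
    }
    for (i = 0; i < CHUNK; i++)
        counter[temp[i]]++;
}
```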
The histogram is then sorted (bubble sort) and the top 8 most frequently occurring values are taken. The code runs in the Linux kernel (2.6.35).
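Since only the top 8 of 256 bins are needed, a full sort may not be necessary at all. A possible alternative (my sketch, not the posted code) is 8 selection passes over a copy of the counters, O(8 * 256), with no swapping of the whole array:

```c
#include <string.h>

/* Pick the 8 most frequent bins without fully sorting:
 * 8 selection passes over a local copy of the 256 counters. */
static void top8(const unsigned counter[256], unsigned char top[8])
{
    unsigned local[256];
    int k, i;

    memcpy(local, counter, sizeof(local));
    for (k = 0; k < 8; k++) {
        int best = 0;
        for (i = 1; i < 256; i++)
            if (local[i] > local[best])
                best = i;
        top[k] = (unsigned char)best;
        local[best] = 0;   /* exclude this bin from later passes */
    }
}
```

Ties and empty bins fall back to the lowest index; adjust if a different tie-break is needed.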
The problem I am facing: if I remove the sorting part, the code executes very fast (6 microseconds on my laptop, measured using gettimeofday). After introducing the sorting, it slows down considerably (44 microseconds), even though the sorting function by itself takes only 20 microseconds. I can't understand why the total time increases by so much more than the sort's own cost. I did a memory analysis using cachegrind and the results look normal, and I even tried disabling preemption, but it doesn't make any difference. If anybody can help me out here, thanks!
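For what it's worth, gettimeofday is a userspace call; inside a 2.6-era kernel the equivalent is do_gettimeofday(struct timeval *). A minimal userspace sketch of the delta computation I assume is being used (the helper name is mine):

```c
#include <sys/time.h>

/* Wall-clock delta in microseconds between two gettimeofday samples.
 * Sketch only: at this resolution, a single run is dominated by noise,
 * so averaging over many iterations gives a steadier number. */
static long elapsed_us(struct timeval start, struct timeval end)
{
    return (end.tv_sec - start.tv_sec) * 1000000L
         + (end.tv_usec - start.tv_usec);
}
```

Usage: sample gettimeofday(&start, NULL) before the region, gettimeofday(&end, NULL) after, then print elapsed_us(start, end).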