I used _rdtsc() to time atoi() and atof() and noticed they were taking pretty long. I therefore wrote my own versions of these functions, which were much quicker from the first call.
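A minimal sketch of what such a hand-written replacement might look like. The name my_atoi is hypothetical, and the sketch assumes ASCII input with an optional leading '-', no leading whitespace and no overflow handling, so it is not necessarily the version that was timed here.

```cpp
#include <cstdlib>  // std::atoi, only needed when comparing results

// Hypothetical hand-written replacement for atoi().
// Assumptions: ASCII digits, optional leading '-', no leading whitespace,
// no overflow detection. Parsing stops at the first non-digit character.
inline int my_atoi(const char* s) {
    bool negative = false;
    if (*s == '-') {
        negative = true;
        ++s;
    }
    int value = 0;
    while (*s >= '0' && *s <= '9') {
        value = value * 10 + (*s - '0');
        ++s;
    }
    return negative ? -value : value;
}
```

For well-formed input such as "45632" it should return the same value as atoi(), which makes the two easy to compare in the same timing harness.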
I am using Windows 7 and the VS2012 IDE, but with the Intel C/C++ compiler v13. I have /O3 enabled and also /Ot ("favour fast code"). My CPU is an Ivy Bridge (mobile).
Upon further investigation, it seemed that the more times atoi() and atof() were called, the quicker they executed. I am talking orders of magnitude faster:
When I call atoi() from outside my loop, just once, it takes 5,892 CPU cycles, but after thousands of iterations this drops to 300 - 600 CPU cycles (quite a large execution time range).
atof() initially takes 20,000 to 30,000 CPU cycles, and then after a few thousand iterations it takes 18 - 28 CPU cycles (which is how long my custom function takes the first time it is called).
Could someone please explain this effect?
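To make the first-call versus later-call comparison reproducible, the measurement can be sketched as a small harness that records one sample per call. The names ticks() and time_atoi_calls() are hypothetical, and the fallback to a nanosecond clock on non-x86 targets is an assumption added for portability, so treat this as a sketch rather than the exact setup used above.

```cpp
#include <chrono>
#include <cstdint>
#include <cstdlib>
#include <vector>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>     // __rdtsc() with MSVC-compatible compilers
#define HAVE_RDTSC 1
#elif defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>  // __rdtsc() with GCC/Clang
#define HAVE_RDTSC 1
#endif

// Returns time-stamp counter ticks on x86; elsewhere falls back to
// nanoseconds from a steady clock (assumption for portability).
inline std::uint64_t ticks() {
#ifdef HAVE_RDTSC
    return __rdtsc();
#else
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<nanoseconds>(
            steady_clock::now().time_since_epoch()).count());
#endif
}

// Calls atoi() `calls` times on the same text and records the elapsed
// ticks of every individual call, so the first call can be compared
// with the later ones.
std::vector<std::uint64_t> time_atoi_calls(const char* text, int calls) {
    std::vector<std::uint64_t> samples;
    samples.reserve(static_cast<std::size_t>(calls));
    volatile int sink = 0;  // keeps the compiler from discarding the call
    for (int i = 0; i < calls; ++i) {
        const std::uint64_t start = ticks();
        sink = std::atoi(text);
        const std::uint64_t finish = ticks();
        samples.push_back(finish - start);
    }
    (void)sink;
    return samples;
}
```

Comparing samples.front() with the smallest of the later samples separates the one-off cost of the first call from the steady-state cost, without any printing between the two timestamps.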
EDIT: I forgot to say that the basic setup of my program is a loop parsing bytes from a file. Inside the loop I call atof and atoi, which is how I noticed the above. However, I also noticed that when I did my investigation before the loop, just calling atoi and atof twice each, along with my user-written equivalent functions twice each, the loop seemed to execute faster. The loop processes 150,000 lines of data, each line requiring three atof() or atoi() calls. Once again, I cannot understand why calling these functions before my main loop affects the speed of a program that calls them 500,000 times.
#include <ia32intrin.h> // _rdtsc() with the Intel compiler
#include <cstdlib>      // atoi(), atof()
#include <iostream>     // cout, endl
using namespace std;

int main(){
    //call myatoi() and time it
    //call atoi() and time it
    //call myatoi() and time it
    //call atoi() and time it
    const char* bytes2 = "45632";
    _int64 start2 = _rdtsc();
    unsigned int a2 = atoi(bytes2);
    _int64 finish2 = _rdtsc();
    cout << (finish2 - start2) << " CPU cycles for atoi()" << endl;

    //call myatof() and time it
    //call atof() and time it
    //call myatof() and time it
    //call atof() and time it

    //Iterate through 150,000 lines, each line about 25 characters.
    //The below executes slower if the above debugging is NOT done.
    //(Declarations of bytes, i, file_size, message_type, price and the
    //offsets are omitted here.)
    while(i < file_size){
        //Loop through my data, call atoi() or atof() 1 or 2 times per line
        switch(bytes[i]){
            case ' ':
                //I have an array of shorts which records the distance from the beginning
                //of the line to each of the tokens in the line. In the below switch
                //statement offset_to_price and offset_to_qty refer to this array.
            case '\n':
                switch(message_type){
                    case 'A': {
                        //Braces give each case its own scope for temp, start and finish
                        char* temp = bytes + offset_to_price;
                        _int64 start = _rdtsc();
                        price = atof(temp);
                        _int64 finish = _rdtsc();
                        cout << (finish - start) << " CPU cycles" << endl;
                        //Other processing with the tokens
                        break;
                    }
                    case 'R': {
                        //Get the 4th line token using atoi() as above
                        char* temp = bytes + offset_to_qty;
                        _int64 start = _rdtsc();
                        price = atoi(temp);
                        _int64 finish = _rdtsc();
                        cout << (finish - start) << " CPU cycles" << endl;
                        //Other processing with the tokens
                        break;
                    }
                }
                break;
        }
    }
}
The lines in the file are like this (with no blank lines in between):
34605792 R dacb 100
34605794 A racb S 44.17 100
34605797 R kacb 100
34605799 A sacb S 44.18 100
34605800 R nacb 100
34605800 A tacb B 44.16 100
34605801 R gacb 100
I am using atoi() on the 4th element in the 'R' messages and the 6th element in the 'A' messages, and atof() on the 5th element in the 'A' messages.
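The token-offset approach described in the code comments can be sketched as follows for one 'A' line. The names token_offsets(), AddMessage and parse_add() are hypothetical, and the sketch assumes single-space-separated tokens with the price and quantity as the last two tokens of an 'A' line, as in the sample data.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Records the offset of the first character of every space-separated
// token in one line. The line ends at '\n' or at the terminating '\0'.
std::vector<short> token_offsets(const char* line) {
    std::vector<short> offsets;
    bool in_token = false;
    for (short i = 0; line[i] != '\0' && line[i] != '\n'; ++i) {
        if (line[i] == ' ') {
            in_token = false;
        } else if (!in_token) {
            offsets.push_back(i);
            in_token = true;
        }
    }
    return offsets;
}

// Hypothetical holder for the two numeric fields of an 'A' message.
struct AddMessage {
    double price;
    int qty;
};

// Parses price and quantity from an 'A' line such as
// "34605794 A racb S 44.17 100". Token indices are zero-based here:
// index 4 is the price and index 5 is the quantity.
AddMessage parse_add(const char* line) {
    AddMessage m = {0.0, 0};
    const std::vector<short> offsets = token_offsets(line);
    if (offsets.size() < 6) {
        return m;  // not a complete 'A' line; leave both fields at zero
    }
    m.price = std::atof(line + offsets[4]);
    m.qty = std::atoi(line + offsets[5]);
    return m;
}
```

Both atof() and atoi() stop at the first character that cannot be part of the number, so the tokens do not need to be copied or null-terminated before parsing.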