I have searched for and tried many approaches to measuring elapsed time, and there are many questions on the topic. For example, this question is very good, but I couldn't find a good method for cases where you need a very accurate time recorder. So I want to share my method here, so that it can be used, and corrected if something is wrong.
UPDATE & NOTE: this question is about benchmarking at a resolution finer than one nanosecond. That is completely different from using clock_gettime(CLOCK_MONOTONIC,&start);
which reports time in units of one nanosecond or more.
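For comparison, the clock_gettime approach mentioned above could be sketched roughly as follows. This is only a minimal sketch, assuming a POSIX system; the helper names `elapsed_ns` and `time_once_ns` are hypothetical and not part of any library.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* ask for clock_gettime under strict -std modes */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: difference between two timespec values, in nanoseconds. */
int64_t elapsed_ns(struct timespec start, struct timespec end)
{
    return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
         + (int64_t)(end.tv_nsec - start.tv_nsec);
}

/* Hypothetical helper: times one call of work() with CLOCK_MONOTONIC.
   The result is in whole nanoseconds, so anything shorter is not resolved. */
int64_t time_once_ns(void (*work)(void))
{
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    work();
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_ns(start, end);
}
```

Because the result is an integer number of nanoseconds, this style of timing suits longer sections; the counter-based approach in this question is aimed at shorter ones.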
UPDATE: A common way to measure speedup is to repeat the section of the program that should be benchmarked. But, as mentioned in the comments, a repeat loop might lead the compiler to optimize the section differently when the researcher relies on auto-vectorization.
NOTE: Measuring the elapsed time of a single repetition is not accurate enough. In some cases my results show that the section must be repeated more than 1K or 1M times to observe the smallest time.
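The idea of repeating the section and keeping the smallest measurement can be sketched with an ordinary loop, shown here only for illustration. This is the loop-based variant, not the macro-based one used later in this question, so the compiler may optimize it differently. It assumes x86 with GCC or Clang and uses `__rdtsc()` from `<x86intrin.h>`; `min_ticks` and `sample_work` are hypothetical names.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc(); x86 with GCC or Clang assumed */

/* Hypothetical workload, present only so the helper has something to time. */
void sample_work(void)
{
    volatile int sink = 0;
    for (int k = 0; k < 100; k++)
        sink += k;
}

/* Hypothetical helper: runs work() `reps` times and returns the smallest
   time-stamp-counter difference observed for a single run.
   Returns UINT64_MAX when reps is 0. */
uint64_t min_ticks(void (*work)(void), unsigned long reps)
{
    uint64_t best = UINT64_MAX;
    for (unsigned long r = 0; r < reps; r++) {
        uint64_t t1 = __rdtsc();
        work();
        uint64_t t2 = __rdtsc();
        if (t2 - t1 < best)
            best = t2 - t1;
    }
    return best;
}
```

A call such as `min_ticks(sample_work, 1000000)` returns a value in time-stamp-counter ticks, which on many processors is not the same unit as core clock cycles.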
SUGGESTION: I'm not familiar with shell programming (I only know some basic commands), but it might be possible to measure the smallest time without repeating inside the program.
MY CURRENT SOLUTION: To avoid branches, I repeat the code section using a macro, #define REP_CODE(X) X X X... X X
where X is the code section I want to benchmark, as follows:
//numbers
#define FMAX1 (MAX1*MAX1)
#define COEFF 8
int __attribute__(( aligned(32))) input[FMAX1+COEFF]; //= {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17};
int __attribute__(( aligned(32))) output[FMAX1];
int __attribute__(( aligned(32))) coeff[COEFF] = {1,2,3,4,5,6,7,8};//= {1,1,1,1,1,1,1,1};//; //= {1,2,1,2,1,2,1,2,2,1};
int main()
{
    REP_CODE(
        t1_rdtsc=_rdtsc();
        //Code
        for(i = 0; i < FMAX1; i++){
            for(j = 0; j < COEFF; j++){//IACA_START
                output[i] += coeff[j] * input[i+j];
            }//IACA_END
        }
        t2_rdtsc=_rdtsc();
        ttotal_rdtsc[ii++]=t2_rdtsc-t1_rdtsc;
    )
    // The smallest element in `ttotal_rdtsc` is the answer
}
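The body of REP_CODE is elided in the listing above. One possible way to build such a macro without typing X many times is to double the repetition count at each level. This is only a sketch: the helper macros `REP2` to `REP16`, the repetition count of 16, and the function `count_repetitions` are assumptions made for illustration, not the definition actually used here.

```c
/* Hypothetical helper macros: each level pastes its argument twice as often
   as the level before it. */
#define REP2(X)   X X
#define REP4(X)   REP2(X) REP2(X)
#define REP8(X)   REP4(X) REP4(X)
#define REP16(X)  REP8(X) REP8(X)

/* Assumed repetition count of 16; the count in the original macro is not shown. */
#define REP_CODE(X) REP16(X)

/* Small check that the macro really pastes its argument 16 times:
   the statement n++; is expanded 16 times with no loop or branch. */
int count_repetitions(void)
{
    int n = 0;
    REP_CODE(n++;)
    return n;
}
```

Adding more levels (REP32, REP64, and so on) raises the count, at the cost of the larger code size and longer compile time. Note that a macro argument containing a comma outside parentheses would be split into several arguments, so such code needs extra parentheses or a variadic macro.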
The macro approach does not affect the compiler's optimization, but it is limited by code size, and compile time becomes too long in some cases.
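The final step in the listing, taking the smallest element of `ttotal_rdtsc`, could look roughly like the following. It is a minimal sketch that assumes the recorded differences are stored as 64-bit unsigned values; the function name `smallest_sample` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the smallest of the n recorded differences,
   or UINT64_MAX when n is 0. */
uint64_t smallest_sample(const uint64_t *samples, size_t n)
{
    uint64_t best = UINT64_MAX;
    for (size_t k = 0; k < n; k++) {
        if (samples[k] < best)
            best = samples[k];
    }
    return best;
}
```

With the names from the listing, the call would be something like `smallest_sample(ttotal_rdtsc, ii)` after the repeated section has run.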
Any suggestions or corrections?
Thanks in advance.