Consider the following test programs:

Loop value on the stack
int main( void ) {
    int iterations = 1000000000;

    while ( iterations > 0 )
        -- iterations;
}
Loop value on the stack (dereferenced)
int main( void ) {
    int iterations = 1000000000;
    int * p = & iterations;

    while ( * p > 0 )
        -- * p;
}
Loop value on the heap (dereferenced)

#include <stdlib.h>

int main( void ) {
    int * p = malloc( sizeof( int ) );
    * p = 1000000000;

    while ( * p > 0 )
        -- * p;
}
Compiling them with -O0, I get the following execution times:
case1.c
real 0m2.698s
user 0m2.690s
sys 0m0.003s
case2.c
real 0m2.574s
user 0m2.567s
sys 0m0.000s
case3.c
real 0m2.566s
user 0m2.560s
sys 0m0.000s
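For reference, these numbers come from building each file without optimization and timing the resulting binary with the shell's time builtin, roughly like this (the output binary name is only illustrative):

$ gcc -O0 case1.c -o case1
$ time ./case1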
[edit] Below are the average times, in seconds, over 10 executions:
case1.c
2.70364
case2.c
2.57091
case3.c
2.57000
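An average like this can be collected with a small loop; here is a sketch (the use of GNU time's -f option and the file names are only illustrative, not my exact script):

$ for i in $(seq 10); do /usr/bin/time -f "%e" ./case1 2>> case1.times; done
$ awk '{ sum += $1 } END { print sum / NR }' case1.times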
Why is the execution time longer for the first test case, which seems to be the simplest?
My current setup is an x86 virtual machine (Archlinux). I get these results with both gcc (4.8.0) and clang (3.3).
[edit 1] The generated assembly is almost identical for all three, except that the second and third versions contain more instructions than the first.
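One way to inspect and compare the generated code is the compilers' -S flag, for example (output file names are only illustrative):

$ gcc -O0 -S case1.c -o case1.s
$ gcc -O0 -S case2.c -o case2.s
$ diff case1.s case2.s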
[edit 2] These results are reproducible on my system; every execution gives a time of the same order of magnitude.
[edit 3] I don't really care about the performance of an unoptimized program, but I don't understand why it would be slower, and I'm curious.