I ran a test that compares the three options (global, local, and local static)
over 20 million calls of a simple inner product of two 4-D vectors. It was built with VS2010 as a 32-bit release build. Here are the results (times in milliseconds, as reported by GetTickCount):
DPSUM:600000000 TIME:78| DPSUM:600000000 TIME:62| DPSUM:600000000 TIME:63|
DPSUM:600000000 TIME:47| DPSUM:600000000 TIME:46| DPSUM:600000000 TIME:78|
DPSUM:600000000 TIME:47| DPSUM:600000000 TIME:47| DPSUM:600000000 TIME:78|
DPSUM:600000000 TIME:47| DPSUM:600000000 TIME:47| DPSUM:600000000 TIME:62|
DPSUM:600000000 TIME:62| DPSUM:600000000 TIME:47| DPSUM:600000000 TIME:63|
DPSUM:600000000 TIME:46| DPSUM:600000000 TIME:63| DPSUM:600000000 TIME:62|
DPSUM:600000000 TIME:47| DPSUM:600000000 TIME:47| DPSUM:600000000 TIME:78|
DPSUM:600000000 TIME:47| DPSUM:600000000 TIME:46| DPSUM:600000000 TIME:78|
DPSUM:600000000 TIME:47| DPSUM:600000000 TIME:47| DPSUM:600000000 TIME:62|
DPSUM:600000000 TIME:63| DPSUM:600000000 TIME:47| DPSUM:600000000 TIME:62|
The first column is the static const local, the second is the plain local, and the third is the global. I'm posting the sample code in case you want to try it on your platform. It looks like static local and local are equally fast, at least for this compiler (maybe due to some internal optimization), while the global version never drops below 62 ms.
The full benchmark code is below:
    #include <stdio.h>
    #include <windows.h>

    /* Global arrays (third column of the output). */
    int ag[] = {1,2,3,4};
    int bg[] = {1,2,3,4};

    /* Static const local arrays (first column). */
    int dp1(){
        static const int a[] = {1,2,3,4};
        static const int b[] = {1,2,3,4};
        return a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + a[3]*b[3];
    }

    /* Plain local arrays, re-initialized on every call (second column). */
    int dp2(){
        int a[] = {1,2,3,4};
        int b[] = {1,2,3,4};
        return a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + a[3]*b[3];
    }

    /* Reads the global arrays. */
    int dp3(){
        return ag[0]*bg[0] + ag[1]*bg[1] + ag[2]*bg[2] + ag[3]*bg[3];
    }

    int main(){
        int numtrials = 10;
        typedef int (*DP)();
        DP dps[] = {dp1, dp2, dp3};
        for (int t = 0; t < numtrials; ++t){
            int dpsum[] = {0,0,0};
            for (int jj = 0; jj < 3; ++jj){
                DWORD bef, aft;
                bef = GetTickCount();
                for (int ii = 0; ii < 20000000; ++ii){
                    dpsum[jj] += dps[jj]();
                }
                aft = GetTickCount();
                /* DWORD is an unsigned long, so print the elapsed ms with %lu. */
                printf("DPSUM:%d TIME:%lu| ", dpsum[jj], aft - bef);
            }
            printf("\n");
        }
        getchar(); /* keep the console window open */
        return 0;
    }
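
One caveat on the timing: GetTickCount is limited to the resolution of the system timer, typically 10 to 16 ms, which is probably why the readings cluster at 46/47, 62/63 and 78 ms. Below is a sketch of a finer-grained timing helper using std::chrono::steady_clock. It assumes a C++11 compiler (so newer than VS2010), and time_calls_us and dp_global are hypothetical names made up for this sketch, not part of the program above.

```cpp
#include <chrono>

typedef int (*DP)();

// Hypothetical helper (not in the original program): calls fn `iterations`
// times, accumulates the results into sum, and returns elapsed microseconds.
long long time_calls_us(DP fn, int iterations, int& sum) {
    typedef std::chrono::steady_clock Clock;
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        sum += fn();
    }
    const Clock::time_point stop = Clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// Stand-in with the same shape as dp3 above, so the sketch compiles on its own.
int ag[] = {1, 2, 3, 4};
int bg[] = {1, 2, 3, 4};
int dp_global() {
    return ag[0]*bg[0] + ag[1]*bg[1] + ag[2]*bg[2] + ag[3]*bg[3];
}
```

Calling time_calls_us(dp_global, 20000000, sum) with an int sum initialized to 0 would mirror one inner loop of the benchmark. I have not measured whether the ranking of the three variants changes with the finer clock.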