I was presented example of a loop which should be slower than the one after this:
for (i = 0; i < 1000; i++)
column_sum[i] = 0.0;
for (j = 0; j < 1000; j++)
column_sum[i] += b[j][i];
Comparing to this one:
for (i = 0; i < 1000; i++)
column_sum[i] = 0.0;
for (j = 0; j < 1000; j++)
for (i = 0; i < 1000; i++)
column_sum[i] += b[j][i];
Now, I coded a tool to test number of different index numbers, but I am not seeing much of performance advantage there after I tried this concept, and I'm afraid that my code has something to do with it...
Should be slower loop that works within my code:
for (i = 0; i < val; i++){
column_sum[i] = 0.0;
for (j = 0; j < val; j++){
int index = i * (int)val + j;
column_sum[i] += p[index];
}
}
Should be "significantly" faster code:
for (i = 0; i < val; i++) {
column_sum[i] = 0.0;
}
for (j = 0; j < val; j++) {
for (i = 0; i < val; i++) {
int index = j * (int)val + i;
column_sum[i] += p[index];
}
}
Data comparison: