Long time ago, inspired by "Numerical recipes in C", I started to use the following construct for storing matrices (2D-arrays).
double **allocate_matrix(int NumRows, int NumCol)
{
double **x;
int i;
x = (double **)malloc(NumRows * sizeof(double *));
for (i = 0; i < NumRows; ++i) x[i] = (double *)calloc(NumCol, sizeof(double));
return x;
}
double **x = allocate_matrix(1000,2000);
x[m][n] = ...;
But recently noticed that many people implement matrices as follows
double *x = (double *)malloc(NumRows * NumCols * sizeof(double));
x[NumCol * m + n] = ...;
From the locality point of view the second method seems perfect, but has awful readability... So I started to wonder, is my first method with storing auxiliary array or **double
pointers really bad or the compiler will optimize it eventually such that it will be more or less equivalent in performance to the second method? I am suspicious because I think that in the first method two jumps are made when accessing the value, x[m]
and then x[m][n]
and there is a chance that each time the CPU will load first the x
array and then x[m]
array.
p.s. do not worry about extra memory for storing **double
, for large matrices it is just a small percentage.
P.P.S. since many people did not understand my question very well, I will try to re-shape it: do I understand right that the first method is kind of locality-hell, when each time x[m][n]
is accessed first x
array will be loaded into CPU cache and then x[m]
array will be loaded thus making each access at the speed of talking to RAM. Or am I wrong and the first method is also OK from data-locality point of view?