This might be a duplicate question, so please feel free to flag it if it is.
In C++, I learnt that array dimensions are stored consecutively in memory (see "How are 3D arrays stored in C?"), so I did a little experiment: assigning natural numbers to a matrix of size 1600000000x1 and to one of size 1x1600000000 (please change matsize in the code to a smaller value depending on your memory). The code below assigns the natural numbers from 1 to 1600000000 to the matrix a (whose dimensions are 1x1600000000) and computes the sum of the cubes of all the elements. The opposite case just reverses the dimensions of the matrix, which I do by changing xdim to matsize and ydim to 1, then recompiling the code and running it again. The matrix is indexed as [xdim][ydim].
#include <iostream>
#include <time.h>
using namespace std;

int main()
{
    long int matsize, i, j, xdim, ydim;
    long double ss;
    double** a;
    double time1, time2, time3;
    clock_t starttime = clock();

    matsize=1600000000;
    xdim=1;
    ydim=matsize;
    ss=0.0;

    // allocate xdim rows; each row is a separate array of ydim doubles
    a= new double *[xdim];
    for(i=0;i<xdim;i++)
    {
        a[i]= new double[ydim];
    }
    time1= (double)( clock() - starttime ) / (double)CLOCKS_PER_SEC;
    cout << "allocated. time taken for allocation was " << time1 <<" seconds. computation started" << endl;

    // fill the matrix and accumulate the sum of cubes of its elements
    for(i=0;i<xdim;i++)
    {
        for(j=0;j<ydim;j++)
        {
            a[i][j]=(i+1)*(j+1);
            ss=ss+a[i][j]*a[i][j]*a[i][j];
        }
    }
    cout << "last number is " << a[xdim-1][ydim-1] << " . sum is " << ss << endl;
    time2= ((double)( clock() - starttime ) / (double)CLOCKS_PER_SEC) - time1;
    cout << "computation done. time taken for computation was " << time2 << " seconds" << endl;

    // free every row, then the array of row pointers
    for(i=0;i<xdim;i++)
    {
        delete [] a[i];
    }
    delete [] a;
    // note: only time2 is subtracted here, so time3 also includes time1
    time3= ((double)( clock() - starttime ) / (double)CLOCKS_PER_SEC) - time2;
    cout << "deallocated. time taken for deallocation was " << time3 << " seconds" << endl;
    cout << "the total time taken is " << (double)( clock() - starttime ) / (double)CLOCKS_PER_SEC << endl;
    cout << "or " << time1+time2+time3 << " seconds" << endl;
    return 0;
}
My results for the two cases are:
Case 1: xdim=1 and ydim=1600000000
allocated. time taken for allocation was 4.5e-05 seconds. computation started
last number is 1.6e+09 . sum is 1.6384e+36
computation done. time taken for computation was 14.7475 seconds
deallocated. time taken for deallocation was 0.875754 seconds
the total time taken is 15.6233
or 15.6233 seconds
Case 2: xdim=1600000000 and ydim=1
allocated. time taken for allocation was 56.1583 seconds. computation started
last number is 1.6e+09 . sum is 1.6384e+36
computation done. time taken for computation was 50.7347 seconds
deallocated. time taken for deallocation was 270.038 seconds
the total time taken is 320.773
or 376.931 seconds
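The two totals printed for Case 2 disagree (320.773 versus 376.931) because time3 is computed as the elapsed time minus time2 only, so it still contains the 56.1583 seconds of time1: 320.773 - 50.7347 = 270.038. A minimal sketch of timing each phase on its own follows. PhaseTimer is a hypothetical helper, not part of the program above, and like the original it measures CPU time with clock().

```cpp
#include <ctime>

// Hypothetical helper (not in the original program): reports the CPU time
// of one phase at a time, so no phase's figure includes an earlier phase.
struct PhaseTimer {
    std::clock_t mark;  // clock() value at the start of the current phase

    PhaseTimer() : mark(std::clock()) {}

    // Returns the seconds elapsed since the previous lap() call (or since
    // construction) and starts timing the next phase.
    double lap()
    {
        const std::clock_t now = std::clock();
        const double seconds = static_cast<double>(now - mark) / CLOCKS_PER_SEC;
        mark = now;
        return seconds;
    }
};
```

Used in the program above, one would construct a PhaseTimer before the allocation and assign time1, time2 and time3 from three successive lap() calls, one after each phase. The three values then sum to the total without double counting.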
The output sum is the same in both cases. I can understand that the time taken for allocation and deallocation of memory differs between the two cases, but why is the computation time so different too if the memory is allocated contiguously? What is wrong in this code?
If it matters, I am using g++ on Mountain Lion, compiling with g++ -std=c++11, on a quad-core i7 processor with 16 GB RAM.
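One way to test the contiguity assumption directly: in the code above every a[i] comes from its own new[] call, so the language does not guarantee that consecutive rows are adjacent in memory. The sketch below counts how many rows start exactly where the previous row ends, and builds row pointers into a single block for comparison. Both count_adjacent_rows and row_views are hypothetical helper names introduced here for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts the pairs of consecutive rows that sit back
// to back in memory, i.e. row i+1 starts exactly where row i ends.
std::size_t count_adjacent_rows(double* const* rows, std::size_t xdim, std::size_t ydim)
{
    std::size_t adjacent = 0;
    for (std::size_t i = 0; i + 1 < xdim; ++i) {
        if (rows[i] + ydim == rows[i + 1]) {
            ++adjacent;
        }
    }
    return adjacent;
}

// Hypothetical helper: builds row pointers into ONE block of xdim*ydim
// doubles, so rows[i][j] refers to flat[i*ydim + j] and all rows are adjacent.
// Assumes flat.size() >= xdim * ydim.
std::vector<double*> row_views(std::vector<double>& flat, std::size_t xdim, std::size_t ydim)
{
    std::vector<double*> rows(xdim);
    for (std::size_t i = 0; i < xdim; ++i) {
        rows[i] = flat.data() + i * ydim;
    }
    return rows;
}
```

Calling count_adjacent_rows(a, xdim, ydim) on the matrix from the question (with a small matsize) shows how many of the separately allocated rows happen to be adjacent; the result depends on the allocator. For the single-block layout the count is always xdim - 1.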