I have n threads, each holding its own local copy of a matrix, say 'local'. I want to update a global shared matrix 's' so that each of its elements is the sum of the corresponding elements of all the local matrices. For example, s[0][0] = local_1[0][0] + local_2[0][0] + ... + local_n[0][0].
I wrote the following loop to achieve this:
#pragma omp parallel for
for (int i = 0; i < rows; i++)
{
    for (int j = 0; j < cols; j++)
        s[i][j] = s[i][j] + local[i][j];
}
This doesn't seem to work. Could someone kindly point out where I am going wrong?
Update with an example:
Suppose there are 3 threads with the following local matrices:

thread 1 local =
1 2
3 4

thread 2 local =
5 6
7 8

thread 3 local =
1 0
0 1

The shared matrix would then be

s =
7 8
10 13
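
For reference, here is a minimal sketch of the result expected for the example above. It is not the original code. It assumes the local matrices are 2x2, that each thread fills its own 'local' inside the parallel region, and that the update of 's' is guarded by a critical section, which is only one of several possible ways to do this. The names sum_locals, inputs, ROWS, COLS and NTHREADS are hypothetical and were introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 2
#define COLS 2
#define NTHREADS 3

/* Hypothetical per-thread data, taken from the 3-thread example above. */
static const int inputs[NTHREADS][ROWS][COLS] = {
    {{1, 2}, {3, 4}},
    {{5, 6}, {7, 8}},
    {{1, 0}, {0, 1}}
};

/* Adds every thread's local matrix into the shared matrix s. */
void sum_locals(int s[ROWS][COLS])
{
    assert(s != NULL);

    /* The iterations are divided among the threads (typically one each).
       Without OpenMP the pragmas are ignored and the loop runs serially,
       producing the same result. */
    #pragma omp parallel for num_threads(NTHREADS)
    for (int t = 0; t < NTHREADS; t++) {
        /* Declared inside the loop body, so each thread has its own copy. */
        int local[ROWS][COLS];
        for (int i = 0; i < ROWS; i++)
            for (int j = 0; j < COLS; j++)
                local[i][j] = inputs[t][i][j];

        /* Only one thread at a time may update the shared matrix. */
        #pragma omp critical
        {
            for (int i = 0; i < ROWS; i++)
                for (int j = 0; j < COLS; j++)
                    s[i][j] += local[i][j];
        }
    }
}
```

Calling sum_locals on a zero-initialized 2x2 array should leave it holding 7 8 in the first row and 10 13 in the second. With GCC or Clang, the pragmas take effect when compiling with -fopenmp.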