I want to add multithreading to a program so that I can speed up a task by having two or more concurrent threads work through a loop.
An outline of the code (without multithreading) is:
double *a, *b, *c, *d;
int n = 10000;
int i, j;
a = (double *)calloc(n, sizeof(double));
b = (double *)calloc(n, sizeof(double));
c = (double *)calloc(n, sizeof(double));
d = (double *)calloc(n, sizeof(double));
setup(a, b); // routine to set the initial values of arrays a and b
for (i = 0; i < n; i++)
{
    for (j = 0; j < n; j++)
    {
        if (i == j) continue;
        c[i] += func(a[i], a[j]); // calculation functions
        d[j] += func2(b[i], b[j]);
    }
}
So my plan is to use the numbers in the a and b arrays to calculate values in c and d.
I want to multithread the loop at the end of the code fragment above without running into speed issues due to memory access. I want to split the loop so that each thread runs over a different part of the range of i.
I can see three possible ways of proceeding here.
[1] Single arrays for a, b, c, d, but with potential conflicts due to two threads trying to write to the same element at the same time.
[2] Single arrays for a, b, but multiple arrays for c, d, with one copy per thread. The threads would all read from the same arrays a, b but write to different arrays c, d to avoid possible collisions. These multiple arrays for c, d would then be combined after all the threads have finished.
[3] Multiple arrays for a, b, c, d: a copy of each array is made for each thread, so there are no read or write 'collisions'. Again, the multiple arrays for c, d would be combined after all the threads have finished.
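For concreteness, option [2] might be sketched roughly as follows. This is only an illustration under assumptions, not a tested design: it assumes POSIX threads are available, the bodies of func and func2 are placeholders standing in for the real calculation functions, and the names task_t, worker and compute_threaded are hypothetical. In this sketch only d gets a per-thread copy, because c[i] is written only by the thread that owns i, whereas d[j] is written for every j by every thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Placeholder stand-ins for the real calculation functions (assumptions). */
static double func(double x, double y)  { return x * y; }
static double func2(double x, double y) { return x + y; }

typedef struct {
    const double *a, *b;   /* shared inputs, only ever read */
    double *c;             /* shared output: a thread writes only its own c[i] */
    double *d_local;       /* this thread's private accumulator for d, length n */
    int n, i_begin, i_end; /* this thread handles i in [i_begin, i_end) */
} task_t;

static void *worker(void *arg)
{
    task_t *t = arg;
    for (int i = t->i_begin; i < t->i_end; i++) {
        for (int j = 0; j < t->n; j++) {
            if (i == j) continue;
            t->c[i]       += func(t->a[i], t->a[j]);
            t->d_local[j] += func2(t->b[i], t->b[j]);
        }
    }
    return NULL;
}

/* Adds the results into c and d. Returns 0 on success, -1 on failure. */
int compute_threaded(const double *a, const double *b,
                     double *c, double *d, int n, int nthreads)
{
    assert(n > 0 && nthreads > 0);
    pthread_t *tid  = malloc((size_t)nthreads * sizeof *tid);
    task_t    *task = malloc((size_t)nthreads * sizeof *task);
    /* One zero-initialised copy of d per thread, stored back to back. */
    double *d_all = calloc((size_t)nthreads * (size_t)n, sizeof *d_all);
    int started = 0, status = 0;

    if (!tid || !task || !d_all) {
        status = -1;
    } else {
        for (int t = 0; t < nthreads; t++) {
            task[t] = (task_t){
                .a = a, .b = b, .c = c,
                .d_local = d_all + (size_t)t * (size_t)n,
                .n = n,
                /* Split the range of i into nthreads contiguous chunks. */
                .i_begin = (int)((long long)n * t / nthreads),
                .i_end   = (int)((long long)n * (t + 1) / nthreads)
            };
            if (pthread_create(&tid[t], NULL, worker, &task[t]) != 0) {
                status = -1;
                break;
            }
            started++;
        }
        for (int t = 0; t < started; t++)
            pthread_join(tid[t], NULL);

        /* Combine the per-thread copies of d once all threads are done. */
        if (status == 0)
            for (int t = 0; t < nthreads; t++)
                for (int j = 0; j < n; j++)
                    d[j] += d_all[(size_t)t * (size_t)n + j];
    }
    free(tid);
    free(task);
    free(d_all);
    return status;
}
```

A call such as compute_threaded(a, b, c, d, n, 4) would replace the double loop above. Because the per-thread copies of d are summed in a different order than in the serial loop, the results may differ from the serial ones by floating-point rounding.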
I expect the answer is not [1], but I would really appreciate suggestions on whether option [2] or [3] would be better. [3] may be best, but it has the overhead of copying the input data in a, b for each thread.
Note that I have searched for similar questions and found some useful things (e.g. Memory considerations with multithreading), but I have not found a clear answer to this question.