0

Assume to have two 3D arrays: A and B, with different number of elements each. I do some operations with values of A that correspond to some values of B with different indices.

For example : I use A[i][j][k] to calculate some quantities. Since each calculation is independent, I can do this using parallel for with no problem. But the updated value are used to increase the values of some positions of B array.

For example : A[i][j][k]->C(a number)->B[l][m][n]. But at the same time a lot of writes could occur to B[l][m][n]. I use B[l][m][n]+=c to update B elements. Of course I cannot use OpenMP here because I violate the independence of loops. And neither do I know a priori the indices l,l,m in order to group them in buffer writes. Any good ideas on how to implement it in parallel? Maybe critical or atomic would benefit me here but I don't know how

A simplified version of my problem (1D)

for(int i=0,i<size_A,i++)
{
//Some code here to update A[i]. Each update is independent.
}

for(int j=0,j<size_A,j++)
{
//From A[j] B[m],B[m+1] are evaluated
int m=A[j]/dx;
double c;//Make some calculations
B[m] += c;
B[m+1] += c*c;
} 
dimpep
  • 63
  • 1
  • 6
  • 3
    please show the code. Code is better explained via code rather than by words. And please decide for one language – 463035818_is_not_an_ai Jan 23 '18 at 12:37
  • Code is about 2000 lines each function. What I would like to do is to concurrent write in the same array element using openmp – dimpep Jan 23 '18 at 13:31
  • 2
    Please read about, learn to appreciate, and provide a [mcve]. From your text it really is difficult to understand and just a couple lines of code that demonstrate the problem would help – 463035818_is_not_an_ai Jan 23 '18 at 13:34
  • btw imho the formating after the edit makes it even harder to follow the text. `A[i][j][k]->C(a number)->B[l][m][n]` is supposed to be an example for "But the updated value are used to increase the values of some positions of B array.", right? then the formatting is confusing – 463035818_is_not_an_ai Jan 23 '18 at 13:36
  • I cannot paste thousands of lines that could explain my problem. I will try to give a simplified version – dimpep Jan 23 '18 at 13:56

1 Answers1

0

There are two ways to do this. Either you make each update atomic. This just looks like:

#pragma omp atomic update
B[m] += c;

Or each threads gets a private update-version of B and after the loop, the private copies are all safely added together. This is possible by using:

#pragma omp parallel for reduction(+:B)
for (...)

The latter requires OpenMP 4.5, but there are workarounds for earlier versions. Which one is better for you depends on the rate of updates and size of the matrix. You have to measure both to be sure.

Zulan
  • 21,896
  • 6
  • 49
  • 109
  • And what if those updates of B array is in a separate function? Can I include #pragma omp atomic update in a function where there no #pragma omp parallel for? – dimpep Jan 24 '18 at 07:02
  • Yes, That's fine – Zulan Jan 24 '18 at 08:28
  • When I have pragma omp atomic in a expression like B[m]+=c, do the other threads wait and each thread execute this expression serial? I would like to know the behavior of other threads during atomic clause – dimpep Jan 24 '18 at 11:05
  • Atomics are usually better than locks. Most architectures have some sort of hardware/ISA support for atomic memory writes. That doesn't mean it's free when there is no contention. When in doubt: measure on your system! – Zulan Jan 24 '18 at 12:46