Is it possible to do a reduction on an array with OpenMP?

Question

Does OpenMP natively support reduction of a variable that represents an array?

This would work something like the following...

float* a = (float*) calloc(4, sizeof(float));  // calloc takes a count and an element size
omp_set_num_threads(13);
#pragma omp parallel reduction(+:a)
for(int i=0;i<4;i++){
   a[i] += 1;  // Thread-local copy of a incremented by something interesting
}
// a now contains [13 13 13 13]

Ideally, there would be something similar for an omp parallel for, and if you have a large enough number of threads for it to make sense, the accumulation would happen via binary tree.

Maybe you could explain a bit more about what exactly you want to do. Providing serial code might help. — FFox, Sep 27 '10 at 05:49
Digging around a bit more, it sounds like "only in fortran" is the answer. I ended up just allocating a single large array of local copies outside of the loop, letting the threads accumulate to their own copies within the for loop, then accumulating into a global array after the for loop, still inside the parallel region, inside of a critical section. — Andrew Wagner, Sep 27 '10 at 19:52
Digging even more, here is a research paper on something similar, but it's not in openmp yet. http://www.springerlink.com/content/tq76655852630525/ — Andrew Wagner, Oct 01 '10 at 14:05
You can probably use atomic rather than critical to guard the individual adds (or even an array of locks) if you want to reduce the overhead; you could even use an array of shared arrays rather than private arrays and try to roll your own binary reduction. But it'll be ugly. — Jonathan Dursi, Oct 22 '10 at 12:00
I ended up manually allocating space for thread-local copies of the arrays. Each thread does 1/8 of the accumulation into its local copy, and then the threads accumulate their local copy into a global copy inside of a #pragma omp critical block. Since the number of cores (8) is much smaller than n, the synchronization overhead is negligible. It ain't pretty, but it works. — Andrew Wagner, Oct 24 '10 at 17:21
using OpenMP with C++ cannot be recommended: OpenMP does not support recent C++ standards. With C++ you may either want to use `std::thread`s etc, or [tbb](https://www.threadingbuildingblocks.org/) — Walter, May 13 '16 at 21:30
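The workaround described in the comments above (per-thread local copies that are merged into a global array inside a critical section) can be sketched roughly as follows. This is only an illustration, not the commenter's actual code: the function name `bucket_counts_manual`, the bucket-counting task, and the assumption that all input values are non-negative are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts how many values fall into each of `bins` buckets.
// Assumes every value in `data` is non-negative.
std::vector<long> bucket_counts_manual(const std::vector<int>& data, int bins)
{
    assert(bins > 0);
    std::vector<long> global(static_cast<std::size_t>(bins), 0L);
    const long n = static_cast<long>(data.size());

    #pragma omp parallel
    {
        // Declared inside the parallel region, so each thread has its own copy.
        std::vector<long> local(static_cast<std::size_t>(bins), 0L);

        // Iterations are split among the threads; each thread writes only to `local`.
        #pragma omp for nowait
        for (long i = 0; i < n; ++i) {
            local[static_cast<std::size_t>(data[i] % bins)] += 1;
        }

        // One merge per thread, serialized by the critical section.
        #pragma omp critical
        {
            for (int b = 0; b < bins; ++b) {
                global[b] += local[b];
            }
        }
    }
    return global;
}
```

Calling `bucket_counts_manual({0, 1, 2, 3, 0, 1, 0}, 4)` should give the counts 3, 2, 1, 1. As suggested in the comments, the critical section could be swapped for an `atomic` update on each individual add; whether that is faster depends on the array size and thread count.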

score 9 · Answer 1 · edited May 02 '20 at 18:58

Array reduction is now possible with OpenMP 4.5 for C and C++. Here's an example:

#include <iostream>

int main()
{

  int myArray[6] = {};

  #pragma omp parallel for reduction(+:myArray[:6])
  for (int i=0; i<50; ++i)
  {
    double a = 2.0; // Or something non-trivial justifying the parallelism...
    for (int n = 0; n<6; ++n)
    {
      myArray[n] += a;
    }
  }
  // Print the array elements to see them summed   
  for (int n = 0; n<6; ++n)
  {
    std::cout << myArray[n] << " " << std::endl;
  } 
}

Outputs:

100
100
100
100
100
100

I compiled this with GCC 6.2. You can see which common compiler versions support the OpenMP 4.5 features here: https://www.openmp.org/resources/openmp-compilers-tools/

Note from the comments above that while this syntax is convenient, it may incur significant overhead, because each thread creates its own copy of the array section.

it irks me a bit that your `int main()` doesn't return an 'int' :D — RL-S, Nov 05 '21 at 13:23

score 3 · Accepted Answer · answered Oct 24 '10 at 17:37

Only in Fortran in OpenMP 3.0, and probably only with certain compilers.

See the last example (Example 3) on:

http://wikis.sun.com/display/openmp/Fortran+Allocatable+Arrays

Andrew Wagner


This has been possible since OpenMP 4.5; see the answer of Chen Jiang below. Basically, you must specify _array sections_ (see Section 2.4, p. 44 of the OpenMP 4.5 spec). Your #pragma would look like this: `#pragma omp parallel reduction(+:a[:4])`. Be careful with this, however: each thread allocates its own copy of the array section, so with large arrays and many threads the memory requirement can explode. – Hugo Raguet Jun 02 '16 at 14:55
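To make the array-section syntax concrete for a heap-allocated `float*` like the one in the question, here is a minimal sketch assuming a compiler with OpenMP 4.5 support. The function name `accumulate_rounds` and its parameters are hypothetical. It uses `parallel for` over a fixed number of rounds rather than a bare `parallel` region, so the result does not depend on how many threads happen to run.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: allocates n zeroed floats and adds 1 to every element
// once per round. Returns nullptr if the allocation fails.
float* accumulate_rounds(int n, int rounds)
{
    assert(n > 0 && rounds >= 0);
    float* a = static_cast<float*>(std::calloc(static_cast<std::size_t>(n), sizeof(float)));
    if (a == nullptr) {
        return nullptr;
    }

    // a[:n] is an array section covering elements 0..n-1 of the pointed-to storage.
    // Each thread works on a private, zero-initialized copy of that section,
    // and the copies are summed into `a` when the loop finishes.
    #pragma omp parallel for reduction(+ : a[:n])
    for (int r = 0; r < rounds; ++r) {
        for (int i = 0; i < n; ++i) {
            a[i] += 1.0f;
        }
    }

    return a;  // the caller releases the buffer with std::free
}
```

With `n = 4` and `rounds = 13`, every element should end up as 13. Since each thread holds its own `n`-element copy, the extra memory grows with both the section length and the thread count.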

score 2 · Answer 3 · answered May 13 '16 at 21:01

The latest OpenMP 4.5 spec supports reduction of C/C++ arrays. http://openmp.org/wp/2015/11/openmp-45-specs-released/

The latest GCC release, 6.1, also supports this feature. http://openmp.org/wp/2016/05/gcc-61-released-supports-openmp-45/

I haven't tried it yet, though. I hope others can test this feature.

score 1 · Answer 4 · edited May 23 '17 at 12:09

As of OpenMP 3.0, reductions in C and C++ cannot be performed on array or structure type variables (see restrictions).

You might also want to read up on the private and shared clauses. private declares a variable to be private to each thread, whereas shared declares a variable to be shared among all threads. I also found the answer to this question very useful with regard to OpenMP and arrays.
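As a small illustration of the difference between the two clauses, here is a sketch in which one variable is shared and another is private. The function name `sum_of_squares` and its signature are made up for the example, and the shared update is guarded with `atomic` purely to show that shared data needs synchronization (a `reduction(+:total)` clause would be the more usual choice for a scalar sum).

```cpp
#include <cassert>

// Hypothetical helper: sums the squares of n integers.
long sum_of_squares(const int* values, long n)
{
    assert(values != nullptr || n == 0);
    long total = 0;   // shared: a single instance visible to all threads
    long square = 0;  // private: each thread gets its own uninitialized copy

    #pragma omp parallel for shared(total) private(square)
    for (long i = 0; i < n; ++i) {
        // `square` is assigned before it is read, so the uninitialized
        // private copy is never used.
        square = static_cast<long>(values[i]) * values[i];

        // Concurrent updates to the shared variable must be synchronized.
        #pragma omp atomic
        total += square;
    }
    return total;
}
```

For the input `{1, 2, 3, 4}` this should return 30.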

answered Oct 22 '10 at 01:38

Garrett Hyde


score 0 · Answer 5 · answered Jun 14 '17 at 07:20

OpenMP can perform this operation as of OpenMP 4.5, and GCC 6.3 (and possibly earlier versions) supports it. An example program looks as follows:

#include <vector>
#include <iostream>

int main(){
  std::vector<int> vec;

  #pragma omp declare reduction (merge : std::vector<int> : omp_out.insert(omp_out.end(), omp_in.begin(), omp_in.end()))

  #pragma omp parallel for default(none) schedule(static) reduction(merge: vec)
  for(int i=0;i<100;i++)
    vec.push_back(i);

  for(const auto x: vec)
    std::cout<<x<<"\n";

  return 0;
}

Note that omp_out and omp_in are special variables and that the type of the declare reduction must match the vector you are planning to reduce on.
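The same `declare reduction` mechanism can also express an element-wise sum over a `std::vector`, which is closer to the array reduction asked about. The sketch below is one way to do it, under the assumption that all rows have the same length; the reduction name `vec_plus` and the function `column_sums` are hypothetical. Besides `omp_out` and `omp_in`, it uses the `initializer` clause with the special variables `omp_priv` and `omp_orig`, so that each private copy starts as a zero-filled vector of the right size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// User-defined reduction (hypothetical name vec_plus): element-wise sum.
// Combiner: adds omp_in into omp_out. Initializer: each private copy starts
// as a zero-filled vector with the same size as the original variable.
#pragma omp declare reduction(vec_plus : std::vector<double> : \
        std::transform(omp_out.begin(), omp_out.end(), omp_in.begin(), \
                       omp_out.begin(), std::plus<double>())) \
    initializer(omp_priv = decltype(omp_orig)(omp_orig.size()))

// Hypothetical helper: sums each column of a row-major table.
std::vector<double> column_sums(const std::vector<std::vector<double>>& rows,
                                std::size_t cols)
{
    std::vector<double> sums(cols, 0.0);
    const long n = static_cast<long>(rows.size());

    #pragma omp parallel for reduction(vec_plus : sums)
    for (long r = 0; r < n; ++r) {
        assert(rows[r].size() == cols);
        for (std::size_t c = 0; c < cols; ++c) {
            sums[c] += rows[r][c];
        }
    }
    return sums;
}
```

For the rows `{1, 2}`, `{3, 4}`, `{5, 6}` and `cols = 2`, the result should be 9 and 12. Without the `initializer` clause the private copies would be default-constructed (empty) vectors, which is fine for the `merge` example above but would make the indexed writes here invalid.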
