5

I have array a of size N with random numbers. Using OpenMP I want to increment elements 0 to 9 of array b of size 10 for every number in A. The language is C.

#pragma omp parallel for
for(i = 0; i < N; i++)
   b[a[i]]++;

Unfortunately there are apparently simultanous writes in some elements of b and the result is not as expected. I tried it with setting b to firstprivate and lastprivate but this didn't help either.

The task seems simple but I don't know how to do it as there is no atomic for arrays in OpenMP. I could create a new array for the number of threads and then add them together in the end but that doesn't seem optimal.

Which would be the fastest way to count the occurences of the numbers in a in the elements of array b?

Michael
  • 51
  • 3
  • 2
    Sum independently and then merge the results. – Brian Cain Mar 26 '15 at 17:32
  • 1
    @BrianCain Im not sure what you mean exactly. With 'sum' do you mean the increment? With 'independantly' do you mean I should create a new private variable? With merge do you mean I should at the end add up all versions of the private variable? Because that is what seems inefficient to me. Could you show me with a simple code fragment what you mean? – Michael Mar 26 '15 at 19:13
  • The algorithm is not as simple as I'd assumed. But ultimately it's a tradeoff and whether it works likely depends on the ratio of N to `b`'s size (is it really always 10?). A simpler alternative is to use a series of mutexes. – Brian Cain Mar 26 '15 at 20:34
  • I do not understand "there is no `atomic` for arrays in OpenMP". To which OpenMP spec are you referring? And you are not trying to do atomics on an array. You are trying to atomically increment a single element of an array, and I cannot find any evidence that this would not work. – Jeff Hammond Mar 26 '15 at 23:30

3 Answers3

2

Your question is essentially a duplicate of a question I asked fill-histograms-in-parallel-with-openmp-without-using-a-critical-section.

The easy solution in your case is

#pragma omp parallel
{
    int i, b_local[10] = {0};
    #pragma omp for nowait 
    for(i = 0; i < n; i++) b_local[a[i]]++;
    #pragma omp critical
    for(i=0; i<10; i++) b[i] += b_local[i];    
}

It's possible to do this without a critical section (see my question) but it's not necessarily more efficient.

Here's a working example

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define N 100

void foo(int *b, int *a, int n) {
    #pragma omp parallel
    {
        int i, b_local[10];
        memset(b_local, 0, 10*sizeof(int));
        #pragma omp for 
        for(i = 0; i < n; i++) b_local[a[i]]++;


        #pragma omp critical
        {     
            for(i=0; i<10; i++) {
                b[i] += b_local[i]; 
            }
        }

    }
}

int main() {   
    int i;
    int b[10] = {0,1,2,3,4,5,6,7,8,9};
    int b2[10] = {0,1,2,3,4,5,6,7,8,9};
    int a[N];
    for(i=0; i<N; i++) a[i] = rand()%10;

    foo(b,a,N);
    for(i=0; i<N; i++) b2[a[i]]++;
    for(i=0; i<10; i++) printf("%d ", b[i]); puts("");
    for(i=0; i<10; i++) printf("%d ", b2[i]); puts("");
}
Z boson
  • 32,619
  • 11
  • 123
  • 226
0

If any of the values in a[] are identical then you would be writing to the same element of b simultaneously.

a[0] = 1 and a[1] = 1 then you would be writing to b[1] at the same time.

shredder
  • 11
  • 3
0

You can use 2 "for()" one for each array