Place the #pragma omp parallel for directive, with a reduction(+:a) clause, immediately before the for loop.
Variables declared inside the for loop are private to each thread, as is the loop counter.
Variables declared outside the #pragma omp parallel block are shared by default, unless specified otherwise (see the shared, private, and firstprivate clauses). Take care when updating shared variables, since a race condition may occur.
In this case, the reduction(+:a) clause indicates that a is a shared variable to which a value is added at each iteration. Each thread accumulates its own partial sum in a private copy of a, and the partial sums are safely added to the shared a at the end of the loop.
The two versions below are equivalent:
float a = 0.0f;
int n=1000;
#pragma omp parallel shared(a) //spawn the threads
{
float acc=0; // accumulator local to each thread
#pragma omp for // iterations will be shared among the threads
for (int i = 0; i < n; i++){
float x = algorithm(i); //do something
acc += x; //local accumulator increment
} //for
#pragma omp atomic
a+=acc; //atomic global accumulator increment: done one thread at a time
} //end parallel region, back to a single thread
cout << a;
With the reduction clause, the same computation becomes:
float a = 0.0f;
int n=1000;
#pragma omp parallel for reduction(+:a)
for (int i = 0; i < n; i++){
float x = algorithm(i); //float, as in the first version, so no value is truncated
a += x;
} //parallel for
cout << a;
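The two snippets above are fragments, so here is a minimal compilable sketch that wraps each one in a function for side-by-side comparison. The names sum_manual and sum_reduction, and the body of algorithm, are assumptions made for illustration only; algorithm returns small whole numbers so that the float sum is exact whatever order the threads add in.

```cpp
// Hypothetical stand-in for the algorithm(i) call used in the snippets above.
// It returns small whole numbers, so the float total is exact in any order.
float algorithm(int i) { return static_cast<float>(i % 10); }

// Version 1: each thread keeps its own accumulator, then merges it atomically.
float sum_manual(int n) {
    float a = 0.0f;
    #pragma omp parallel shared(a)
    {
        float acc = 0.0f;            // private to each thread
        #pragma omp for              // iterations are divided among the threads
        for (int i = 0; i < n; i++) {
            acc += algorithm(i);
        }
        #pragma omp atomic
        a += acc;                    // merged one thread at a time
    }
    return a;
}

// Version 2: the reduction clause performs the same bookkeeping.
float sum_reduction(int n) {
    float a = 0.0f;
    #pragma omp parallel for reduction(+:a)
    for (int i = 0; i < n; i++) {
        a += algorithm(i);
    }
    return a;
}
```

With GCC or Clang, OpenMP is enabled with the -fopenmp flag. Without it the pragmas are ignored and both functions run serially, returning the same totals.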
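Note that the loop's stop condition cannot be i<x where x is a variable defined or modified inside the loop: OpenMP must know the number of iterations before the loop starts in order to divide them among the threads.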
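One way to satisfy the restriction on the stop condition is to compute the bound once, before the loop, and leave it unchanged inside. The sketch below assumes a hypothetical helper, sum_prefix, which adds up the first count entries of a vector; neither the name nor the data comes from the original example.

```cpp
#include <vector>

// Hypothetical helper: sums the first `count` entries of `data`.
long sum_prefix(const std::vector<int>& data, int count) {
    // The bound is fixed before the loop starts and never changes inside it,
    // so OpenMP can work out the iteration count up front.
    const int size = static_cast<int>(data.size());
    const int stop = count < size ? count : size;

    long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < stop; i++) {
        total += data[i];
    }
    return total;
}
```

If the bound truly depends on values produced during the loop, the loop does not fit the parallel for pattern as written and would need to be restructured first.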