- The `taskloop` construct is a task-generating construct, not a worksharing-loop construct. The OpenMP specification says:
> The taskloop construct is a task generating construct. When a thread encounters a taskloop construct, the construct partitions the iterations of the associated loops into explicit tasks for parallel execution.
In your case, 6 threads encounter the `taskloop` construct, so the whole loop is executed 6 times, which is why your result is larger than expected.
To correct it, you have to either i) make sure that only a single thread encounters the task-generating construct, or ii) use a worksharing-loop construct.
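The effect can be made visible by counting iterations. The following is only a minimal sketch, not the code from the question: the `Counts` struct and both function names are made up for illustration, and the counters are updated with `atomic` just so that the counting itself is race-free.

```cpp
// Sketch: count how many loop iterations actually run.
// Counts, taskloop_on_every_thread and taskloop_in_single are hypothetical names.
struct Counts {
    int threads;     // threads that entered the parallel region
    int iterations;  // loop iterations executed in total
};

// Every thread of the team encounters the taskloop, so each one
// generates tasks covering the whole iteration space.
Counts taskloop_on_every_thread(int size, int num_threads) {
    int threads = 0;
    int iterations = 0;
    #pragma omp parallel num_threads(num_threads)
    {
        #pragma omp atomic
        threads++;

        #pragma omp taskloop
        for (int i = 0; i < size; i++) {
            #pragma omp atomic
            iterations++;
        }
    }
    return {threads, iterations};
}

// Only one thread encounters the taskloop; the generated tasks can
// still be executed by any thread of the team.
Counts taskloop_in_single(int size, int num_threads) {
    int threads = 0;
    int iterations = 0;
    #pragma omp parallel num_threads(num_threads)
    {
        #pragma omp atomic
        threads++;

        #pragma omp single
        {
            #pragma omp taskloop
            for (int i = 0; i < size; i++) {
                #pragma omp atomic
                iterations++;
            }
        }
    }
    return {threads, iterations};
}
```

With OpenMP enabled (for example `-fopenmp` with GCC or Clang), the first function should report `threads * size` iterations, while the second should report exactly `size`.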
There is a race condition at the line `sum += n[i];`, as different threads update `sum` simultaneously. The best way to correct it is to use a reduction (`reduction(+:sum)`).
In the second case the data-sharing attribute of `sum` is private, so `sum` is privatized: a local copy is created, and only that local copy is modified. The value of `sum` in the enclosing context does not change, which is why you got `0`.
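A small sketch of the privatization behaviour, under assumptions: it puts `private(sum)` on a `parallel` construct rather than reproducing the exact directive from the question, and `sum_with_private` is a hypothetical name.

```cpp
#include <vector>

// Sketch: returns the value of the outer `sum` after the parallel region.
// sum_with_private is a hypothetical name, not code from the question.
long sum_with_private(const std::vector<int>& n) {
    const int size = static_cast<int>(n.size());
    long sum = 0;  // the variable in the enclosing context

    #pragma omp parallel private(sum)
    {
        sum = 0;  // a private copy starts uninitialized, so set it explicitly
        #pragma omp for
        for (int i = 0; i < size; i++) {
            sum += n[i];  // modifies only this thread's private copy
        }
    }  // the private copies are discarded here

    return sum;  // the outer variable was never written inside the region
}
```

When compiled with OpenMP enabled, the function should return 0 regardless of the contents of `n`, because every update goes to a private copy. If the pragmas are ignored (OpenMP disabled), the loop simply runs serially on the outer variable.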
To sum up, here are the 2 alternatives mentioned above to correct your code:
i) using the `single` construct and `taskloop` (the `reduction` clause on `taskloop` requires OpenMP 5.0):

    #pragma omp parallel num_threads(NUMTHREAD)
    #pragma omp single nowait
    {
        #pragma omp taskloop reduction(+:sum)
        for (int i = 0; i < SIZE; i++) {
            sum += n[i];
        }
    }

ii) using a worksharing-loop construct:

    #pragma omp parallel num_threads(NUMTHREAD)
    {
        #pragma omp for reduction(+:sum)
        for (int i = 0; i < SIZE; i++) {
            sum += n[i];
        }
    }

Some minor, off-topic comments:
- The array (allocated with the `new` operator) is never freed; you should use `delete[]` to deallocate it. As pointed out by @VictorEijkhout, it is even better not to use `new` at all: use `std::vector` or `std::array` from the standard library.
- You should prefer `const`/`constexpr int SIZE = 100;` over `#define SIZE 100`. More details can be found e.g. here.
- You should consider reading Why is "using namespace std;" considered bad practice?
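
Putting these comments together with alternative ii), the summation could look roughly like the sketch below. The function name, the array contents (1 to `SIZE`) and the value of `NUMTHREAD` (6, inferred from the thread count mentioned above) are assumptions for illustration, not taken from the question.

```cpp
#include <numeric>
#include <vector>

constexpr int SIZE = 100;      // replaces #define SIZE 100
constexpr int NUMTHREAD = 6;   // assumed value, inferred from the 6 threads above

// Hypothetical helper: sums the integers 1..SIZE with a worksharing loop.
// std::vector manages its own memory, so no new/delete is needed.
int sum_of_first_size_integers() {
    std::vector<int> n(SIZE);
    std::iota(n.begin(), n.end(), 1);  // assumed contents: 1, 2, ..., SIZE

    int sum = 0;
    #pragma omp parallel for num_threads(NUMTHREAD) reduction(+:sum)
    for (int i = 0; i < SIZE; i++) {
        sum += n[i];  // each thread adds into its own partial sum
    }
    return sum;       // partial sums are combined at the end of the loop
}
```

Names are written with the `std::` prefix instead of relying on `using namespace std;`. For `SIZE = 100` the function returns 5050.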