
I have two vectors; one of them is already filled with data that has to be accumulated into the first vector. Using the OpenMP parallel for pragma without locks sometimes gives me the right output and sometimes the wrong output, but I'm not sure why (I am new to OpenMP) or how I can fix it:

#pragma omp parallel for shared(vec1,vec2) firstprivate(params)
for(int i=0;i<params.a;i++)
{
    int offset= i*params.b; // declared inside the loop -- is this private under OpenMP?
    for(int j=0;j<params.b;j++)
    { // if I use omp_locks here it works correctly
        vec1[j]+=vec2[offset+j];
    }
}
mbed_dev
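
For context, the "locks" mentioned in the code comment can also be expressed with a per-update `#pragma omp atomic`, which synchronizes only the conflicting `+=` on `vec1[j]`. The following is a minimal sketch rather than the asker's actual code: the `Params` struct, the `double` element type, and the vector sizes are all assumptions.

#include <omp.h>
#include <vector>

struct Params { int a; int b; };   // assumed stand-in for the asker's params

void accumulate_atomic(std::vector<double>& vec1,         // length params.b (assumed)
                       const std::vector<double>& vec2,   // length params.a * params.b (assumed)
                       Params params)
{
    #pragma omp parallel for shared(vec1, vec2) firstprivate(params)
    for (int i = 0; i < params.a; i++)
    {
        int offset = i * params.b;   // declared inside the loop, so it is private to each thread
        for (int j = 0; j < params.b; j++)
        {
            // Different threads update the same vec1[j]; atomic serializes each individual update.
            #pragma omp atomic
            vec1[j] += vec2[offset + j];
        }
    }
}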
  • I have also thought about that, but the thing I don't understand is: I parallelize only the outer loop, so each thread performs the inner loop "serially". They read and write to different parts of memory, so the threads should not affect each other (I don't have one global sum variable). Or am I wrong? – mbed_dev Mar 31 '15 at 15:36
  • In your inner loop, all threads will attempt to modify `vec1[0]` right at the start. As `vec1` is shared, this will cause a data race. – wolfPack88 Mar 31 '15 at 15:39
  • Well, I didn't see that coming :) What would you recommend? Inner loop parallelization? Or making a copy of vec1 for each thread (but the structure will be very big [FFT data > 2^17])? Parallelizing the outer loop and locking in the inner loop is probably unnecessary. – mbed_dev Mar 31 '15 at 17:11
  • I would suggest a similar solution to what I mention in my suggested duplicate. Create a new array that is initially filled with `0`, and do your loop as you are doing it now except with `reduction(newArray)` as part of your `#pragma`. Then, in a separate loop, add the elements of `newArray` to `vec1`. You can parallelize the second loop as well. – wolfPack88 Mar 31 '15 at 17:45
  • Ok, I will try it again tomorrow. Many thanks for your help! – mbed_dev Mar 31 '15 at 20:45
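
Below is a minimal sketch of the reduction-based approach wolfPack88 suggests in the comments above. It assumes OpenMP 4.5 array-section reductions, a `double` element type, and a hypothetical `Params` struct; none of these details appear in the original post.

#include <omp.h>
#include <vector>

struct Params { int a; int b; };   // hypothetical stand-in for the asker's params

void accumulate_reduction(std::vector<double>& vec1,         // length params.b (assumed)
                          const std::vector<double>& vec2,   // length params.a * params.b (assumed)
                          Params params)
{
    const int n = params.b;
    std::vector<double> newArray(n, 0.0);   // per-element partial sums, initially 0
    double* partial = newArray.data();      // raw pointer so it can appear in the reduction clause

    // Each thread works on a private zero-initialized copy of partial[0:n];
    // OpenMP sums the copies element-wise when the parallel loop finishes.
    #pragma omp parallel for reduction(+ : partial[0:n])
    for (int i = 0; i < params.a; i++)
    {
        int offset = i * params.b;
        for (int j = 0; j < n; j++)
            partial[j] += vec2[offset + j];
    }

    // Separate loop: fold the combined partial sums into vec1 (can also be parallel).
    #pragma omp parallel for
    for (int j = 0; j < n; j++)
        vec1[j] += newArray[j];
}

The trade-off raised in the comments still applies: this keeps one private copy of `newArray` per thread (roughly params.b doubles each), in exchange for avoiding per-update locking or atomics in the inner loop.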

0 Answers