7

I am testing OpenMP min reduction. If I write my code like the following, it will return the correct result: res = 3.

#include <omp.h>
#include <iostream>
#include <algorithm>

int main(){

    omp_set_num_threads(5);

    float res=10;

#pragma omp parallel for default(shared) reduction(min:res)
    for(int i1 = 0; i1 <= 10; i1++)
        for(int i0 = 0; i0 <= 10; i0++)
            if(res > 3.0+i1+20*i0)
                res = 3.0+i1+20*i0;

    std::cout << "res = " << res << std::endl;

    return 0;
}

But If I write in an alternative way by replacing "if" statement with "std::min", then the result is wrong: res = 10.

#include <omp.h>
#include <iostream>
#include <algorithm>

int main(){

    omp_set_num_threads(5);

    float res=10;

#pragma omp parallel for default(shared) reduction(min:res)
    for(int i1 = 0; i1 <= 10; i1++)
        for(int i0 = 0; i0 <= 10; i0++)
            res = std::min(res,static_cast<float>(3.0+i1+20*i0));

    std::cout << "res = " << res << std::endl;

    return 0;
}

Is OpenMP min reduction interfering with std::min?

llodds
  • 153
  • 1
  • 2
  • 11
  • Interesting question. I can't test it right now. `reduction(min:` should work since OpenMP 3.1. Another interesting question is if it works with `fmin`. I think I saw an example of this in the manual. `fmin` and `std::min` are not the same thing despite what some people think. – Z boson Sep 09 '16 at 11:56
  • 1
    After reading a bit more the standard, it is explicitly mentioned that `min` should be implemented like this `out=in – Gilles Sep 09 '16 at 11:59
  • 1
    The OpenMP 3.1 manual gives an example on page 266 using `fmaxf` so I think it works for `fmin` and `fmax` as well. I though `std::min` expanded to a ternary operator so I am not sure why it would fail. You would have to look at the macro. – Z boson Sep 09 '16 at 12:04
  • [here](http://stackoverflow.com/a/30915238/2542702) I explained why std::min and fmin are not the same. – Z boson Sep 09 '16 at 12:07
  • `std::min` can be implemented in many different ways http://en.cppreference.com/w/cpp/algorithm/min. What you can do is create a custom Reduction operator with OpenMP4.0 which will work with `std::min`. – Z boson Sep 09 '16 at 12:10
  • I just tested your code and I don't see a problem. What result do you get with `std::min`? What compiler are you using? What are your compiler options? – Z boson Sep 10 '16 at 15:20
  • I am using icpc(16.0.0) on OSX. No other compiler options except '-openmp'. OpenMP version is 201307(4.0). If res is initialized to 100, res = 100. But if res is initialized to be greater than 205, res = 205. fmin outputs the correct result on my mac. – llodds Sep 10 '16 at 20:50

3 Answers3

2

First of all, your code is conforming: it shouldn't matter which kind of code you have inside the parallel for.

What the reduction clause implies is that each thread will have its own private copy initialized to the neutral element of the min operator (i.e. the largest representable number in the reduction item type) and they will work with it until the end of the construct. At that point, these private copies will be reduced to the original list item using the reduction-identifier, which in your case is the min operator. So there is no race-condition here.

I have executed your code using the same version as you did and it worked fine: icpc (ICC) 16.0.0 and OpenMP version 201307. Could this issue be related to the C++ standard headers that you are using?

smateo
  • 177
  • 5
-1

I would swap the loops before drawing a conclusion. In similar context, I have found separate inner and outer loop reduction variables necessary. -std::min works well with icpc but not g++ or msvc

tim18
  • 580
  • 1
  • 4
  • 8
-1

The problem is you have a data race.

In your first example, the minimum value among all the threads is being calculated by the OpenMP runtime: Each thread gets its own res and the runtime determines the minimum value. The runtime makes sure res is read and written correctly by all threads.

In your second example, each thread is calling std::min to determine the minimum value between res and the other value. res is shared because of your default(shared) clause, so all the threads will be trying to use and update res concurrently, which is a data race.

You should stay with your first example. If you want to use std::min, you must prevent the data race using a lock or something similar.

jandres742
  • 191
  • 11
  • 1
    Thanks! But why res within the "if" statement is private while res used in std::min is shared? Here (https://computing.llnl.gov/tutorials/openMP/#REDUCTION) it indicates as long as res is a reduction variable, each thread will create its own private copy. – llodds Sep 08 '16 at 22:12
  • 1
    Oh yes, I didn't notice you were still using the `reduction` clause, redundantly, with the `std::min` inside the loop. In that case, `res` is private to each variable and their copies are never updated/written to the `res` declared before the `parallel` region, which is why `res` after the `parallel` region is 10, which is the value assigned before. – jandres742 Sep 08 '16 at 22:17
  • But why res is updated in the first case? In the second case if I initialize res to 10000, the result would be 205, which corresponds to i0=1,i1=2... – llodds Sep 08 '16 at 22:23