Suppose I want to populate a std::vector in parallel while keeping the elements in order, like this:

std::vector<T> v;
#pragma omp parallel for ordered
for (int i=0;i<n;i++){
    T result = /* some expensive function here... */;
    #pragma omp ordered
    v.push_back(result);
}

As you can see, the instruction v.push_back(result) doesn't depend on i.

My question is: will v still be populated in order according to i?

justHelloWorld
  • According to (my understanding of) the OpenMP standard, yes, the order is kept even though the loop index is not used explicitly. However, for this code to be of any use, you'd better use either `schedule( dynamic )` or `schedule( static, 1 )`. Better still, remove `ordered` altogether, declare `v` with size `n`, and use `v[i] = result;` – Gilles Jul 11 '16 at 07:31
  • +1 for the trick, thanks so much. That said, I'm sorry, but your answer lacks a reference: even if your understanding of OpenMP is surely better than mine, it could be wrong :) I'm sure you understand what I mean. – justHelloWorld Jul 11 '16 at 07:42
  • I do, and that's why I said "my understanding". It is, however, based on my reading of the [current OpenMP standard](http://www.openmp.org/mp-documents/openmp-4.5.pdf), section 2.13.8. Unfortunately, that part is too convoluted to extract a single quote that gives a definitive answer by itself. However, the whole section reinforces my belief that your code is correct. – Gilles Jul 11 '16 at 07:51
  • Gilles's suggestion to pre-size the vector and then index into it is clearly a good solution. Not only does it allow real parallelism, it is also potentially more efficient, since the vector won't need to be dynamically resized. – Jim Cownie Jul 11 '16 at 13:48
  • `ordered` works as you expect it to (OpenMP 4.0, section 2.13.8): _"The threads in the team executing the loop region execute `ordered` regions sequentially in the order of the loop iterations. When the thread executing the first iteration of the loop encounters an `ordered` construct, it can enter the `ordered` region without waiting. When a thread executing any subsequent iteration encounters an `ordered` region, it waits at the beginning of that `ordered` region until execution of all the `ordered` regions belonging to all previous iterations have completed."_ – Hristo Iliev Jul 11 '16 at 13:51
  • OpenMP 4.5 overloads `ordered` with an additional function, and having both described in the same section leads to the convoluted text observed by Gilles. As for the need to change the loop scheduling to get better performance, see [here](http://stackoverflow.com/a/13230816/1374437). I would still go with a preallocated vector and `operator[]`, as suggested by both Gilles and Jim Cownie. – Hristo Iliev Jul 11 '16 at 13:55
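
To make the two approaches from the comments concrete, here is a minimal sketch that puts them side by side. It is not the asker's actual code: `expensive_fun`, `fill_ordered`, and `fill_indexed` are hypothetical names, `long` stands in for `T`, and the loop body is a placeholder computation. The first variant keeps `push_back` inside an `ordered` region with `schedule(static, 1)`; the second drops `ordered`, pre-sizes the vector, and writes through `operator[]`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the expensive per-iteration computation.
static long expensive_fun(int i) {
    long acc = 0;
    for (int k = 0; k <= i; ++k) acc += static_cast<long>(k) * k;
    return acc;
}

// Variant 1: keep push_back, but only inside an ordered region.
// The computation runs in parallel; the appends run one at a time,
// in the order of the loop iterations.
std::vector<long> fill_ordered(int n) {
    assert(n >= 0);
    std::vector<long> v;
    v.reserve(static_cast<std::size_t>(n));
    #pragma omp parallel for ordered schedule(static, 1)
    for (int i = 0; i < n; ++i) {
        long result = expensive_fun(i);
        #pragma omp ordered
        v.push_back(result);
    }
    return v;
}

// Variant 2: pre-size the vector and index into it.
// Each iteration writes to its own element, so no ordering
// or synchronization between iterations is needed.
std::vector<long> fill_indexed(int n) {
    assert(n >= 0);
    std::vector<long> v(static_cast<std::size_t>(n));
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        v[i] = expensive_fun(i);
    }
    return v;
}
```

Both functions should return the same vector for the same `n`. Compiled without OpenMP support (for example, without `-fopenmp` on GCC or Clang), the pragmas are ignored and both loops simply run sequentially, which is a handy way to check the results. Note that the second variant requires the element type to be default-constructible, which is an assumption the original question does not state about `T`.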
