I'm new to OpenMP. When I parallelize a for loop using

  #pragma omp parallel for num_threads(4)
  for(i=0;i<4;i++){
    //some parallelizable code
  }

Is it guaranteed that every thread takes one and only one value of i? In general, how is the loop work divided among the threads when num_threads does not equal, or does not evenly divide, the number of loop iterations? Is there a way to specify that each thread takes only one value of i, or to set the number of values of i each thread takes?

Zhuoran He
  • Hi, look at the schedule kw, https://msdn.microsoft.com/de-de/library/x5aw0hdf(v=vs.90).aspx – vadikrobot Sep 12 '16 at 07:25
  • As Microsoft supports only OpenMP 2.0, most other OpenMP implementations will have additional useful options. – tim18 Sep 12 '16 at 13:24
  • I find it constructive to [implement the scheduling yourself](https://stackoverflow.com/a/30591616/2542702). – Z boson Sep 13 '16 at 09:29

1 Answer

The work division in a loop construct is decided by the schedule. If no schedule clause is present, the schedule is taken from the def-sched-var internal control variable, whose value is implementation defined.

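You could use `schedule(static, 1)`. With a chunk size of 1, iterations are handed to the threads round-robin in thread-number order, so in your case (four iterations, and a team that really has four threads) each thread is guaranteed to get exactly one value of `i`.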

I highly recommend taking a look at the OpenMP specification, Table 2.5 and Section 2.7.1.1.

There may be legitimate reasons for making this kind of assumption, but in general the correctness of your loop code should not depend on it. I would treat the schedule primarily as a performance hint.

Depending on your use case, you may want to consider tasks or just parallel constructs. If you rely on such details for loops, make sure the behavior is well specified in the standard and does not just happen to work in your particular implementation.

Zulan
  • Default schedule is expected to use maximum chunk sizes. – tim18 Sep 12 '16 at 13:23
  • Thanks. I found a [link](http://cs.umw.edu/~finlayson/class/fall16/cpsc425/notes/12-scheduling.html) with code to check the schedule. The default schedule should be static and will apportion the `i` values to the threads as evenly as possible at compile time. This is reassuring. – Zhuoran He Sep 12 '16 at 17:45
  • Such experiments are great to learn how stuff works. But make sure that assumptions in your portable, future-proof code are based only on the standard. – Zulan Sep 13 '16 at 08:02