I'm studying OpenMP's scheduling and specifically the different types. I understand the general behavior of each type, but clarification would be helpful regarding when to choose between `dynamic` and `guided` scheduling.
Intel's docs describe `dynamic` scheduling:
Use the internal work queue to give a chunk-sized block of loop iterations to each thread. When a thread is finished, it retrieves the next block of loop iterations from the top of the work queue. By default, the chunk size is 1. Be careful when using this scheduling type because of the extra overhead involved.
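To make the quoted description concrete, here is a minimal sketch of a loop using `schedule(dynamic, chunk)`. The helper `uneven_work` and the function `sum_dynamic` are hypothetical names invented for illustration, and the workload is an assumption chosen only so that iterations have different costs. It needs an OpenMP-enabled build (for example `-fopenmp` with GCC) to actually run in parallel; without it the pragma is ignored and the loop runs serially with the same result.

```c
#include <assert.h>

/* Hypothetical uneven workload: iteration i performs i + 1 inner steps,
   so later iterations cost more than earlier ones. */
static long uneven_work(int i) {
    long acc = 0;
    for (int k = 0; k <= i; ++k)
        acc += k % 7;
    return acc;
}

/* Sums uneven_work(i) over [0, n). With OpenMP enabled, each idle thread
   takes the next block of `chunk` iterations from the shared work queue. */
long sum_dynamic(int n, int chunk) {
    assert(chunk > 0);
    long total = 0;
    #pragma omp parallel for schedule(dynamic, chunk) reduction(+:total)
    for (int i = 0; i < n; ++i)
        total += uneven_work(i);
    return total;
}
```

Every block handed out costs one trip to the shared queue, so `chunk` trades scheduling overhead (small chunks, many trips) against load balance (large chunks, fewer trips).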
They also describe `guided` scheduling:
Similar to dynamic scheduling, but the chunk size starts off large and decreases to better handle load imbalance between iterations. The optional chunk parameter specifies the minimum size chunk to use. By default the chunk size is approximately loop_count/number_of_threads.
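The shrinking chunk size can be simulated without any threads. The sketch below assumes one plausible rule, where each grab takes `ceil(remaining / nthreads)` iterations but never fewer than the minimum chunk. That rule is an assumption for illustration: the OpenMP specification only requires the chunk to be proportional to the remaining iterations divided by the thread count, and real runtimes use different factors. The function name `guided_chunks` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed guided rule (not any specific runtime's): each grab takes
   ceil(remaining / nthreads) iterations, but never fewer than min_chunk
   and never more than what is left. Writes up to `cap` chunk sizes into
   `out` and returns the total number of chunks handed out. */
size_t guided_chunks(long n, int nthreads, long min_chunk,
                     long *out, size_t cap) {
    assert(n >= 0 && nthreads > 0 && min_chunk > 0);
    size_t count = 0;
    long remaining = n;
    while (remaining > 0) {
        long chunk = (remaining + nthreads - 1) / nthreads; /* ceiling */
        if (chunk < min_chunk) chunk = min_chunk;
        if (chunk > remaining) chunk = remaining;
        if (count < cap) out[count] = chunk;
        ++count;
        remaining -= chunk;
    }
    return count;
}
```

Under this assumed rule, 100 iterations on 4 threads with a minimum of 1 are handed out as 25, 19, 14, 11, 8, 6, 5, 3, 3, 2, 1, 1, 1, 1 (14 chunks). Raising the minimum to 25 gives four chunks of 25, at which point the schedule behaves like `dynamic` with a chunk size of 25 and can no longer shrink to absorb imbalance.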
Since `guided` scheduling dynamically decreases the chunk size at runtime, why would I ever use `dynamic` scheduling?
I've researched this question and found this table from Dartmouth:
`guided` is listed as having `high` overhead, while `dynamic` has medium overhead.
This initially made sense, but upon further investigation I read an Intel article on the topic. From the previous table, I theorized that `guided` scheduling would take longer because of the analysis and adjustment of the chunk size at runtime (even when used correctly). However, the Intel article states:
Guided schedules work best with small chunk sizes as their limit; this gives the most flexibility. It’s not clear why they get worse at bigger chunk sizes, but they can take too long when limited to large chunk sizes.
Why would the chunk size relate to `guided` taking longer than `dynamic`? It would make sense for the lack of "flexibility" to cause performance loss by locking the chunk size too high. However, I would not describe this as "overhead", and the locking problem would discredit my previous theory.
Lastly, it's stated in the article:
Dynamic schedules give the most flexibility, but take the biggest performance hit when scheduled wrong.
It makes sense for `dynamic` scheduling to be more optimal than `static`, but why is it more optimal than `guided`? Is it just the overhead I'm questioning?
This somewhat related SO post explains how NUMA relates to the scheduling types. It's irrelevant to this question, since the required organization is lost by the "first come, first served" behavior of these scheduling types.
`dynamic` scheduling may be coalescent, causing performance improvement, but then the same hypothetical should apply to `guided`.
Here's the timing of each scheduling type across different chunk sizes from the Intel article for reference. These are recordings from only one program, and some rules apply differently per program and machine (especially with scheduling), but they should show the general trends.
EDIT (core of my question):
- What affects the runtime of `guided` scheduling? Specific examples? Why is it slower than `dynamic` in some cases?
- When would I favor `guided` over `dynamic` or vice-versa?
- Once this has been explained, do the sources above support your explanation? Do they contradict at all?