I need to study the OpenMP sources in `gcc`. I have read the OpenMP documentation (3.0 and 4.0). As far as I know, OpenMP uses a work-sharing mechanism. As I understand it, the work-sharing mechanism passes tasks between threads while the threads are running. Or is the data distributed between the threads before they start executing?
Does the work-sharing mechanism (in OpenMP) transfer tasks between threads WHILE the threads are executing?

Learning to [implement the work-sharing yourself can teach you a lot](http://stackoverflow.com/a/30591616/2542702). – Z boson Apr 13 '16 at 06:18
2 Answers
If you are using OpenMP with tasks, the tasks are stored in one or more task queues. If a thread finds itself idle, it will snoop (steal) tasks from a neighboring queue. This is internal to `libgomp`.
If you use an OpenMP `parallel for` with a static schedule, no task snooping takes place.
If you use an OpenMP `parallel for` with a dynamic schedule, the threads in the team divide the work dynamically, so idle threads take work from the rest of the team.
In general, when threads need to communicate at run-time, cycles are spent away from processing.
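The difference between the two schedules can be made visible with a small sketch. It is only an illustration: the function names `run_static` and `run_dynamic`, the helper `current_thread`, and the chunk size of 4 are assumptions chosen for this example, not anything taken from `libgomp`. Each loop records which thread executed each iteration.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: id of the calling thread, or 0 when the file is
   compiled without OpenMP support (the pragmas are then ignored). */
static int current_thread(void) {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

/* Static schedule: the split of iterations among threads follows from the
   schedule and the team size, not from how busy each thread is. */
void run_static(int *owner, int n) {
    assert(owner != NULL && n >= 0);
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++) {
        owner[i] = current_thread();
    }
}

/* Dynamic schedule: threads request chunks of 4 iterations at run time,
   so a thread that finishes early picks up more of the remaining work. */
void run_dynamic(int *owner, int n) {
    assert(owner != NULL && n >= 0);
    #pragma omp parallel for schedule(dynamic, 4)
    for (int i = 0; i < n; i++) {
        owner[i] = current_thread();
    }
}
```

Compiled with `gcc -fopenmp`, printing `owner` after `run_static` should show the same pattern on every run with the same number of threads, while the pattern after `run_dynamic` may change from run to run.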

Complementing @klaas-van-gend's answer: in order for a `libgomp` thread to start stealing tasks, it needs to be idle AND not be in any active `taskwait` construct (explicit or implicit).
For example, think of a binary tree representing a task graph. If the thread that created the root node is not fast enough to start running one of its two children, it will be idle until the execution of its child tasks is finished.
This behavior is observed in GCC 9.1.
If we run this code with `libgomp`, we can observe the behavior thanks to the generated graphviz graph. The colors and the numbers inside parentheses represent a core/thread. The number outside the parentheses indicates the computational weight of the task, and the number on each edge is the time when the task started to run.
As we can see, core 1 (blue) stayed idle until the end of its `taskwait` construct. Core 0 (white) only stole task 6 after the end of the `taskwait` created by task 4. The same goes for core 3 (green) and task 12.
However, if we run this code with Clang/LLVM and its `libomp` implementation, we get a fully work-stealing algorithm. No core is idle at any time. This behavior is observed with Clang 8 :)
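For readers who want to experiment, a binary-tree task graph of the kind described in this answer can be sketched as follows. This is not the code the graphs above were generated from; `count_nodes` and `count_tree` are hypothetical names, and the tasks do no weighted work, they only count tree nodes.

```c
#include <assert.h>

/* Counts the nodes of a complete binary tree of the given depth,
   spawning one OpenMP task per child subtree. */
static long count_nodes(int depth) {
    assert(depth >= 0);
    if (depth == 0) {
        return 1; /* a leaf counts itself */
    }
    long left = 0, right = 0;
    #pragma omp task shared(left)
    left = count_nodes(depth - 1);
    #pragma omp task shared(right)
    right = count_nodes(depth - 1);
    /* The parent task waits here for both children. Whether its thread
       may run unrelated tasks in the meantime is up to the runtime. */
    #pragma omp taskwait
    return 1 + left + right;
}

/* Opens a parallel region and lets a single thread create the root task;
   the other threads in the team are available to execute child tasks. */
long count_tree(int depth) {
    long total = 0;
    #pragma omp parallel
    #pragma omp single
    total = count_nodes(depth);
    return total;
}
```

A tree of depth `d` has `2^(d+1) - 1` nodes, so `count_tree(3)` returns 15. Building the same file with `gcc -fopenmp` and with `clang -fopenmp` is one way to compare how the two runtimes schedule the child tasks around each `taskwait`.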
