Your premise is misleading:
the code [...] has almost all the traversal code in the critical section.
std::queue<node*> q;
q.push(head);
while (!q.empty()) {
    // Process the current contents of the queue level by level.
    int qSize = q.size();
    #pragma omp parallel for
    for (int i = 0; i < qSize; i++) {
        node* currNode;
        // std::queue is not thread-safe, so every access is guarded.
        #pragma omp critical
        {
            currNode = q.front();
            q.pop();
        }
        doStuff(currNode);
        #pragma omp critical
        q.push(currNode);
    }
}
Sure, the traversal itself is in fact in a critical section, as it must be if you are using a non-thread-safe data structure. However, the premise of this question was:
The processing function doStuff()
is quite expensive
As long as this holds true, it is not an issue that the remaining code is in a critical section. For instance, you could use Amdahl's law to compute the theoretically achievable speedup.
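As a rough sketch (the 5% serial fraction below is only an assumed number for illustration, not a measurement of your code), the bound works out like this:

#include <cstdio>

// Amdahl's law: if a fraction s of the total work must run serially
// (here: the queue operations in the critical sections), the speedup
// with n threads is bounded by 1 / (s + (1 - s) / n).
double amdahlSpeedup(double s, int n) {
    return 1.0 / (s + (1.0 - s) / n);
}

int main() {
    const double s = 0.05; // assumed serial fraction, not a measurement
    const int threadCounts[] = {2, 4, 8, 16};
    for (int n : threadCounts) {
        std::printf("%2d threads -> max speedup %.2f\n", n, amdahlSpeedup(s, n));
    }
    return 0;
}

With a 5% serial share the speedup can never exceed 20 no matter how many threads you use, so whether the critical sections matter depends entirely on how expensive doStuff is relative to the queue operations.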
All that said, if your doStuff
is very cheap, your observation is of course true. I would then recommend using a search that does not require a shared queue, such as depth-first search or iterative deepening depth-first search.
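Purely as an illustration, here is a minimal sketch of a task-based depth-first traversal that needs no shared queue at all. The node layout (a children vector), the stub doStuff, and the cutoff depth of 6 are assumptions made for the example, not taken from your code:

#include <vector>

// Assumed node layout -- adapt to your actual structure.
struct node {
    std::vector<node*> children;
};

void doStuff(node*) { /* stand-in for your processing */ }

// Depth-first traversal without a shared queue: each subtree becomes an
// independent OpenMP task, so no critical section is needed. Below the
// cutoff depth we recurse serially to keep the task overhead small when
// doStuff is cheap; the cutoff is a tuning knob.
void traverse(node* n, int depth = 0) {
    doStuff(n);
    for (node* child : n->children) {
        if (depth < 6) {
            #pragma omp task firstprivate(child, depth)
            traverse(child, depth + 1);
        } else {
            traverse(child, depth + 1);
        }
    }
}

void traverseParallel(node* head) {
    #pragma omp parallel
    #pragma omp single
    traverse(head);
}

Iterative deepening depth-first search works the same way, except that you rerun the traversal with an increasing depth limit.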