Depth first search in parallel

Question

I am implementing a DFS to find the exit of a maze, and currently it is single threaded.

I am planning on making it more efficient by creating several threads that search the tree using the same single-threaded algorithm, but I am randomizing which direction to go in when I encounter an intersection.

For example, the threads encounter an intersection where they can go East or West. Half of them go East and half go West. This continues until one of the threads finds the solution path.

Is this a valid way to implement DFS in a parallel way?

This will be a search and ought to work fine. It won't be a DFS according to the standard definition, but that doesn't matter. If the maze is a tree, then it's pretty simple to implement. If it's a graph, then synchronization becomes important. — Gene, Oct 30 '16 at 05:29

score 4 · Answer 1 · answered Oct 30 '16 at 19:00

If you do recursive parallel work in Java, use the Fork and Join API introduced in Java 7.

public class MazeNode {
  // This class represents a Path from the start of your maze to a certain node. It can be a) a dead end, b) the exit, c) have child Paths
  ...
}

public class MazeTask extends RecursiveTask<MazeNode>
{
  private MazeNode node;

  MazeTask(MazeNode node) {
    this.node = node;
  }


  // Returns null or the exit node
  @Override
  protected MazeNode compute() {    
  if (node.isDeadEnd())
    return null;
  else if (node.isExit())
    return node;
  else { // node has ways to go
    // implement as many directions as you want
    MazeTask left = new MazeTask(node.getLeft());
    MazeTask right = new MazeTask(node.getRight());

    left.fork(); // calculate in parallel

    MazeNode rightNode = right.compute(); // calculate last Task directly to save threads
    MazeNode leftNode = left.join(); // Wait for the other task to complete and get result

    if (rightNode != null)
      return rightNode;
    else
      return leftNode; // This assumes there is only one path to exit
  }
}


public static void main(String[] args) {
  MazeNode maze = ...
  MazeNode exit = new ForkJoinPool().invoke(new MazeTask(maze));
}

rsutormin · Answer 2 · 2016-10-30T17:02:37.353

0

[UPDATE1]

Here is my proposal with thread synchronization (but based on our discussion with @IraBaxter I'm not sure anymore that my way gives any advantages):

When algorithm starts only one thread is needed until you get to first fork. When this one thread gets there it should put all possible outcomes (left, right, middle) to stack and stop itself. Then since there are some elements in stack several threads are activated to start from edges stored in stack. When each of these threads reaches a fork all outcomes are put to the stack and threads stop themselves (not all at once, each does when it needs) and take edges from stack. And so on. Every time when any thread is stopped (regardless because of fork or dead end) it switched to mode waiting for edges in stack (or takes one if stack is not empty). And every time when some edge is added to the stack threads are notified about non-emptiness of the stack.

I used term "edge" here in a meaning of position of fork plus direction where to go from given fork. Stack gives you depth-first property of algorithm.

PS: It's possible to optimize this approach by reducing number of synchronization points. We can collect forking edges in separate worklist for every thread and don't stop this thread until it reaches dead-end. We exclude from this worklist those edges where this thread decided to go on each fork. Then we migrate local worklist to global one when thread reaches dead-end. So synchronization is used for idle threads to start from points from global worklist.

edited Oct 30 '16 at 17:02

answered Oct 30 '16 at 04:43

rsutormin

1,629
2
17
21

You are accumulating branches "in the stack" (I think you mean an arbitrary set of unexplored branches; this is normally called a "worklist"). To do that in a threadsafe way, you'll need synchronization of the set accesses. If the cost of generating a branch is small (it is for N-puzzles and even chess) the synchronization costs will swamp this algorithm and a serial DFS will be faster. – Ira Baxter Oct 30 '16 at 11:12
@IraBaxter, you may be right. So what if we reduce number of synchronization points? We can collect forking edges in separate worklist for every thread and don't stop this thread until it reaches dead-end. We exclude from this worklist those edges where this thread decided to go on each fork. Then we migrate local worklist to global one when thread reaches dead-end. So synchronization is used for idle threads to start from points from global worklist. – rsutormin Oct 30 '16 at 16:37
What is more sensible IMHO is to start a parallel search at the top of the search tree, and fork for each found branch. Once enough top-level forks have happened to keep all processors busy, simply let each processor run a DFS on the subtree it has. So you have your worklist (with synch costs) at the top of the tree, but in subtrees assigned to processors, DFS is run (as cheaply as the single threaded case). The synch overhead ratio goes to vanishingly small because search trees are typically exponential in size. – Ira Baxter Oct 30 '16 at 16:47
@IraBaxter, but the problem in your proposal is that tree may not be balanced so if some thread is done with its sub-tree it can not help others. – rsutormin Oct 30 '16 at 16:52
No time right now to describe a solution, but here's a link to background ideas and then to a working parallel DFS: http://stackoverflow.com/a/1344944/120163 – Ira Baxter Oct 30 '16 at 17:10

Depth first search in parallel

2 Answers2

Linked