1

I have a large graph with millions of nodes. I want to check if node 'A' is reachable from node 'B' with less than 4 hops. If possible, I want the shortest path. Which is the best way (or algorithm) to solve this issue?

Neeraj
  • 1,612
  • 7
  • 29
  • 47

3 Answers3

7

Note that if the graph is unweighted (as it seems in your question) - a simple and efficient BFS will be enough to find the shortest path from the source to the target.

Also, since you have a single source and a single target - you can apply bi-directional BFS, which is more efficient then BFS.

Algorithm idea: do a BFS search simultaneously from the source and the target: [BFS until depth 1 in both, until depth 2 in both, ....].
The algorithm will end when you find a vertex v, which is in both BFS's front.

Algorithm behavior: The vertex v that terminates the algorithm's run will be exactly in the middle between the source and the target.
This algorithm will yield much better result in most cases then BFS from the source [explanation why it is better then BFS follows], and will surely provide an answer, if one exist.

why is it better then BFS from the source?
assume the distance between source to target is k, and the branch factor is B [every vertex has B edges].
BFS will open: 1 + B + B^2 + ... + B^k vertices.
bi-directional BFS will open: 2 + 2B^2 + 2B^3 + .. + 2B^(k/2) vertices.

for large B and k, the second is obviously much better the the first.


(*) The explanation of bi-directional search is taken from another answer I posted

Community
  • 1
  • 1
amit
  • 175,853
  • 27
  • 231
  • 333
3

The best algorithm for finding the shortest path between two nodes in a graph in which you have no extra information about how likely it is that one node is close to the target is Dijkstra's Algorithm. You can easily modify this algorithm to quit after 3 hops, to avoid wasting computation on results that you are not interested in.

If you do have some extra information about the likelihood that a given node is close to your target, you can use A* search, which uses a heuristic on a node's distance to its target to improve its runtime performance.

Jordan Lewis
  • 16,900
  • 4
  • 29
  • 46
  • Thanks Jordan, I didn't thought about quitting after 3 hops. :) – Neeraj Oct 31 '12 at 06:26
  • 8
    If the graph is not weighted - DIjkstra algorithm is redundant, a simpler and more efficient way to find a shortest path in a not-weighted graph is BFS. Also, you can apply bi-directional BFS if you have a single source and a single target. – amit Oct 31 '12 at 06:59
  • Does Djikstra's also deal with the reachability problem between two given nodes? I am wondering if I need to use BFS first to determine reachability before using Djikstra. – SeesSound Mar 20 '20 at 18:39
1

If you need path less than 3 hops, then all possible paths are A (A=B), A-B (nodes are adjacent), A-X-B (X is a node adjacent to both ends). So there's no need in any complex algorithm. First, test for A=B, second, test that A and B are adjacent, and third, try to find X that is adjacent to both A and B (eg. intesection of endpoint adjacency sets).

Vovanium
  • 3,798
  • 17
  • 23
  • Sorry, The question is less than or equal to 3 hops. Thanks for correcting me. I have edited the question. Thanks. – Neeraj Nov 02 '12 at 04:57
  • If you do not need any investigations deeper, you can still use this approach, thus not needing any complex routing algorithm. Add another step to above: Iterate list of friends of A to make list of friends of friends of A and compute intersection with list of friends of B. I think this can be solved just on database query language, such as SQL without any 'explicit' computations. – Vovanium Nov 04 '12 at 20:49
  • p. s. It is just 'unrolling' of bidirectional BFS algorithm to limited depth. – Vovanium Nov 04 '12 at 20:51
  • Yeah, i don't need to investigate deeper now. In that case which is most efficient? The answer by amit or you? – Neeraj Nov 05 '12 at 07:02
  • My one is more specific, his one is more general. Mine may be more efficient if optimized, but there's no difference in computation complexity. – Vovanium Nov 08 '12 at 08:00