98

I have a undirected graph with about 100 nodes and about 200 edges. One node is labelled 'start', one is 'end', and there's about a dozen labelled 'mustpass'.

I need to find the shortest path through this graph that starts at 'start', ends at 'end', and passes through all of the 'mustpass' nodes (in any order).

( http://3e.org/local/maize-graph.png / http://3e.org/local/maize-graph.dot.txt is the graph in question - it represents a corn maze in Lancaster, PA)

Rahul Bhobe
  • 4,165
  • 4
  • 17
  • 32
Daniel
  • 2,032
  • 5
  • 21
  • 27

11 Answers11

83

Everyone else comparing this to the Travelling Salesman Problem probably hasn't read your question carefully. In TSP, the objective is to find the shortest cycle that visits all the vertices (a Hamiltonian cycle) -- it corresponds to having every node labelled 'mustpass'.

In your case, given that you have only about a dozen labelled 'mustpass', and given that 12! is rather small (479001600), you can simply try all permutations of only the 'mustpass' nodes, and look at the shortest path from 'start' to 'end' that visits the 'mustpass' nodes in that order -- it will simply be the concatenation of the shortest paths between every two consecutive nodes in that list.

In other words, first find the shortest distance between each pair of vertices (you can use Dijkstra's algorithm or others, but with those small numbers (100 nodes), even the simplest-to-code Floyd-Warshall algorithm will run in time). Then, once you have this in a table, try all permutations of your 'mustpass' nodes, and the rest.

Something like this:

//Precomputation: Find all pairs shortest paths, e.g. using Floyd-Warshall
n = number of nodes
for i=1 to n: for j=1 to n: d[i][j]=INF
for k=1 to n:
    for i=1 to n:
        for j=1 to n:
            d[i][j] = min(d[i][j], d[i][k] + d[k][j])
//That *really* gives the shortest distance between every pair of nodes! :-)

//Now try all permutations
shortest = INF
for each permutation a[1],a[2],...a[k] of the 'mustpass' nodes:
    shortest = min(shortest, d['start'][a[1]]+d[a[1]][a[2]]+...+d[a[k]]['end'])
print shortest

(Of course that's not real code, and if you want the actual path you'll have to keep track of which permutation gives the shortest distance, and also what the all-pairs shortest paths are, but you get the idea.)

It will run in at most a few seconds on any reasonable language :)
[If you have n nodes and k 'mustpass' nodes, its running time is O(n3) for the Floyd-Warshall part, and O(k!n) for the all permutations part, and 100^3+(12!)(100) is practically peanuts unless you have some really restrictive constraints.]

ShreevatsaR
  • 38,402
  • 17
  • 102
  • 126
  • 7
    Given the small input size, I agree with the answer. But I'm interested in why you reject others' assertions that this is a case of TSP. The way I understand it, you could extract all the must-pass nodes and use it to create a sub-graph. The edges between nodes in the sub-graph have weights corresponding to the APSP solution on the original graph. Then doesn't the question become an instance of TSP on the subgraph? Your solution appears to be a brute-force solution to **that** problem (which is fine). – maditya Apr 05 '13 at 07:41
  • 8
    @maditya: Firstly, I hope you agree that (to quote Steven A Lowe's comment on another answer) an answer like "TSP is hard, bwahahaha" is not an appropriate answer to someone who has a real problem to solve, especially one very easily solved on any computer from the last few decades. Secondly, this is not identical to the TSP for trivial reasons (different input format): the tiny instance of TSP it contains is for a smaller graph, not one of input size N. So NP-completeness depends on how many 'mustpass' nodes there are asymptotically: if it's always 12, or O(log N), it's not NP-complete, etc. – ShreevatsaR Apr 08 '13 at 14:14
  • 2
    I am not sure if the result would be a path. Imagine having to go from `a` to `c` through `b`. It may be that the shortest paths from `b` to `a` and `c` share an edge. In that case, the edge would be repeated twice. One of the two paths would need to be worse than the optimum in order not to generate cycles. – Pietro Saccardi Jun 11 '18 at 21:03
  • 1
    @PietroSaccardi From the description in the question it seems that the goal is simply to find the shortest "way" of going through all those nodes, and it may be ok if some edge is repeated. That is, “path” is being used in a loose sense. In fact, if repeated edges are not allowed, there may not even be an answer (e.g. consider a graph B—A—C where you're asked to go from A to C while passing through B: there's no way not to repeat the B—A edge). – ShreevatsaR Jun 11 '18 at 21:29
  • 2
    The Floyd-Warshall algorithm is not implemented correctly here: since all cells of the array are initialized with `INF`, the line `d[i][j] = min(d[i][j], d[i][k] + d[k][j])` becomes `d[i][j] = min(INF, INF + INF)` and all cells always remain equal to `INF`. You need to add a step to fill this array with the edge lengths from the graph. – Stef Aug 24 '20 at 13:58
  • There is actually a more optimal approach. To avoid considering every permutation, we can define dp[S][a][b] for every subset of mustpass nodes and every possible pair of start and end vertices in S. This will represent the most optimal way to visit all the nodes in S starting from a and finishing in b. There are a total of (2^12 * 12^2) states and to compute each one, we only need to consider 12 possibilities for the second node to visit in the subset. Overall we have O(2^K * K^3) where K is the number of mustpass nodes (including start and finish nodes) – Mushegh Oct 10 '22 at 02:13
31

run Djikstra's Algorithm to find the shortest paths between all of the critical nodes (start, end, and must-pass), then a depth-first traversal should tell you the shortest path through the resulting subgraph that touches all of the nodes start ... mustpasses ... end

Steven A. Lowe
  • 60,273
  • 18
  • 132
  • 202
  • This seems like it has potential to be more efficient than the selected solution. I bet if you had fleshed it out a bit more it would have scored higher. I first had to think about whether a shortest path between all pairs guarantees a path between all required nodes.. but I believe it would (assuming undirected). – galactikuh Jun 03 '16 at 15:48
  • 1
    I think this doesn't work if a path must not repeat edges. – tuket Aug 22 '16 at 15:28
  • 1
    How would the depth-first traversal find the shortest path in the example of Advent of Code day 24? http://adventofcode.com/2016/day/24 – user2609980 Feb 11 '17 at 23:04
19

This is two problems... Steven Lowe pointed this out, but didn't give enough respect to the second half of the problem.

You should first discover the shortest paths between all of your critical nodes (start, end, mustpass). Once these paths are discovered, you can construct a simplified graph, where each edge in the new graph is a path from one critical node to another in the original graph. There are many pathfinding algorithms that you can use to find the shortest path here.

Once you have this new graph, though, you have exactly the Traveling Salesperson problem (well, almost... No need to return to your starting point). Any of the posts concerning this, mentioned above, will apply.

Andrew Top
  • 2,517
  • 1
  • 17
  • 10
15

Actually, the problem you posted is similar to the traveling salesman, but I think closer to a simple pathfinding problem. Rather than needing to visit each and every node, you simply need to visit a particular set of nodes in the shortest time (distance) possible.

The reason for this is that, unlike the traveling salesman problem, a corn maze will not allow you to travel directly from any one point to any other point on the map without needing to pass through other nodes to get there.

I would actually recommend A* pathfinding as a technique to consider. You set this up by deciding which nodes have access to which other nodes directly, and what the "cost" of each hop from a particular node is. In this case, it looks like each "hop" could be of equal cost, since your nodes seem relatively closely spaced. A* can use this information to find the lowest cost path between any two points. Since you need to get from point A to point B and visit about 12 inbetween, even a brute force approach using pathfinding wouldn't hurt at all.

Just an alternative to consider. It does look remarkably like the traveling salesman problem, and those are good papers to read up on, but look closer and you'll see that its only overcomplicating things. ^_^ This coming from the mind of a video game programmer who's dealt with these kinds of things before.

Blank
  • 7,088
  • 12
  • 49
  • 69
8

This is not a TSP problem and not NP-hard because the original question does not require that must-pass nodes are visited only once. This makes the answer much, much simpler to just brute-force after compiling a list of shortest paths between all must-pass nodes via Dijkstra's algorithm. There may be a better way to go but a simple one would be to simply work a binary tree backwards. Imagine a list of nodes [start,a,b,c,end]. Sum the simple distances [start->a->b->c->end] this is your new target distance to beat. Now try [start->a->c->b->end] and if that's better set that as the target (and remember that it came from that pattern of nodes). Work backwards over the permutations:

  • [start->a->b->c->end]
  • [start->a->c->b->end]
  • [start->b->a->c->end]
  • [start->b->c->a->end]
  • [start->c->a->b->end]
  • [start->c->b->a->end]

One of those will be shortest.

(where are the 'visited multiple times' nodes, if any? They're just hidden in the shortest-path initialization step. The shortest path between a and b may contain c or even the end point. You don't need to care)

bjorke
  • 3,295
  • 1
  • 16
  • 20
  • Visited only once is required by the condition of it being a shortest path. – Aziuth Mar 09 '16 at 09:15
  • Huh. I was quite certain a minute ago, but yeah, you're right. Obviously not in a tree with several branches. But, thing is, if we abstractify the problem into a new graph that contains only the mustpass nodes that is fully connected, with the nodes having the distance of the shortest path in the original graph, we arrive at TSP. So, are you sure that it's not NP-hard? I'd assume that in the general problem, the amount of mustpass nodes depends on the amount of total nodes, and if, for example, the total are a polynomial of the mustpass, we get NP-hardness, right? – Aziuth Mar 24 '16 at 08:56
  • The route from, say, a->b may pass through c. So no brnach prevents any other. It's just permutation. – bjorke Mar 25 '16 at 00:57
  • Yes? But permutation is O(n!) if we assume that the amount of mustpass nodes has some connection to the amount of total nodes, like "total nodes are a polynomial of mustpass nodes". You just solved TSP by brute force. – Aziuth Mar 29 '16 at 08:01
6

Andrew Top has the right idea:

1) Djikstra's Algorithm 2) Some TSP heuristic.

I recommend the Lin-Kernighan heuristic: it's one of the best known for any NP Complete problem. The only other thing to remember is that after you expanded out the graph again after step 2, you may have loops in your expanded path, so you should go around short-circuiting those (look at the degree of vertices along your path).

I'm actually not sure how good this solution will be relative to the optimum. There are probably some pathological cases to do with short circuiting. After all, this problem looks a LOT like Steiner Tree: http://en.wikipedia.org/wiki/Steiner_tree and you definitely can't approximate Steiner Tree by just contracting your graph and running Kruskal's for example.

Ying Xiao
  • 1,729
  • 1
  • 9
  • 5
2

Considering the amount of nodes and edges is relatively finite, you can probably calculate every possible path and take the shortest one.

Generally this known as the travelling salesman problem, and has a non-deterministic polynomial runtime, no matter what the algorithm you use.

http://en.wikipedia.org/wiki/Traveling_salesman_problem

Rafe
  • 8,467
  • 8
  • 47
  • 67
2

The question talks about must-pass in ANY order. I have been trying to search for a solution about the defined order of must-pass nodes. I found my answer but since no question on StackOverflow had a similar question I'm posting here to let maximum people benefit from it.

If the order or must-pass is defined then you could run dijkstra's algorithm multiple times. For instance let's assume you have to start from s pass through k1, k2 and k3 (in respective order) and stop at e. Then what you could do is run dijkstra's algorithm between each consecutive pair of nodes. The cost and path would be given by:

dijkstras(s, k1) + dijkstras(k1, k2) + dijkstras(k2, k3) + dijkstras(k3, 3)

Hissaan Ali
  • 2,229
  • 4
  • 25
  • 51
0

One thing that is not mentioned anywhere, is whether it is ok for the same vertex to be visited more than once in the path. Most of the answers here assume that it's ok to visit the same edge multiple times, but my take given the question (a path should not visit the same vertex more than once!) is that it is not ok to visit the same vertex twice.

So a brute force approach would still apply, but you'd have to remove vertices already used when you attempt to calculate each subset of the path.

kirsch
  • 59
  • 6
0

I tried simulated annealing, branch&bound, ant colony algorithms, all of them takes unreal amount of time to solve even medium size graphs.

So I made fast responding solution for a big graphs (gamedev viable for runtime calculation).

Algorithm is published and documented here(with animations): https://github.com/Stridemann/TravelShortPathFinder

If you already have a graph (and in your case you have), just skip the space segmentation algorithm part.

Stridemann
  • 41
  • 6
  • While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - [From Review](/review/late-answers/34043151) – ryanc Mar 22 '23 at 19:43
0

How about using brute force on the dozen 'must visit' nodes. You can cover all the possible combinations of 12 nodes easily enough, and this leaves you with an optimal circuit you can follow to cover them.

Now your problem is simplified to one of finding optimal routes from the start node to the circuit, which you then follow around until you've covered them, and then find the route from that to the end.

Final path is composed of :

start -> path to circuit* -> circuit of must visit nodes -> path to end* -> end

You find the paths I marked with * like this

Do an A* search from the start node to every point on the circuit for each of these do an A* search from the next and previous node on the circuit to the end (because you can follow the circuit round in either direction) What you end up with is a lot of search paths, and you can choose the one with the lowest cost.

There's lots of room for optimization by caching the searches, but I think this will generate good solutions.

It doesn't go anywhere near looking for an optimal solution though, because that could involve leaving the must visit circuit within the search.

justinhj
  • 11,147
  • 11
  • 58
  • 104