24

I've been tasked to write an implementation of the A* algorithm (heuristics provided) that will solve the travelling salesman problem. I understand the algorithm, it's simple enough, but I just can't see the code that implements it. I mean, I get it. Priority queue for the nodes, sorted by distance + heuristic(node), add the closest node on to the path. The question is, like, what happens if the closest node can't be reached from the previous closest node? How does one actually take a "graph" as a function argument? I just can't see how the algorithm actually functions, as code.

I read the Wikipedia page before posting the question. Repeatedly. It doesn't really answer the question- searching the graph is way, way different to solving the TSP. For example, you could construct a graph where the shortest node at any given time always results in a backtrack, since two paths of the same length aren't equal, whereas if you're just trying to go from A to B then two paths of the same length are equal.

You could derive a graph by which some nodes are never reached by always going closest first.

I don't really see how A* applies to the TSP. I mean, finding a route from A to B, sure, I get that. But the TSP? I don't see the connection.

Bernhard Barker
  • 54,589
  • 14
  • 104
  • 138
Puppy
  • 144,682
  • 38
  • 256
  • 465
  • 1
    [A*](http://en.wikipedia.org/wiki/A*_search_algorithm) seems to be a pretty good summary from what I can remember from a CS course. [Dijkstra algorithm](http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm) is very similar (but simpler) so it might be nicer to start with. A priority queue is handy in both cases. –  Dec 15 '10 at 18:30
  • 7
    @pst: A* and Dijkstra's algorithm are useful if you want to go from point A to point B. If you want to go from point A to point A by a path with specific constraints, well, that's something else. – David Thornley Dec 15 '10 at 18:56
  • 1
    When I was in Uni (during the previous millennium) we had an assignment to implement A* in any language we wanted, most picked C++ which we were all most familiar with but I chose Prolog since it seemed a better fit to the problem. Short story long, I finished the assignment much faster than most people, you could perhaps start in Prolog and skip the intermediate phase. – Motti Dec 15 '10 at 19:54
  • 1
    "[...] or language specific code like Boost." Boost::graph has an implementation of A* – Alexandre C. Dec 17 '10 at 17:24
  • possible duplicate of [How can A* algorithm be applied to traveling salesman problem?](http://stackoverflow.com/questions/5344705/how-can-a-algorithm-be-applied-to-traveling-salesman-problem) – Fred Foo Mar 17 '11 at 21:55
  • 3
    Which is being closed as a duplicate of this.... – ChrisF Mar 17 '11 at 21:59
  • @ChrisF: I decided to vote for close on both, just to see what happens. This question seems to be winning, though. – Fred Foo Mar 17 '11 at 22:11

6 Answers6

12

I found a solution here

Use minimum spanning tree as a heuristic.

Set Initial State: Agent in the start city and has not visited any other city

Goal State: Agent has visited all the cities and reached the start city again

Successor Function: Generates all cities that have not yet visited

Edge-cost: distance between the cities represented by the nodes, use this cost to calculate g(n).

h(n): distance to the nearest unvisited city from the current city + estimated distance to travel all the unvisited cities (MST heuristic used here) + nearest distance from an unvisited city to the start city. Note that this is an admissible heuristic function.   You may consider maintaining a list of visited cities and a list of unvisited cities to facilitate computations.

user972616
  • 1,324
  • 3
  • 18
  • 38
  • It is possible to construct problematic graphs where the MST behaves counterintuitively. Say you have a graph with N cities left to visit. Using the MST you can get a lower bound for the remaining distance to goal. Then remove a city, leaving N-1 cities to visit. It is possible that the remaining distance to goal is now larger! This problem bit me this week as I was trying to use a minimum spanning tree in my A* nearest neighbor search. My solution was to subtract a bonus times the number of cities visited to smooth out the function. – Paul Chernoch Nov 09 '17 at 22:17
  • This heuristic is not even admissible. Since you remove some edges when constructing MST, distance between two cities in MST might be larger than distance between two cities in actual graph. – unlut Sep 26 '19 at 19:08
4

The confusion here is that the graph on which you are trying to solve the TSP is not the graph you are performing an A* search on.

See related: Sudoku solving algorithm C++

To solve this problem you need to:

  • Define your:
    • TSP states
    • TSP initial state
    • TSP goal state(s)
    • TSP state successor function
    • TSP state heuristic
  • Apply a generic A* solver to this TSP state graph

A quick example I can think up:

  • TSP states: list of nodes (cities) currently in the TSP cycle
  • TSP initial state: the list containing a single node, the travelling salesman's home town
  • TSP goal state(s): a state is a goal if it contains every node in the graph of cities
  • TSP successor function: can add any node (city) that isn't in the current cycle to the end of the list of nodes in the cycle to get a new state
    • The cost of the transition is equal to the cost of the edge you're adding to the cycle
  • TSP state heuristic: you decide
Community
  • 1
  • 1
Timothy Shields
  • 75,459
  • 18
  • 120
  • 173
3

If it's just a problem of understanding the algorithm and how it works you might want to consider drawing a graph on paper, assigning weights to it and drawing it out. Also you can probably find some animations that show Dijkstra's shortest path, Wikipedia has a good one. The only difference between Dijkstra and A* is the addition of the heuristic, and you stop the search as soon as you reach the target node. As far as using it to solve the TSP, good luck with that!

SND
  • 1,552
  • 2
  • 16
  • 29
fbl
  • 2,840
  • 3
  • 33
  • 41
  • 1
    If the implication is that it will be difficult to solve the TSP with A*, it isn't. I would agree with the comment above that starting with Dijkstra's might be easier, but both are pretty trivial once you decide on the implementation details. – Mike Linington Dec 15 '10 at 18:38
  • 7
    I guess I'm not seeing how A* helps you SOLVE the TSP. By only allowing the nearest unvisited neighbor to be chosen you will get a valid route, but it won't necessarily be the shortest one. The TSP (as it has always been stated to me) wants the shortest non-repeating path. Maybe I'm mis-interpreting the OP's goals. – fbl Dec 15 '10 at 19:58
1

Think about this a little more abstractly. Forget about A* for a moment, it's just dijkstra's with a heuristic anyway. Before, you wanted to get from A to B. What was your goal? To get to B. The goal was to get to B with the least cost. At any given point, what was your current "state"? Probably just your location on the graph.

Now, you want to start at A, then go to both B and C. What is your goal now? To pass over both B and C, maintaining least cost. You can generalize this with more nodes: D, E, F, ... or just N nodes. Now, at any given point, what is your current "state"? This is critical: it ISN'T just your location in the graph--it's also which of B or C or whatever nodes you have visited so far in the search.

Implement your original algorithm so that it calls some function asking if it has reached "the goal state" after making X move. Before, the function would have just said "yes, you're at state B, therefore you are at the goal". But now, let that function return "yes, you're at the goal state" if the search's path has passed over each of the points of interest. It'll know whether or not the search has passed over all points of interest because that's included in the current state.

After you get that, improve the search with some heuristic, and A* it up.

Robz
  • 1,747
  • 5
  • 21
  • 35
0

The question is, like, what happens if the closest node can't be reached from the previous closest node?

This step isn't necessary. As in, you aren't computing a path from the previous closest to the current closest, you are trying to get to your goal node, and the current closest is the only thing that matters (e.g. the algorithm doesn't care that last iteration you were 100km away, because this iteration you are only 96km away).

As a broad introduction, A* doesn't directly construct a path: it explores until it definitely knows that the path is contained within the region it has explored, and then constructs the path based on the information recorded during the exploration.

(I'm going to use the code in the Wikipedia article as a reference implementation to aid my explanation.)

You have a two sets of nodes: closedset and openset

closedset holds nodes that have been fully evaluated, that is, you know exactly how far they are from start and all their neighbours are in one of the two sets. This there is no more computation you can do with them and so we can (sort of) ignore them. (Basically these are completely contained within the border.)

openset holds "border" nodes, you know how far these are from start, but you haven't touched their neighbours yet, so they are on the edge of your search so far.

(Implicitly, there is a third set: completely untouched nodes. But you don't really touch them until they are in openset so they don't matter.)

At a given iteration, if you've got nodes to explore (that is, nodes in openset), you need to work out which one to explore. This is the job of the heuristic, it basically gives you a hint about which point on the border will be the best to explore next by telling you which node it thinks will have the shortest path to goal.

The previous closest node is irrelevant, it just expanded the border a bit, adding new nodes to openset. These new nodes are now candidates for the closest node in this iteration.

At first, openset only contains start, but then you iterate and at each step the border is expanded a little (in the most promising direction), until you eventually reach goal.

When A* is actually doing the exploration, it doesn't worry about which nodes came from where. It doesn't need to, because it knows their distance from start and the heuristic function and that's all it needs.

However to reconstruct the path later, you need to have some record of the path, this is what camefrom is. For a given node, camefrom links it to the node that is closest to start, so you can reconstruct the shortest path by following the links backwards from goal.

How does one actually take a "graph" as a function argument?

By passing one of the representations of a graph.

I don't really see how A* applies to the TSP. I mean, finding a route from A to B, sure, I get that. But the TSP? I don't see the connection.

You need a different heuristic and a different end condition: goal is no longer a single node any more, but the state of having everything connected; and your heuristic is some estimate of the length of the shortest path connecting the remaining nodes.

huon
  • 94,605
  • 21
  • 231
  • 225
0

To answer one of your questions...

To pass a graph as a function argument, you have several options. You could pass a pointer to an array containing all the nodes. You could pass just the one starting node and work from there, if it's a fully connected graph. And finally, you could write a graph class with whatever data structures you need inside it, and pass a reference to an instance of that class.

As for your other question about closest nodes, isn't part of A* search that it will backtrack as needed? Or you could implement your own sort of backtracking to handle that kind of situation.

DGH
  • 11,189
  • 2
  • 23
  • 24