7

I have a list of 2D points for example:

1,1 2,2 1,3 4,5 2,1

The distance between these points is known (using math.hypot for example.) I want to sort the list so that there is a minimum distance between them. I'm OK with any possible solution order, as long as the points are in the shortest order.

What is the most pythonic way to achieve this?

I was considering working out the distance between any item and any other item, and choosing the smallest each time, but this would be a slow algorithm on the lists I am working on (1,000 items would not be unusual.)

Saullo G. P. Castro
  • 56,802
  • 26
  • 179
  • 234
Thomas O
  • 6,026
  • 12
  • 42
  • 60
  • May be similar to [Shortest distance between points algorithm](http://stackoverflow.com/questions/1602164/shortest-distance-between-points-algorithm), i.e. sort list by X co-ordinate first (fast) then bubble sort according to distance. – PP. Aug 16 '13 at 19:13
  • 1
    «sort the list so that there is a minimum distance between them» "between _them_" meaning "between successive points"? – jscs Aug 16 '13 at 19:27
  • 6
    If you're trying to minimize the distance between successive points, you're asking about the traveling salesman problem, which has no efficient solution; you just have to try all orderings and see which one is shortest. – Josh Buell Aug 16 '13 at 19:29
  • Are the points always integers on both axes? are they limited to some range? – Bitwise Aug 16 '13 at 19:36
  • 1
    @JoshBuell this is actually TSP on a plane with Euclidean distances. Unfortunately it has been shown that even this "simpler" version is NP-hard. Also note that NP-hard does not mean you cannot improve on the solution of trying all orderings. – Bitwise Aug 16 '13 at 19:38

2 Answers2

7

The technical question you're asking is similar to "What is the minimum hamiltonian path of a graph" (your tuples are vertices, and the distance between them are the weight of the edges). This problem can't be solved in polynomial time, so your dataset had better be small. Since your graph is complete (all nodes are connected), the minimum hamiltonian path problem may not completely apply.

In any case, the answer below uses brute force. It permutes all possible paths, calculates the distance of each path, and then gets the minimum.

import itertools as it
import math

def dist(x,y):
    return math.hypot(y[0]-x[0],y[1]-x[1])

paths = [ p for p in it.permutations([(1,2),(2,3),(5,6),(3,4)]) ]
path_distances = [ sum(map(lambda x: dist(x[0],x[1]),zip(p[:-1],p[1:]))) for p in paths ]
min_index = argmin(path_distances)

print paths[min_index], path_distances[min_index]

Output:

((1, 2), (2, 3), (3, 4), (5, 6)) 5.65685424949

Note that the reverse path is an equivalent minimum

egafni
  • 1,982
  • 1
  • 16
  • 11
2

The other answer here is correct that this is some class of NP problem. If you really need 1000 nodes, there's no way you're ever going to truly solve it. But does it need to be exact? If not, perhaps you can try to just pick a random point and walk from there to the nearest point each time? It's not guaranteed to give you the shortest path, but perhaps it's close enough. E.g.:

data [ (1,2), (3,4), ... ]

cur = 0
path = [cur]
totalDist = 0
for i in range(1,len(data)):
    dists = [(dist(data[i],p), pi) for (pi,p) in enumerate(data) if pi != i and pi not in path]
    nextDist, cur = min(dists)
    totalDist += nextDist
    path.append(cur)

print path, totalDist

This is O(n^2) in distance computations and comparisons, and only O(n) in memory, which is at least achievable for 1000 points.

JoshG79
  • 1,685
  • 11
  • 14