9

Given two sets of points in n-dimensional space, how can one map points from one set to the other, such that each point is only used once and the total euclidean distance between the pairs of points is minimized?

For example,

import matplotlib.pyplot as plt
import numpy as np

# create six points in 2d space; the first three belong to set "A" and the
# second three belong to set "B"
x = [1, 2, 3, 1.8, 1.9, 3.4]
y = [2, 3, 1, 2.6, 3.4, 0.4]

colors = ['red'] * 3 + ['blue'] * 3

plt.scatter(x, y, c=colors)
plt.show()

example of point distance minimization problem

So in the example above, the goal would be to map each red point to a blue point such that each blue point is only used once and the sum of the distances between points is minimized.

I came across this question which helps to solve the first part of the problem -- computing the distances between all pairs of points across sets using the scipy.spatial.distance.cdist() function.

From there, I could probably test every permutation of single elements from each row, and find the minimum.

The application I have in mind involves a fairly small number of datapoints in 3-dimensional space, so the brute force approach might be fine, but I thought I would check to see if anyone knows of a more efficient or elegant solution first.

Community
  • 1
  • 1
Keith Hughitt
  • 4,860
  • 5
  • 49
  • 54
  • So this question seems to be about the algorithm, that is language-agnostic? – moooeeeep Aug 18 '16 at 11:23
  • The two sets are always of equal size? – moooeeeep Aug 18 '16 at 11:24
  • 1
    Isn't this problem an instance of the [linear sum assignment](http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.linear_sum_assignment.html) problem? – Stelios Aug 18 '16 at 11:28
  • @moooeeeep Primarily interested in python answers, but if there is a simple algorithmic solution, I'd be interested in that as well; As far as the size, I'd be interested in both cases - one where the sets are the same size, and another where one set is larger than the other. Either or both solutions would be useful. – Keith Hughitt Aug 18 '16 at 11:43
  • @Stelios Thanks for pointing that out to me; I hadn't heard of it before. It looks like it is trying to solve the same problem. Do you think you could post an answer which applies it in the context of two sets of points? – Keith Hughitt Aug 18 '16 at 11:58

2 Answers2

14

An example of assigning (mapping) elements of one set to points to the elements of another set of points, such that the sum Euclidean distance is minimized.

import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial.distance import cdist
from scipy.optimize import linear_sum_assignment

np.random.seed(100)

points1 = np.array([(x, y) for x in np.linspace(-1,1,7) for y in np.linspace(-1,1,7)])
N = points1.shape[0]
points2 = 2*np.random.rand(N,2)-1

C = cdist(points1, points2)

_, assigment = linear_sum_assignment(C)

plt.plot(points1[:,0], points1[:,1],'bo', markersize = 10)
plt.plot(points2[:,0], points2[:,1],'rs',  markersize = 7)
for p in range(N):
    plt.plot([points1[p,0], points2[assigment[p],0]], [points1[p,1], points2[assigment[p],1]], 'k')
plt.xlim(-1.1,1.1)
plt.ylim(-1.1,1.1)
plt.axes().set_aspect('equal')

enter image description here

Stelios
  • 5,271
  • 1
  • 18
  • 32
  • 1
    Thanks! Marking this as solution since Stelios was the first to suggest the use of `scipy.optimize.linear_sum_assignment`, and for the detailed code example demonstrating application from start to finish. – Keith Hughitt Aug 18 '16 at 20:39
3

There's a known algorithm for this, The Hungarian Method For Assignment, which works in time O(n3).

In SciPy, you can find an implementation in scipy.optimize.linear_sum_assignment

Ami Tavory
  • 74,578
  • 11
  • 141
  • 185
  • Look good! Just to be clear -- `linear_sum_assigment()` accepts a _cost_ matrix, which in this case would be the output from `scipy.spatial.distance.cdist()`, and not the original data points themselves, correct? – Keith Hughitt Aug 18 '16 at 12:15
  • 1
    @KeithHughitt Absolutely. You might want to look at the slides, BTW, they're quite readable. – Ami Tavory Aug 18 '16 at 12:17