4

Very similar to this question

Selecting an integer from each of the given n sets of integers such that the sum of their pairwise distances is minimized

except instead of integers we have multidimensional feature vectors with an arbitrary distance function for measuring their similarity.

The person who posted that question claimed that they had found a polynomial time solution to their problem that would appear to also work for my problem, but I couldn't understand their answer and it wasn't clear if it was actually correct.

My problem is also equivalent to this question with the weights of the graph equal to the similarity function's outputs.

max-weight k-clique in a complete k-partite graph

But it's unclear if the k-partiteness makes this problem any easier than the general max-weight clique problem.

edit:

From David Eisenstat's answer, it looks the original question isn't solvable in polynomial time. But does anything change if the objective is minimize the maximum pairwise distance between the selected points?

Community
  • 1
  • 1
gokva
  • 47
  • 4
  • Perhaps I'm not understanding the question exactly correctly, but wouldn't this be solvable in a two-pass linear algorithm? 1) one pass over all points in all sets to calculate a centroid of all the points, followed by 2) for each set, iterate over all points in the set to find the point with the minimum distance to the centroid. It seems that would work at least in the 2-dimensional case. Maybe not so much in 3-dimensions and beyond - that's a bit harder to visualize... – twalberg Apr 08 '15 at 17:35
  • Consider a case where the first elements of each set are identical. So the best solution is to choose the first element for each set. But the sets include a bunch of other points which drag the centroid far away from this solution, so the points closest to the centroid would not be the best solution. – gokva Apr 08 '15 at 17:58

1 Answers1

4

The other answer must be using the particular properties of the integers, because this version is NP-hard. Given k and an undirected graph G = (V, E), we can detect whether G has a k-clique by letting there be points V × {1, …, n} and setting d((v, i), (w, j)) = 1 if {v, w} ∈ E and i ≠ j, with d((v, i), (w, j)) = 2 otherwise. Each set consists of the points V × {i} for some i.

Evgeny Kluev's local search algorithm may be a reasonable thing to try. Initialize a solution with random points, then do the following repeatedly. For some random set (or collection of sets), determine the optimal solution subject to the constraint that the points from other sets stay the same. Retry the whole computation a few times to avoid local minima.

David Eisenstat
  • 64,237
  • 7
  • 60
  • 120
  • OK thanks, that makes sense. Do you think anything changes if the distance function is euclidean distance? – gokva Apr 08 '15 at 16:57
  • @gokva I'd have to look up the details of the metric embedding, but probably nothing much changes. How many dimensions are there? There might be a practical approximation scheme. – David Eisenstat Apr 08 '15 at 17:00
  • In my setting, the "points" are 2d polygons which may have different numbers of vertices. To compare them I'd probably represent the polygons with their distance function's values on a 2D grid of fixed size. So the dimension of the "points" would be the number of grid points, which would probably be a lot. Bummer. – gokva Apr 08 '15 at 17:06
  • Oh another thing, do you think we can do better if the objective is minimize the maximum pairwise distance instead of minimize the sum? – gokva Apr 08 '15 at 17:09
  • @gokva No, the reduction really only needs the objective to detect whether we got 1s everywhere. – David Eisenstat Apr 08 '15 at 17:17