2

Let be an array of objects [a, b, c, d, ...] and a function distance(x, y) that gives a numeric value showing how 'different' are objects x and y.

I'm looking for an algorithm to find the subset of the array of length n that maximizes the minimum difference between that subset element.

Of course, I can't simply sort the array by the minimum of the differences with other elements and take the n highest entries, since removing an element can very well change the distances. For instance, if a=b, then removing a means the minimum distance of b with another element will change dramatically.

So far, the only solution I could find was wether to iteratively remove the element with the lowest minimum distance and re-calculate the distance at each iteration, until only n elements are left, or, vice-versa, to iteratively pick new elements, recalculate the distances, add the new pick or replace an existing one based on the distance minimums.

Does anybody know how I could get the same results without those iterations?

PS: here is an example, the matrix shows the 'distance' between each element...

   a   b   c   d
a  -   1   3   2
b  1   -   4   2
c  3   4   -   5
d  2   2   5   -

If we'd keep only 2 elements, that would be c and d; if we'd keep 3, that would be a or b, c and d.

user1545704
  • 541
  • 1
  • 4
  • 7
  • Wouldn't you always just want to take the furthest pair of elements? – templatetypedef Jan 16 '13 at 00:35
  • not necessarily, for example, if distance(b,d) were changed to 0, the pairs with the largest distance would suggest b,c,d, but that is not the optimal solution. – thang Jan 16 '13 at 00:42

2 Answers2

3

This problem is NP-hard, so no-one knows an efficient (polynomial time) algorithm for solving it.

Here's a quick sketch that it is NP-hard, by reduction from CLIQUE.

Suppose we have an instance of CLIQUE in the form of a graph G and a number n and we want to know whether there is a clique of size n in G. Construct a distance matrix d such that d(i, j) = 1 if vertices i and j are connected in G, or 0 if they are not. Then find a subset of the vertices of G of size n that maximizes the minimum distance between elements (your problem). If the minimum distance between vertices in this subset is 1, then G has a clique of size n; otherwise it does not.

Gareth Rees
  • 64,967
  • 9
  • 133
  • 163
  • would like to remark that no one knows a deterministic polynomial time algorithm for solving it. there exists non-deterministic polynomial time average algorithms. – thang Jan 16 '13 at 00:56
  • Ok thanks. I guess I'll stick to my 'brute force' algorithm, then. – user1545704 Jan 16 '13 at 00:56
1

As Gareth said this is an NP-hard problem, however there has been a lot of research into solving these kind of problems and as such better methods than brute force have been found. Unfortunately this is such a large area that you could spend forever looking at the possible implementations of a solutions.

However if you are interested in a heuristic way of solving this I would suggest looking into Ant Colony Optimization (ACO) which has proven fairly effective at finding optimum paths within graphs.