You can apply essentially the same idea as the algorithm I described in this question.
Let's look for the center of the final subset. It must minimize the maximum distance to each of the sets. As usual, the distance of a point to a set is defined as the minimum distance between the point and an element of the set.
For each set i, the function f_i describing the distance to the set is piecewise linear. If a, b are two consecutive elements of the set, the relations f_i(a) = 0, f_i((a+b)/2) = (b-a)/2, f_i(b) = 0 let us build a description of each f_i in linear time (to the left of the smallest element and to the right of the largest, f_i is a half-line of slope -1 or +1).
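As a sketch of this construction (Python, with an assumed representation: a function is the sorted list of its (x, y) breakpoints, with implicit slope -1 before the first breakpoint and +1 after the last):

```python
def distance_function(points):
    """Breakpoints of f_i(x) = min over a in points of |x - a|.

    Representation (an assumption of this sketch): a sorted list of
    (x, y) breakpoints; the function has slope -1 before the first
    breakpoint, +1 after the last, and is linear in between.
    """
    pts = sorted(set(points))
    bps = [(float(pts[0]), 0.0)]
    for a, b in zip(pts, pts[1:]):
        bps.append(((a + b) / 2.0, (b - a) / 2.0))  # peak at the midpoint
        bps.append((float(b), 0.0))                 # back to zero at b
    return bps
```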
We can also compute the maximum of two piecewise linear functions f_i and f_j in linear time, by considering the consecutive intervals [a, b] on which both are linear: either the maximum is linear on [a, b], or it becomes piecewise linear once we add the unique intersection point of the two functions to the partition. Since the slopes of our functions are always +1 or -1, the intersection point is a half-integer, so it can be represented exactly in floating-point (or fixed-point) arithmetic.
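A sketch of this merge step (Python; the breakpoint representation and the `evaluate` helper are assumptions of the sketch, not part of the original answer):

```python
import bisect

def evaluate(bps, x):
    """Evaluate a piecewise linear function given as sorted (x, y)
    breakpoints, with slope -1 before the first and +1 after the last."""
    xs = [p[0] for p in bps]
    if x <= xs[0]:
        return bps[0][1] + (xs[0] - x)    # slope -1 to the left
    if x >= xs[-1]:
        return bps[-1][1] + (x - xs[-1])  # slope +1 to the right
    i = bisect.bisect_right(xs, x) - 1
    (x0, y0), (x1, y1) = bps[i], bps[i + 1]
    t = (x - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)

def maximum(f, g):
    """Breakpoints of max(f, g): merge the two partitions, and add the
    intersection point inside any interval where the functions cross.
    The output may keep redundant collinear breakpoints; pruning them
    is what keeps the sizes linear in practice."""
    xs = sorted(set(x for x, _ in f) | set(x for x, _ in g))
    out = []
    for x0, x1 in zip(xs, xs[1:]):
        d0 = evaluate(f, x0) - evaluate(g, x0)
        d1 = evaluate(f, x1) - evaluate(g, x1)
        out.append((x0, max(evaluate(f, x0), evaluate(g, x0))))
        if d0 * d1 < 0:  # the two functions cross strictly inside (x0, x1)
            xc = x0 + d0 * (x1 - x0) / (d0 - d1)
            out.append((xc, evaluate(f, xc)))
    xlast = xs[-1]
    out.append((xlast, max(evaluate(f, xlast), evaluate(g, xlast))))
    return out
```

For example, merging the distance functions of {0} and {4} adds the half-integer crossing at x = 2.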
A convexity argument shows that the maximum g of all the f_i has at most twice as many breakpoints as the f_i combined, so we don't have to worry about the maximum having a number of breakpoints that is exponential in the number of sets.
So we just:

- Compute the piecewise linear distance function f_i for i = 1..p.
- Compute the maximum g of all the f_i by repeatedly taking pairwise maxima.
- The location of any minimum point of g is the desired center.
- For each set, pick the point closest to the center.
- The width of the set of points we picked is exactly twice the minimum of g. :-)
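To sanity-check the recipe on small integer inputs, here is a brute-force version (a hypothetical helper, not the linear-time algorithm): since every breakpoint of g is a half-integer, it suffices to scan half-integer candidates for the center.

```python
def pick_narrowest(sets):
    """Brute force for small integer inputs: find a center minimizing
    g(x) = max_i min_{a in set_i} |x - a|, then pick each set's closest
    point. Assumes integer inputs, so the minimizer of g can be found
    among half-integer candidates."""
    lo = min(min(s) for s in sets)
    hi = max(max(s) for s in sets)
    # Every half-integer candidate in [lo, hi].
    candidates = [lo + k / 2.0 for k in range(2 * (hi - lo) + 1)]

    def g(x):
        return max(min(abs(x - a) for a in s) for s in sets)

    center = min(candidates, key=g)
    chosen = [min(s, key=lambda a: abs(a - center)) for s in sets]
    return center, chosen, max(chosen) - min(chosen)
```

For the sets {0, 10}, {4}, {7}, the best center is 7 (g(7) = 3), the picked points are 10, 4, 7, and the width 6 is indeed twice the minimum of g.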
Complexity is O(N) if the number of sets is bounded, or O(Np) if the number p of sets is variable. By being smarter about how you compute the maximum (divide and conquer over the sets), I think you can even reduce it to O(N log p).
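The divide-and-conquer idea can be sketched generically: merge the p functions along a balanced tree instead of folding them left to right, so each breakpoint takes part in only O(log p) merges (`merge_balanced` is a hypothetical helper name):

```python
def merge_balanced(items, merge):
    """Combine a non-empty list of items with a balanced
    divide-and-conquer tree. With a linear-time merge of piecewise
    linear functions, this yields the O(N log p) bound, versus O(N p)
    for a left-to-right fold."""
    if len(items) == 1:
        return items[0]
    mid = len(items) // 2
    return merge(merge_balanced(items[:mid], merge),
                 merge_balanced(items[mid:], merge))
```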