This code, based on Pascal's code, repeatedly drops the point with the smallest row sum in the distance matrix (i.e. the point that is, in total, closest to all the others) until only n points remain.
m2 <- function(xy, n){
  subset <- xy
  alldist <- as.matrix(dist(subset))
  while (nrow(subset) > n) {
    # total distance from each remaining point to all the others
    cdists <- rowSums(alldist)
    # drop the point with the smallest total distance (the most "crowded" one)
    closest <- which(cdists == min(cdists))[1]
    subset <- subset[-closest, , drop = FALSE]
    alldist <- alldist[-closest, -closest, drop = FALSE]
  }
  return(subset)
}
Run on a Gaussian cloud, where m1 is @pascal's function:
> set.seed(310366)
> xy <- cbind(rnorm(1000),rnorm(1000))
> m1s = m1(xy,20)
> m2s = m2(xy,20)
See which did best by looking at the sum of the inter-point distances:
> sum(dist(m1s))
[1] 646.0357
> sum(dist(m2s))
[1] 811.7975
Method 2 wins! And compare with a random sample of 20 points:
> sum(dist(xy[sample(1000,20),]))
[1] 349.3905
which does pretty poorly as expected.
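That single draw could be a fluke, so it's worth repeating the random sampling many times to see how far below the two methods it typically falls (a quick check, not part of the original comparison):

# distribution of the total inter-point distance over many random
# 20-point subsets, for context against the two methods above
rand_sums <- replicate(1000, sum(dist(xy[sample(1000, 20), ])))
summary(rand_sums)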
So what's going on? Let's plot:
> plot(xy,asp=1)
> points(m2s,col="blue",pch=19)
> points(m1s,col="red",pch=19,cex=0.8)

Method 1 generates the red points, which are evenly spaced out over the space. Method 2 creates the blue points, which almost define the perimeter. I suspect the reason for this is easy to work out (and even easier in one dimension...).
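In one dimension the effect is easy to see: interior points always have the smallest sums of distances to the rest, so m2 strips them out first and keeps the extremes. A quick sketch (points on a line, embedded as a two-column matrix so the code above runs unchanged):

# one-dimensional check: m2 should keep mostly the extreme values,
# since interior points have the smallest total distance to the others
x1d <- cbind(rnorm(200), 0)
sort(m2(x1d, 10)[, 1])   # expect values clustered near the two ends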
Using a bimodal pattern of initial points also illustrates this:

and again method 2 produces a much larger sum of inter-point distances than method 1, and both do better than random sampling:
> sum(dist(m1s2))
[1] 958.3518
> sum(dist(m2s2))
[1] 1206.439
> sum(dist(xy2[sample(1000,20),]))
[1] 574.34
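The code that generated the bimodal cloud isn't shown above; something along these lines would reproduce the setup (the cluster centres, sizes and seed here are assumptions, not the original values):

# a possible construction of the bimodal cloud (the original code for
# xy2 isn't shown, so the offsets and sizes are guesses)
xy2 <- rbind(cbind(rnorm(500) - 3, rnorm(500)),
             cbind(rnorm(500) + 3, rnorm(500)))
m1s2 <- m1(xy2, 20)
m2s2 <- m2(xy2, 20)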