I am applying a custom function over all combinations from two data sets. Is there a more efficient way to accomplish this?

ptm <- proc.time()
for (i in 1:nrow(x)) {
  d <- list()
  for (j in 1:nrow(centroids)) {
    d[j] <- f_dist(x[i,],centroids[j,])
  }
  x[i,]$cluster <- which.min(d) 
  x[i,]$dist_from_centroid <- d[which.min(d)]
}
proc.time()-ptm
Jeff Coughlin
  • Please provide the code for the user-defined function – Jacob H Apr 06 '16 at 22:20
  • Probably? Give a sample of your data, a more detailed question, a desired output, what you've tried and then we may be able to help you out. See http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example. – Badger Apr 06 '16 at 22:21
  • Tip #1: Preallocate `d`, i.e. instead of `d <- list()` do something like `d <- rep(list(NA), nrow(centroids))` – Jacob H Apr 06 '16 at 22:23
  • Are you trying to run k-means or some similar clustering heuristic? If this is not a pedagogical exercise, you are much better off using base R functions or a clustering package. That approach will be much faster because those functions will likely interface with C/C++. – Jacob H Apr 06 '16 at 22:29
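
To make the suggestions in the comments concrete, here is a minimal sketch of a vectorized version. The question does not show `f_dist`, so this assumes it is plain Euclidean distance between two rows; if it is something else, the matrix trick below does not apply as written. The example data and the column names `a` and `b` are made up for illustration.

```r
# Hypothetical example data; the real x and centroids are not shown in the question.
set.seed(1)
x <- data.frame(a = rnorm(6), b = rnorm(6))
centroids <- data.frame(a = c(-1, 1), b = c(-1, 1))

# ASSUMPTION: f_dist(p, q) is the Euclidean distance between rows p and q.
xm <- as.matrix(x[, c("a", "b")])
cm <- as.matrix(centroids[, c("a", "b")])

# Squared distance from every row of xm to every row of cm in one step,
# using |p - q|^2 = |p|^2 - 2 * p.q + |q|^2.
d2 <- outer(rowSums(xm^2), rowSums(cm^2), "+") - 2 * tcrossprod(xm, cm)
d2 <- pmax(d2, 0)  # guard against tiny negative values from rounding

# Index of the nearest centroid per row (first one wins on ties, like which.min).
x$cluster <- max.col(-d2, ties.method = "first")

# Distance to that nearest centroid.
x$dist_from_centroid <- sqrt(d2[cbind(seq_len(nrow(xm)), x$cluster)])
```

This replaces both loops with a handful of matrix operations, so no per-row data frame indexing or list growth is involved. For an arbitrary `f_dist` that cannot be expressed this way, preallocating `d` as a numeric vector and assigning the two result columns once after the loop would likely still help.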
