
I have a large dataset that is a list of 1500 pairwise distance matrices created with the dist command in R. I need to get the nearest-neighbor distance for each 'individual' in each of the 1500 matrices (which differ in the number of individuals they contain), but I am having problems. I've found other questions, such as "Computing sparse pairwise distance matrix in R", that will supposedly do this for a distance matrix. The first issue is that the approach there seems to leave one individual out of every matrix: if there are 6 individuals, it only returns nearest neighbors for the first 5. The other issue is that it doesn't return the same values that are in the original distance matrix (as it shows in the link) but instead changes them. Are there any newer packages or commands that can do this? Or does anyone know of a trick to do this? Thanks!

Here is an example matrix and desired output

      a  b  c   d   e   
b   1.5                 
c   1.3 2.3             
d   2.2 2.1 3.1         
e   2.4 1.4 1.6 2.2     
f   3.2 1.6 2.7 3.1 1.5 

desired output

a   1.3                 
b   1.4                 
c   1.3                 
d   2.1                 
e   1.4                 
f   1.5                 
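
For reproducibility, here is a minimal sketch that rebuilds the example above as a dist object and computes the desired output with the `as.matrix` / `diag` / `apply` approach suggested in the comments. The names `labs`, `m`, `d`, `full` and `nn` are made up for illustration, and the values are typed in by hand from the table rather than computed from real data.

```r
# Labels and lower-triangle distances, read column by column from the table above
labs <- c("a", "b", "c", "d", "e", "f")
m <- matrix(0, 6, 6, dimnames = list(labs, labs))
m[lower.tri(m)] <- c(1.5, 1.3, 2.2, 2.4, 3.2,  # column a (rows b..f)
                     2.3, 2.1, 1.4, 1.6,       # column b (rows c..f)
                     3.1, 1.6, 2.7,            # column c (rows d..f)
                     2.2, 3.1,                 # column d (rows e..f)
                     1.5)                      # column e (row f)
d <- as.dist(m)  # same class of object that dist() returns

# Nearest-neighbor distance for each individual
full <- as.matrix(d)  # full symmetric matrix, zeros on the diagonal
diag(full) <- NA      # ignore each individual's distance to itself
nn <- apply(full, 1, min, na.rm = TRUE)
print(nn)
```

Because the full symmetric matrix is used, the last individual (f) gets a value too, which is the part that goes missing when only the lower triangle is scanned.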

EDITED to show the loop I'm trying to use with digEmAll's suggestion, which works on a single matrix. dists is a list of distance matrices already computed that I need the nearest neighbors from.

nearest <- list()
tempd <- list()
runndist <- for (i in 1:1561) {
  tempd[[paste(i)]] <- as.matrix(dists[[i]])
  nearest[[paste(i)]] <- diag(tempd[[i]]) <- NA; apply(tempd[[i]], 1, min, na.rm = TRUE)
}

EDIT Got the loop to work and this now gives the nearest neighbor distance for all matrices in the list. I'm sure this can be done much more elegantly, but it worked for what I need.

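nearest <- list()
tempd <- list()
tempd2 <- list()
runndist <- for (i in 1:1561) {
  # Expand the dist object into a full square matrix
  tempd[[paste(i)]] <- as.matrix(dists[[i]])
  # Blank the diagonal so self-distances are ignored (tempd2 only stores the NA)
  tempd2[[paste(i)]] <- diag(tempd[[i]]) <- NA

  tryCatch({
    # Row-wise minimum = nearest-neighbor distance for each individual
    nearest[[paste(i)]] <- apply(tempd[[i]], 1, min, na.rm = TRUE)
    if (i == 7) stop("could not calculate")
  }, error = function(e) { cat("ERROR :", conditionMessage(e), "\n") })
}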
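
A possibly tidier variant of the loop above, as a sketch only: the same three steps wrapped in a helper and applied with `lapply`, so no index bookkeeping or temporary lists are needed. The helper name `nn_dist` is hypothetical, and `dists` is simulated here with random points as a stand-in for the real list of dist objects. The early return for matrices with fewer than two individuals is an added assumption about how such cases should be handled.

```r
# Hypothetical helper: nearest-neighbor distance for every individual
# in a single dist object
nn_dist <- function(d) {
  m <- as.matrix(d)                  # full symmetric matrix
  if (nrow(m) < 2) {
    # Assumption: with a single individual there is no neighbor, so return NA
    return(setNames(rep(NA_real_, nrow(m)), rownames(m)))
  }
  diag(m) <- NA                      # drop self-distances
  apply(m, 1, min, na.rm = TRUE)     # row-wise minimum
}

# Simulated stand-in for the real `dists` list: three matrices of
# different sizes built from random 2-D points (illustrative data only)
set.seed(1)
dists <- lapply(c(6, 4, 9), function(n) dist(matrix(rnorm(n * 2), ncol = 2)))

# One named vector of nearest-neighbor distances per matrix
nearest <- lapply(dists, nn_dist)
print(lengths(nearest))
```

Each element of `nearest` has one value per individual in the corresponding matrix, so matrices of different sizes are handled without special cases.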
zc1
  • You can obtain the full square matrix from a dist object using `as.matrix(distMx)`. So to obtain the minimums for a single matrix I would do: `m<-as.matrix(distMx);diag(m)<-NA;apply(m,1,min,na.rm=TRUE)` – digEmAll Jan 20 '17 at 20:32
  • It looks like this works when applied to a single matrix within the list of 1500, but when I try to loop it, it just returns a single NA for every matrix: nearest<-list() tempd<-list() runndist<- for (i in 1:1561) { tempd[[paste(i)]]<-as.matrix(dists[[i]]) nearest[[paste(i)]]<-diag(tempd[[i]])<-NA;apply(tempd[[i]],1,min,na.rm=TRUE) } – zc1 Jan 20 '17 at 20:47
  • Sorry, not the clearest representation of my code. Not sure how to format it within a comment.... – zc1 Jan 20 '17 at 20:49
  • Okay, think I got it to work! Thanks digEmAll! Posting the final loop in my main message above. – zc1 Jan 20 '17 at 21:25
  • Yes, my solution was for one matrix (you provided just one, after all). And note that the `";"` in my example is used to separate the different commands (in R you can separate commands by putting them on different lines or by using `";"`) – digEmAll Jan 21 '17 at 08:46

0 Answers