I am trying to create a custom distance function (based on the haversine formula). Right now my prototype is a double for loop. I've looked into vectorized operations, but I am still very new to R, so it is not clear to me how to clean this up. In the end I want an NxN matrix holding the distance between every pair of points on the globe. Here is my test data for now:
coord
Latitude Longitude
1 16.34577 6.303545
2 12.49475 28.626396
3 27.79462 60.032495
4 44.42699 110.114216
5 -69.85409 87.946878
The evil double for-loop:
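n <- nrow(coord)
mymat <- matrix(NA_real_, n, n)  # preallocate the N x N result
for (i in seq_len(n)) {          # for each row
  for (j in seq_len(n)) {        # for each column
    mymat[i, j] <- coord[i, 1] * coord[j, 2]  # placeholder for the future custom function
  }
}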
Result:
X1 X2 X3 X4 X5
1 103.03629 467.9204 981.2773 1799.902 1437.559
2 78.76122 357.6796 750.0910 1375.850 1098.874
3 175.20461 795.6596 1668.5801 3060.582 2444.450
4 280.04755 1271.7847 2667.0632 4892.043 3907.215
5 -440.32840 -1999.6708 -4193.5152 -7691.928 -6143.449
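For reference, the placeholder product above can also be produced without loops using base R's outer(). This is only a minimal sketch, assuming coord is a data frame with Latitude and Longitude columns as in the test data; mymat_vec is a name made up for illustration.

```r
# Test data from the question, rebuilt so the snippet runs on its own
coord <- data.frame(
  Latitude  = c(16.34577, 12.49475, 27.79462, 44.42699, -69.85409),
  Longitude = c(6.303545, 28.626396, 60.032495, 110.114216, 87.946878)
)

# outer(x, y) applies `*` to every pair (x[i], y[j]) and returns an N x N matrix,
# which matches mymat[i, j] = Latitude[i] * Longitude[j] from the double loop
mymat_vec <- outer(coord$Latitude, coord$Longitude)
```

This only covers the dummy x*y case; any replacement function passed to outer() has to be vectorized over its two arguments.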
Of course, for 5 samples this is no problem, but my real list has 100k points.
I did find this function after a search:
custom.dist <- function(x, my.dist) {
mat <- sapply(x, function(x.1) sapply(x, function(x.2) my.dist(x.1, x.2)))
as.dist(mat)
}
But I don't understand what is going on in it, and I couldn't get it to work, even with a dummy function such as x*y.
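To make the goal concrete, here is a rough sketch of the kind of result I am after, written with outer() over row indices instead of nested sapply() calls. The helper haversine_km and the Earth radius of 6371 km are assumptions made for illustration, not part of any existing package, and the eventual custom function may differ.

```r
# Test data from the question, rebuilt so the snippet runs on its own
coord <- data.frame(
  Latitude  = c(16.34577, 12.49475, 27.79462, 44.42699, -69.85409),
  Longitude = c(6.303545, 28.626396, 60.032495, 110.114216, 87.946878)
)

# Hypothetical helper: haversine great-circle distance in km.
# Inputs are in degrees; the function is vectorized over all four arguments.
haversine_km <- function(lat1, lon1, lat2, lon2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * radius * asin(pmin(1, sqrt(a)))
}

# outer() expands the row indices into all (i, j) pairs and calls the
# function once with two long vectors, so no explicit loop is needed
idx <- seq_len(nrow(coord))
distmat <- outer(idx, idx, function(i, j) {
  haversine_km(coord$Latitude[i], coord$Longitude[i],
               coord$Latitude[j], coord$Longitude[j])
})
```

Even if this works for 5 points, a full 100k x 100k matrix of doubles would need about 80 GB of memory, so the real data would probably have to be processed in blocks or restricted to the pairs that are actually needed.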