I am trying to generate what I can best describe as a pairwise distance matrix from a data frame containing the distance between each pair of neighboring points. These are not Euclidean distances: they are distances between points along a shoreline, so they do not follow a straight line. I was able to generate a distance matrix with the riverdist package using geospatial data, but that only gave the complete distance between the two points; I am now trying to work with a subset of the distance between the points.

I have been looking for a way to do this for a little while and keep coming up empty-handed. Any help would be much appreciated.

Here is an example:

I have this data:

mat <- matrix(
  c(
     3, # distance between site1 and site2
    10, # distance between site2 and site3
     7, # distance between site3 and site4
     9  # distance between site4 and site5
  ),
  nrow = 4, ncol = 1,
  dimnames = list(c("site1", "site2", "site3", "site4"), c("dist"))
)

> mat
      dist
site1    3
site2   10
site3    7
site4    9

And I would like to produce the following 'distance' matrix:

     site1 site2 site3 site4 site5
site1    0
site2    3     0     
site3   13    10     0   
site4   20    17     7     0
site5   29    26    16     9     0

The original data might be better organized as follows for this task:

     SiteA SiteB Dist
   1 site1 site2    3
   2 site2 site3   10
   3 site3 site4    7
   4 site4 site5    9

Any advice out there?

Aislin809

3 Answers

I think you may be too focused on actual distances, while your problem is more a summing problem.

At least as I understand it, you don't need to look for the shortest route or solve a similar problem; you just need to add the numbers. The distance from site a to site b is the sum of rows a through b-1 of your variable mat. The only difficult part is handling the cases where the two sites are given in reverse order or are the same. Anyway, I get this:

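# mat is a matrix, so select the column with [, "dist"] rather than $dist.
# Named site_dist so it does not mask stats::dist().
site_dist <- function(a, b) {
  lo <- min(a, b); hi <- max(a, b)   # order the pair so a > b also works
  if (lo == hi) 0 else sum(mat[lo:(hi - 1), "dist"])
}
distmat <- sapply(1:5, function(i) sapply(1:5, site_dist, i))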
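
The summing rule described above can be checked by hand on the question's data. This is only a minimal sketch, assuming mat is the one-column matrix from the question (a matrix column is selected with mat[, "dist"]); d24 and d15 are illustrative names, not part of the original code:

```r
# Segment lengths between consecutive sites, as in the question
mat <- matrix(c(3, 10, 7, 9), ncol = 1,
              dimnames = list(paste0("site", 1:4), "dist"))

# site2 to site4 uses rows 2 through 3 (that is, a through b - 1)
d24 <- sum(mat[2:3, "dist"])   # 10 + 7 = 17

# site1 to site5 uses all four rows
d15 <- sum(mat[1:4, "dist"])   # 3 + 10 + 7 + 9 = 29
```

Both values match the corresponding cells of the matrix requested in the question.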
Emil Bode

This will do it (albeit potentially a little slow):

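# Named shoreline_dist so it does not mask stats::dist()
shoreline_dist = function(mat){
    n = length(mat)
    tmp_mat = c(0, mat)
    dist_mat = diag(0, n + 1)
    # column i holds the running total of the segments from site i onward
    for (i in seq_len(n))
        dist_mat[(i+1):(n+1), i] = cumsum(tmp_mat[-(1:i)])

    # mirror the lower triangle into the upper triangle
    dist_mat = dist_mat + t(dist_mat)
    return (dist_mat)
    }
shoreline_dist(mat)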
mickey

This is a cumulative distance, so take a cumulative sum and then do the distance calculation:

mat <- c(3,10,7,9)
dist(cumsum(c(0,mat)))
#   1  2  3  4
#2  3         
#3 13 10      
#4 20 17  7   
#5 29 26 16  9
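
If the data is kept in the long SiteA/SiteB/Dist layout from the question, the same idea might be sketched as follows. This assumes the rows are already ordered along the shoreline; the names segs, pos and full are hypothetical. The call is written as stats::dist() so it still works if a user-defined dist() is present in the session:

```r
# Hypothetical data frame in the long layout shown in the question
segs <- data.frame(
  SiteA = c("site1", "site2", "site3", "site4"),
  SiteB = c("site2", "site3", "site4", "site5"),
  Dist  = c(3, 10, 7, 9),
  stringsAsFactors = FALSE
)

# Cumulative position of every site, starting at 0 for the first one
pos <- cumsum(c(0, segs$Dist))
names(pos) <- c(segs$SiteA[1], segs$SiteB)

# stats::dist() keeps the vector names as labels;
# as.matrix() expands the result to a full symmetric matrix
full <- as.matrix(stats::dist(pos))
full
```

Cells can then be looked up by site name, for example full["site4", "site2"].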
thelatemail
  • So dang simple. Thank you! I could not figure out what to google to find an answer, the word I needed was cumulative. Thanks again! – Aislin809 Dec 11 '18 at 02:09
  • @Aislin809 - no worries. If it solves your problem, hit the check mark next to the answer so other people know it is solved. – thelatemail Dec 11 '18 at 02:11