I want to create a distance matrix between companies using their geographical locations.
I have a square distance matrix that contains the distances between 98 Italian provinces. I also have a a data frame with two columns. One column has the ID numbers for 8376 companies. The other column indicates in which of the 98 provinces that each one of these companies is located.
I want to create a 8376 by 8376 distance matrix that contains the distances between all the companies. The code I have written (below) is extremely inefficient. Is there anyway to do this more quickly? I'm asking because I need to this for multiple datasets.
Thus is what the data frame looks like
cid province
1 61 TO
2 102 TO
3 123 AT
4 127 TO
5 158 TO
6 225 NO
7 232 TO
8 388 TO
This is what the square distance matrix looks like
CH AQ PE TE
1 0 64.39 41.74 81.18
2 64.39 0 40.38 61.05
3 41.74 40.38 0 40.79
4 81.18 61.05 40.79 0
outcome = matrix(NA,8376,8376) # empty matrix
for(i in 1:8376){
for(j in (i+1):8376){
x=which(dist.codes[,1]==companyID_Province[i,2]) # Find the row index in the distance matrix
y=which(dist.codes[1,]==companyID_Province[j,2]) # Find the column index in the distance matrix
outcome[i,j] = dist.codes[x,y] # Specify the distance to the corresponding element in outcome matrix
}
}