
I am a casual R user. I have a dataset of 800,000 coordinates (longitude, latitude) for postal codes in Canada.

I have attached a CSV file with a sample called question.csv.

TariffZone<-c("CAAL1","CAAL2","CAAL3","CAAL3")
latitude<-c(51.15701,51.91826,50.80972,50.76358)
longitude<-c(-110.206392,-110.146722,-110.7003,-110.568045)
elevation<-c(300,400,450,450)
d<-data.frame(TariffZone,latitude,longitude,elevation)

I am trying to build a table of all pairwise distances, in km, between tariff zones, where each zone's location is the average of the coordinates of its postal codes.


I have seen a related question: R distance matrix build

Unfortunately, I have not been able to solve my problem with the given answer. Here is my code:

d<-read.csv("question.csv", header=TRUE)
head(d)

d2<-d %>% 
  group_by(TariffZone) %>%
  summarise(latitude_mean=mean(latitude),longitude_mean=mean(longitude),elevation_mean=mean(elevation)) %>%
  dplyr::mutate_if(is.numeric, format, 4) %>%
  ungroup()
head(d2)

d3<-data.frame(d2$longitude_mean,d2$latitude_mean)
head(d3)

distable<-dist(cbind(d2$longitude_mean,d2$latitude_mean),cbind(d2$longitude_mean,d2$latitude_mean),method=euclidean)

harvesine<-function(lat1,lon1,lat2,lon2){
  # great-circle distance (spherical law of cosines); inputs in degrees, result in metres
  rad<-pi/180
  acos(sin(lat1*rad)*sin(lat2*rad) + cos(lat1*rad)*cos(lat2*rad)*cos((lon2-lon1)*rad)) * 6371000}

Please let me know if I forgot any piece of information or if I need to clarify my question.

Bonus: is there a way to use the haversine formula from my code (the `harvesine` function) instead of Euclidean distance, and to export the resulting table back to a CSV file?

  • If you use `geosphere::distm` instead of `dist`, it will by default use a very efficient approximation to Haversine distance, or you can specify the Haversine distance instead. – Gregor Thomas Dec 18 '19 at 22:10
  • Suggested duplicates: [Function to calculate geospatial distance between two points (lat,long) using R](https://stackoverflow.com/q/32363998/903061), or [Geographic / geospatial distance between 2 lists of lat/lon points (coordinates)](https://stackoverflow.com/q/31668163/903061) – Gregor Thomas Dec 18 '19 at 22:14
  • If you need more help, please share a little bit of sample data, e.g., `dput(head(d3))`, so that we can see what your code actually outputs and have something to test new solutions on. – Gregor Thomas Dec 18 '19 at 22:15
  • Thank you Gregor for the quick comments, I have edited my post and I'll be looking into those alternative questions today. – stackprojects Dec 19 '19 at 14:26
  • So I have tried `matrix<-distm(c(d2$latitude_mean, d2$longitude_mean), fun = distHaversine)` but I am getting the error: `Error in .pointsToMatrix(x) : Wrong length for a vector, should be 2` – stackprojects Dec 19 '19 at 21:40
  • You needed to use `cbind` instead of `c`, or better yet just use a subset. And put longitude first. – Gregor Thomas Dec 19 '19 at 21:52
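
A minimal sketch of the fix suggested in the last comment, using the sample vectors from the question. `c()` flattens both columns into a single vector, which is what triggers the "Wrong length" error, while `cbind()` keeps a two-column matrix. The variable names `pts` and `m` are only illustrative.

```r
library(geosphere)

# Sample coordinates from the question
latitude  <- c(51.15701, 51.91826, 50.80972, 50.76358)
longitude <- c(-110.206392, -110.146722, -110.7003, -110.568045)

# cbind() gives one row per point; longitude must be the first column
pts <- cbind(longitude, latitude)

# Pairwise Haversine distances, in metres
m <- distm(pts, fun = distHaversine)
m
```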

1 Answer


From ?distm

x longitude/latitude of point(s). Can be a vector of two numbers, a matrix of 2 columns (first one is longitude, second is latitude) or a SpatialPoints* object

library(geosphere)
distm(d[c("longitude", "latitude")])
#          [,1]      [,2]      [,3]      [,4]
# [1,]     0.00  84796.69  51919.29  50608.72
# [2,] 84796.69      0.00 129215.81 131775.10
# [3,] 51919.29 129215.81      0.00  10645.63
# [4,] 50608.72 131775.10  10645.63      0.00
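
The call above works on the raw points and returns metres. As a rough sketch of how it might be extended to cover the rest of the question (one averaged point per tariff zone, kilometres, labelled rows and columns, CSV export), assuming the sample columns from the question: `distHaversine` is used here only because the question asked for Haversine, and the output file name is hypothetical.

```r
library(geosphere)

# Sample data from the question
d <- data.frame(
  TariffZone = c("CAAL1", "CAAL2", "CAAL3", "CAAL3"),
  latitude   = c(51.15701, 51.91826, 50.80972, 50.76358),
  longitude  = c(-110.206392, -110.146722, -110.7003, -110.568045)
)

# One mean point per tariff zone; the values stay numeric (no format())
zones <- aggregate(cbind(longitude, latitude) ~ TariffZone, data = d, FUN = mean)

# distm() returns metres, so divide by 1000 to get kilometres
dist_km <- distm(zones[, c("longitude", "latitude")], fun = distHaversine) / 1000

# Label rows and columns with the zone names
zone_names <- as.character(zones$TariffZone)
dimnames(dist_km) <- list(zone_names, zone_names)

# Hypothetical output file name; adjust as needed
out_file <- "tariff_zone_distances_km.csv"
write.csv(round(dist_km, 2), out_file)
```

Base R's `aggregate` is used instead of the `dplyr` pipeline only to keep the sketch free of extra dependencies; a `group_by`/`summarise` step without the `format` call should serve equally well.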
Gregor Thomas