
I have a dataset with 4 columns:

start_lat 41.90687, 41.94367, 41.93259, 41.89076

start_lng -87.62622, -87.64895, -87.63643, -87.63170

end_lat 41.90672, 41.98404, 41.93650, 41.91831

end_lng -87.63483, -87.66027, -87.64754

I want to add a 5th column named distance and I use the following formula, but I get the following error:

trips_202007 <- trips_202007 %>%
  rowwise() %>%
  mutate(distance = distm(c(trips_202007["start_lng"], trips_202007["start_lat"]), c(trips_202007["end_lng"], trips_202007["end_lat"]), fun=distHaversine))

Error in .pointsToMatrix(x) : 'list' object cannot be coerced to type 'double'

Is there a way to avoid the error?

Thank you very much for the advice.

This is the file I got after dput(trips_202007, file="trips_202007.txt"): https://drive.google.com/file/d/1lnXeNqIRCDad0WotgoUQhrfPg0AXApLr/view?usp=sharing

  • can you provide the dataset through dput(trips_202007) ? – pbraeutigm Aug 16 '21 at 09:15
  • @pbraeutigm I have updated the post and included the file. – NIKOLAOS RAPANIS Aug 16 '21 at 10:42
  • Oh, this dataset is really big. I was thinking of maybe 10 lines for testing. Can you update it with dput(head(trips_202007))? I downloaded the txt, but it is way too big for testing – pbraeutigm Aug 16 '21 at 11:18
  • @pbraeutigm here is the link. thanks! https://drive.google.com/file/d/1lnXeNqIRCDad0WotgoUQhrfPg0AXApLr/view?usp=sharing – NIKOLAOS RAPANIS Aug 16 '21 at 11:34
  • ok I got it. for your next post you could integrate this structure directly into your question with some data like in your second link. This helps to work on the problem. I got it now, working on an answer. – pbraeutigm Aug 16 '21 at 11:50
  • Welcome to Stack Overflow. If you find my answer satisfying, I would appreciate it if you marked it as "solved", so I get the reputation :) – pbraeutigm Aug 16 '21 at 11:57

1 Answer

You got the following error:

Error in .pointsToMatrix(x) : 'list' object cannot be coerced to type 'double'

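This happens because trips_202007["start_lng"] returns a one-column data frame (which is a list), not a numeric vector, so c(...) builds a list that distm() cannot convert into coordinates. In addition, distm() calculates a matrix of distances between all pairs of points, and that result does not fit into a single cell per row. For this case I found the following function to work: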
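To illustrate the difference between the two kinds of column selection, here is a minimal sketch in base R. The data frame trips and the variable names are made up for the illustration; only the coordinate values come from the question.

```r
# Hypothetical two-row data frame, using coordinate values from the question
trips <- data.frame(
  start_lng = c(-87.62622, -87.64895),
  start_lat = c(41.90687, 41.94367)
)

by_single_bracket <- trips["start_lng"]    # one-column data frame (a list)
by_double_bracket <- trips[["start_lng"]]  # plain numeric vector

is.list(by_single_bracket)     # TRUE
is.numeric(by_double_bracket)  # TRUE

# c() on single-bracket selections builds a list, not a numeric vector,
# which is the kind of input the error message complains about
combined <- c(trips["start_lng"], trips["start_lat"])
is.list(combined)     # TRUE
is.numeric(combined)  # FALSE
```

Inside mutate() you can refer to the bare column names (start_lng, start_lat, ...), which are already numeric vectors.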

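    # Haversine (great-circle) distance between two sets of points, vectorised.
    # Coordinates are in decimal degrees; r is the Earth's radius in metres
    # (6378137 is the WGS84 equatorial radius), so the result is in metres.
    dt.haversine <- function(lat_from, lon_from, lat_to, lon_to, r = 6378137){
        radians <- pi/180                 # degrees-to-radians factor
        lat_to <- lat_to * radians
        lat_from <- lat_from * radians
        lon_to <- lon_to * radians
        lon_from <- lon_from * radians
        dLat <- (lat_to - lat_from)
        dLon <- (lon_to - lon_from)
        a <- (sin(dLat/2)^2) + (cos(lat_from) * cos(lat_to)) * (sin(dLon/2)^2)
        return(2 * atan2(sqrt(a), sqrt(1 - a)) * r)
    }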

Define that function first (I found it here), then run the calculation again with this code:

library(dplyr)  # provides %>% and mutate()

trips_202007 <- trips_202007 %>%
  mutate(distance = dt.haversine(start_lat, start_lng, end_lat, end_lng))
trips_202007
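
If you want to sanity-check the numbers without the full dataset, here is a small self-contained sketch in base R. It uses the first three rows from the question (only three end_lng values were posted). The names haversine_m and sample_trips are made up for this example, and the function body is the same formula as dt.haversine. It also shows how much the assumed Earth radius alone changes the result, which may be one reason two calculators disagree slightly; I have not verified that this explains any particular difference.

```r
# Same haversine formula as above; the name haversine_m is hypothetical
haversine_m <- function(lat_from, lon_from, lat_to, lon_to, r = 6378137) {
  to_rad <- pi / 180
  lat_from <- lat_from * to_rad
  lat_to <- lat_to * to_rad
  d_lat <- lat_to - lat_from
  d_lon <- (lon_to - lon_from) * to_rad
  a <- sin(d_lat / 2)^2 + cos(lat_from) * cos(lat_to) * sin(d_lon / 2)^2
  2 * atan2(sqrt(a), sqrt(1 - a)) * r
}

# First three rows of the coordinates posted in the question
sample_trips <- data.frame(
  start_lat = c(41.90687, 41.94367, 41.93259),
  start_lng = c(-87.62622, -87.64895, -87.63643),
  end_lat   = c(41.90672, 41.98404, 41.93650),
  end_lng   = c(-87.63483, -87.66027, -87.64754)
)

# Distances in metres with the default radius (WGS84 equatorial, 6378137 m)
d_equatorial <- with(sample_trips,
                     haversine_m(start_lat, start_lng, end_lat, end_lng))

# Same distances with a mean Earth radius of 6371000 m (an assumption about
# what another calculator might use)
d_mean <- with(sample_trips,
               haversine_m(start_lat, start_lng, end_lat, end_lng, r = 6371000))

# The two results differ by a constant factor of 6378137 / 6371000
d_equatorial / d_mean
```

The function is vectorised, so it works directly on whole columns and needs no rowwise().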
pbraeutigm
  • Thank you for the solution. The only problem is that when I cross-check the first result with an online calculator, the result is different. ONLINE: https://drive.google.com/file/d/1H7L3jKVw0LGXUwQIJ_UEDGn19POOqPNB/view?usp=sharing Rstudio: https://drive.google.com/file/d/1RYqLl8KyA904ABy06kE3Qjhe8ogMD6Yl/view?usp=sharing – NIKOLAOS RAPANIS Aug 16 '21 at 12:36
  • I checked the first 3 via https://www.nhc.noaa.gov/gccalc.shtml and it was correct – pbraeutigm Aug 16 '21 at 12:42
  • ok! I have already marked your answer as a solution. Thanks very much for the valuable help. – NIKOLAOS RAPANIS Aug 16 '21 at 12:50