Calculating distance between all locations to first location, by group

Question

I have GPS locations from several seabird tracks, each starting from colony x, so the individual tracks all have similar first locations. For each track, I would like to calculate the beeline (straight-line) distance between each GPS location and either (a) a specified location that represents colony x, or (b) the first GPS point of that track, which represents the location of colony x. For (b), I would like to use the first location of each new track ID (track_id; the ID column in the example data below).

I have looked for appropriate functions in geosphere, sp, raster, adehabitatLT, move, ... and just cannot seem to find what I am looking for.

I can calculate the distance between successive GPS points, but that is not what I need.

library(dplyr)
library(geosphere)  # provides distVincentyEllipsoid() and distHaversine()
df %>%  
  group_by(ID) %>%
  mutate(lat_prev = lag(Lat,1), lon_prev = lag(Lon,1) ) %>%
  mutate(dist = distVincentyEllipsoid(matrix(c(lon_prev, lat_prev), ncol = 2), # or use distHaversine
                                      matrix(c(Lon, Lat), ncol = 2)))

#example data:

df <- data.frame(Lon = c(-96.8, -96.60861, -96.86875, -96.14351, -92.82518, -90.86053, -90.14208, -84.64081, -83.7, -82, -80, -88.52732, -94.46049,-94.30, -88.60, -80.50, -81.70, -83.90, -84.60, -90.10, -90.80, -92.70, -96.10, -96.55, -96.50, -96.00),
                 Lat = c(25.38657, 25.90644, 26.57339, 27.63348, 29.03572, 28.16380, 28.21235, 26.71302, 25.12554, 24.50031, 24.89052, 30.16034, 29.34550, 29.34550, 30.16034, 24.89052, 24.50031, 25.12554, 26.71302, 28.21235, 28.16380, 29.03572, 27.63348, 26.57339, 25.80000, 25.30000),
                 ID = c(rep("ID1", 13), rep("ID2", 13)))

Grateful for any pointers.

Welcome to SO! In order to get help from the community it will be best if you can post some sample data with your question and any code you've tried. Please see [this](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for more details on how to provide a great reproducible example of your issue. — Dan Adams, Jan 09 '22 at 23:05
Thanks, I have added some sample data. I have not tried any specific code because I have not found anything that seems to come close to what I need. I have several codes that calculate the distance between each successive GPS point, but that is not what I need. — user303287, Jan 09 '22 at 23:18
Thanks, that's a very helpful start. However it looks like your sample data is missing longitude. — Dan Adams, Jan 09 '22 at 23:58
sorry, I am not sure what happened there. Longitude data visible now — user303287, Jan 10 '22 at 00:16

score 2 · Answer 1 · answered Jan 10 '22 at 00:26

You were pretty close. The key is that you want to calculate the distance from the first observation in each track. To do that, first add a column giving each row's position within its track (easy to do with dplyr::row_number() after group_by()). Then, in the distance calculation, always use the first observation as the reference point by subsetting with order == 1.

library(tidyverse)
library(geosphere)

df <- data.frame(Lon = c(-96.8, -96.60861, -96.86875, -96.14351, -92.82518, -90.86053, -90.14208, -84.64081, -83.7, -82, -80, -88.52732, -94.46049,-94.30, -88.60, -80.50, -81.70, -83.90, -84.60, -90.10, -90.80, -92.70, -96.10, -96.55, -96.50, -96.00),
                 Lat = c(25.38657, 25.90644, 26.57339, 27.63348, 29.03572, 28.16380, 28.21235, 26.71302, 25.12554, 24.50031, 24.89052, 30.16034, 29.34550, 29.34550, 30.16034, 24.89052, 24.50031, 25.12554, 26.71302, 28.21235, 28.16380, 29.03572, 27.63348, 26.57339, 25.80000, 25.30000),
                 ID = c(rep("ID1", 13), rep("ID2", 13)))
                 
                 
df %>%  
  group_by(ID) %>%
  mutate(order = row_number()) %>% 
  mutate(dist = distVincentyEllipsoid(matrix(c(Lon[order == 1], Lat[order == 1]), ncol = 2), 
                                      matrix(c(Lon, Lat), ncol = 2)))
#> # A tibble: 26 x 5
#> # Groups:   ID [2]
#>      Lon   Lat ID    order     dist
#>    <dbl> <dbl> <chr> <int>    <dbl>
#>  1 -96.8  25.4 ID1       1       0 
#>  2 -96.6  25.9 ID1       2   60714.
#>  3 -96.9  26.6 ID1       3  131665.
#>  4 -96.1  27.6 ID1       4  257404.
#>  5 -92.8  29.0 ID1       5  564320.
#>  6 -90.9  28.2 ID1       6  665898.
#>  7 -90.1  28.2 ID1       7  732131.
#>  8 -84.6  26.7 ID1       8 1225193.
#>  9 -83.7  25.1 ID1       9 1319482.
#> 10 -82    24.5 ID1      10 1497199.
#> # ... with 16 more rows

Created on 2022-01-09 by the reprex package (v2.0.1)
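
For option (a) in the question, the same idea should work with a fixed reference point instead of the first row of each track. This is only a minimal sketch: the colony coordinates below are hypothetical values chosen for illustration, not taken from the question, and the data is a small subset of the sample data. Because the reference point is the same for every track, no grouping is needed.

```r
library(dplyr)
library(geosphere)

# Hypothetical colony position as c(longitude, latitude); replace with the real one
colony <- c(-96.8, 25.4)

# Small subset of the sample data from the question
df_small <- data.frame(
  Lon = c(-96.8, -96.60861, -96.86875, -94.30, -88.60),
  Lat = c(25.38657, 25.90644, 26.57339, 29.34550, 30.16034),
  ID  = c("ID1", "ID1", "ID1", "ID2", "ID2")
)

# Distance in metres from every fix to the fixed colony location
res <- df_small %>%
  mutate(dist_to_colony = distHaversine(colony, cbind(Lon, Lat)))

res
```

distVincentyEllipsoid() could be swapped in for distHaversine() here in the same way as in the grouped version above.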

This works perfectly! Thank you very much for your help - much appreciated. — user303287, Jan 10 '22 at 00:52
Glad it helped. If this answers your question, please [accept the answer](https://stackoverflow.com/help/someone-answers) to mark it closed. — Dan Adams, Jan 10 '22 at 01:13

score 1 · Answer 2 · user303287 · 2022-01-10T00:53:47.747

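This also seems to work (sent to me by a friend). It is very similar to Dan's suggestion above, but it picks the first point of each track directly with Lon[1] and Lat[1] instead of adding an order column, and it uses distHaversine() rather than distVincentyEllipsoid().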

library(geosphere)
library(dplyr)

df %>% 
  group_by(ID) %>%
  mutate(Dist_to_col = distHaversine(c(Lon[1], Lat[1]),cbind(Lon,Lat)))
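
The two answers differ mainly in the distance model: distHaversine() treats the Earth as a sphere, while distVincentyEllipsoid() uses an ellipsoid (WGS84 by default). A rough sketch for comparing them side by side on a subset of the sample data is shown here. The column names d_hav, d_vin and rel_diff are made up for this illustration.

```r
library(dplyr)
library(geosphere)

# Subset of the sample data from the question
df_small <- data.frame(
  Lon = c(-96.8, -96.60861, -92.82518, -94.30, -88.60, -80.50),
  Lat = c(25.38657, 25.90644, 29.03572, 29.34550, 30.16034, 24.89052),
  ID  = c(rep("ID1", 3), rep("ID2", 3))
)

cmp <- df_small %>%
  group_by(ID) %>%
  mutate(
    # Reference point is the first fix of each track
    d_hav = distHaversine(c(first(Lon), first(Lat)), cbind(Lon, Lat)),
    d_vin = distVincentyEllipsoid(c(first(Lon), first(Lat)), cbind(Lon, Lat)),
    # Relative difference between the two models (0 for the reference row)
    rel_diff = ifelse(d_vin > 0, abs(d_hav - d_vin) / d_vin, 0)
  ) %>%
  ungroup()

cmp
```

If the relative difference is small enough for the analysis, the faster distHaversine() may be sufficient. Otherwise the ellipsoidal distance is the more careful choice.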

edited Jan 10 '22 at 00:53

answered Jan 10 '22 at 00:47

user303287
