0

I have a table:

ID    Latitude Longitude Visit_Date
1850  46.72187 -114.7254  7/1/2017
1850  46.72187 -114.7254  12/7/2018
1850  46.72187 -114.7254  6/13/2018
1850  46.72186 -114.7250  6/13/2018
1850  46.72186 -114.7250  6/8/2019
1850  46.72186 -114.7250  10/26/2019

I want to create a table that has a row for each unique lat long and the time frame that the ID was at that lat long. So ideally an output like this:

ID    Latitude Longitude   Start     End
1850  46.72187 -114.7254   7/1/2017  6/13/2018
1850  46.72186 -114.7250   6/13/2018 10/26/2019

I am a little lost on how to accomplish this. I was thinking using the shift() function to subtract rows from another and if the lat long != 0 then that would be a unique lat long. I was also thinking the unique() function would be useful but I'm not sure which direction would be best. Any help will be useful. Thank you!

  • Hi there, welcome to stackoverflow. It helps to put some sample data up in a reproducible format so that any suggestions definitely work for your data. You can use `dput()` on a sample section of your code and post the output here, or generate similarly structured random data. https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example for tips. Key thing a helper would need here would be to know how your dates are structured. Also, check out the `dplyr` package and `summarise()` - this may be the trick you need. – Andy Baxter Dec 11 '19 at 23:47

1 Answers1

1

Using dplyr, you can do summary statistics pretty easily:

df1 %>% 
    mutate(Visit_Date = as.Date(Visit_Date, format = "%m/%d/%Y")) %>% 
    group_by(ID, Latitude, Longitude) %>% 
    summarise(Start = min(Visit_Date), 
              End = max(Visit_Date))
Daniel V
  • 1,305
  • 7
  • 23
  • 1
    I think this is more correct. I didn't see the values for Latitude, Longitude changing – akrun Dec 11 '19 at 23:03