
I have data as follows, and I need to group rows into chains in which one row's time_right + 1 day equals another row's time_left. The group id is the minimum id among the records linked by this condition.

input = data.frame(id = c(1:6),
                   time_left = c("2016-01-01", "2016-09-05", "2016-09-06","2016-09-08", "2016-09-12","2016-09-15"), 
                   time_right = c("2016-09-07", "2016-09-11", "2016-09-12", "2016-09-14", "2016-09-18","2016-09-21"))

Input:

  id  time_left time_right
1  1 2016-01-01 2016-09-07
2  2 2016-09-05 2016-09-11
3  3 2016-09-06 2016-09-12
4  4 2016-09-08 2016-09-14
5  5 2016-09-12 2016-09-18
6  6 2016-09-15 2016-09-21

Output:

  id  time_left time_right group_id
1  1 2016-01-01 2016-09-07        1
2  2 2016-09-05 2016-09-11        2
3  3 2016-09-06 2016-09-12        3
4  4 2016-09-08 2016-09-14        1
5  5 2016-09-12 2016-09-18        2
6  6 2016-09-15 2016-09-21        1

Is there any way to do this with dplyr?

Henrik
ADEN
  • Probably. Meanwhile, an `igraph` alternative: `components(graph_from_data_frame(data.frame(input$time_left, as.Date(input$time_right) + 1)))$membership[input$time_left]`. Related: [identify groups of linked episodes which chain together](https://stackoverflow.com/questions/12135971/identify-groups-of-linked-episodes-which-chain-together/12170710#12170710) – Henrik Sep 09 '22 at 18:20
  • You can use dplyr to join up the next step in the chain, but you should use Henrik's graph theory method to handle the arbitrary depth of the chains. – Michael Dewar Sep 11 '22 at 11:41
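The chaining rule from the question can also be sketched in base R, without dplyr or igraph. This is only an illustration, not the method proposed in the comments: it assumes the date columns are converted with `as.Date`, and the names `left`, `right`, `links` and `group_id` are introduced here for the example. It finds every pair of rows where time_right + 1 equals time_left, then repeatedly pushes the smaller id across each link until nothing changes, which handles chains of any depth.

```r
input <- data.frame(
  id = 1:6,
  time_left = as.Date(c("2016-01-01", "2016-09-05", "2016-09-06",
                        "2016-09-08", "2016-09-12", "2016-09-15")),
  time_right = as.Date(c("2016-09-07", "2016-09-11", "2016-09-12",
                         "2016-09-14", "2016-09-18", "2016-09-21"))
)

# Work with dates as day counts so the comparison is plain arithmetic.
left  <- as.numeric(input$time_left)
right <- as.numeric(input$time_right)

# Each row of `links` is one link: "row" is the earlier record,
# "col" is the record whose time_left equals that time_right + 1.
links <- which(outer(right + 1, left, "=="), arr.ind = TRUE)

# Start with every record in its own group, then propagate the
# minimum id along the links until no label changes.
group_id <- input$id
changed <- TRUE
while (changed) {
  changed <- FALSE
  for (k in seq_len(nrow(links))) {
    p <- links[k, "row"]
    s <- links[k, "col"]
    m <- min(group_id[p], group_id[s])
    if (group_id[p] != m || group_id[s] != m) {
      group_id[c(p, s)] <- m
      changed <- TRUE
    }
  }
}

input$group_id <- group_id
print(input)
```

On the sample data this should reproduce the group_id column shown in the expected output. The repeated passes are simple rather than fast; for large inputs the graph-components approach from the first comment is likely the better choice.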

0 Answers