Detect overlapping dates by group with R

Question

Given the following dataset:

structure(list(intervention = c("Self Isolation", "Lockdown Low", 
"Lockdown Low", "Self Isolation", "Social Distancing", "Lockdown Low", 
"Social Distancing", "Handwashing"), date_start = structure(c(17897, 
17957, 18444, 17987, 17897, 17532, 17942, 18018), class = "Date"), 
    date_end = structure(c(17956, 18262, 18475, 18017, 17956, 
    18053, 18017, 18048), class = "Date")), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -8L))

How can I check whether any "intervention" has overlapping date ranges? In this example, every intervention is fine except "Social Distancing" and "Lockdown Low", which each have overlapping ranges.

The ideal output would be a data frame with one intervention per row and a TRUE/FALSE column indicating whether that intervention has any overlap.

(Extra points for a tidyverse solution.)

What is your expected output? Is it a summarised output, or a new column? – akrun Apr 20 '20 at 22:33
The ideal output would be a data frame with one intervention per row and a TRUE/FALSE column indicating whether that intervention has any overlap. – tic-toc-choc Apr 20 '20 at 22:35
I'm not clear about the values. `df1 %>% group_by(intervention) %>% summarise(flag = any(date_start >= lag(date_end, default = first(date_end))))` – akrun Apr 20 '20 at 22:38
Should I order by `intervention` and `date_start`/`date_end` for this to work? – tic-toc-choc Apr 20 '20 at 22:40
If you can update the post with the exact output you need, it would be useful for cross-checking. – akrun Apr 20 '20 at 22:41

akrun · Accepted Answer · 2020-04-21T22:56:02.700

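Assuming the question's data is stored as `df1`, we can sort the rows within each intervention and then summarise: an intervention overlaps if any row after the first starts before the previous row's end date.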

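library(dplyr)
df1 %>%
    # sort so that each row follows the interval starting just before it
    arrange(intervention, date_start, date_end) %>% 
    group_by(intervention) %>%
    # TRUE if any row other than the first starts before the previous row ends
    summarise(overlapping = any(date_start < lag(date_end, 
         default = first(date_end)) & row_number() != 1))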
# A tibble: 4 x 2
#  intervention      overlapping
#  <chr>             <lgl>      
#1 Handwashing       FALSE      
#2 Lockdown Low      TRUE       
#3 Self Isolation    FALSE      
#4 Social Distancing TRUE
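
If a row-level column is more useful than a summary (the alternative raised in the comments), a minimal sketch along the same lines might look like this. The names `flagged`, `per_group`, `prev_max_end` and `overlaps_previous` are illustrative and not from the original post, and the data frame is rebuilt with `data.frame()` only so that the snippet runs on its own. Comparing each start date against the running maximum of earlier end dates, rather than only the previous row's end date, is a variation: it lets a row be flagged even when the interval it overlaps is not the immediately preceding one.

```r
library(dplyr)

# Data from the question, rebuilt here so the example is self-contained
df1 <- data.frame(
  intervention = c("Self Isolation", "Lockdown Low", "Lockdown Low",
                   "Self Isolation", "Social Distancing", "Lockdown Low",
                   "Social Distancing", "Handwashing"),
  date_start = as.Date(c(17897, 17957, 18444, 17987, 17897, 17532, 17942, 18018),
                       origin = "1970-01-01"),
  date_end = as.Date(c(17956, 18262, 18475, 18017, 17956, 18053, 18017, 18048),
                     origin = "1970-01-01"),
  stringsAsFactors = FALSE
)

# Row-level flag: does this row start before some earlier row of the
# same intervention has ended?
flagged <- df1 %>%
  arrange(intervention, date_start, date_end) %>%
  group_by(intervention) %>%
  mutate(
    # latest end date among earlier rows in the group (NA for the first row);
    # cummax() is not defined for Date objects, hence as.numeric()
    prev_max_end = lag(cummax(as.numeric(date_end))),
    overlaps_previous = !is.na(prev_max_end) &
      as.numeric(date_start) < prev_max_end
  ) %>%
  ungroup() %>%
  select(-prev_max_end)

# Collapsing the row-level flag gives the per-intervention summary again
per_group <- flagged %>%
  group_by(intervention) %>%
  summarise(overlapping = any(overlaps_previous))
```

On the question's data, `per_group` should agree with the summary shown above. As in the original code, the comparison is strict (`<`), so an interval that starts on the day the previous one ends is not counted as overlapping; switch to `<=` if shared boundary dates should count.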

edited Apr 21 '20 at 22:56

answered Apr 20 '20 at 22:59

akrun


This doesn't detect the `Lockdown Low` overlap if I add the row `Lockdown Low, 01/01/2018, 02/01/2019` at the end of the data frame. – tic-toc-choc Apr 21 '20 at 00:44
How can we fix that? – tic-toc-choc Apr 21 '20 at 21:12
@tic-toc-choc Sorry, I forgot to check this. Do you have an updated dataset? – akrun Apr 21 '20 at 21:13
@tic-toc-choc Can you please update your post with the new data so that I can test? – akrun Apr 21 '20 at 21:14
