Given the following dataset:

structure(list(intervention = c("Self Isolation", "Lockdown Low", 
"Lockdown Low", "Self Isolation", "Social Distancing", "Lockdown Low", 
"Social Distancing", "Handwashing"), date_start = structure(c(17897, 
17957, 18444, 17987, 17897, 17532, 17942, 18018), class = "Date"), 
    date_end = structure(c(17956, 18262, 18475, 18017, 17956, 
    18053, 18017, 18048), class = "Date")), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -8L))

How can I check whether any "intervention" has overlapping date ranges? In this example, every intervention is fine except "Social Distancing" and "Lockdown Low", which each have rows whose dates overlap.

The ideal output would be a data frame with one intervention per row and a TRUE/FALSE column indicating whether any of that intervention's date ranges overlap.

(Extra points for a tidyverse solution.)

tic-toc-choc

1 Answer

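We can sort the rows by start date within each intervention, then use summarise to check whether any row starts before the previous row in its group has ended.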

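library(dplyr)

# df1 is the data frame from the question
df1 %>%
    # sort so each row can be compared with the previous interval in its group
    arrange(intervention, date_start, date_end) %>% 
    group_by(intervention) %>%
    # overlap: a row other than the first starts before the previous row ends
    summarise(overlapping = any(date_start < lag(date_end, 
         default = first(date_end)) & row_number() != 1))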
# A tibble: 4 x 2
#  intervention      overlapping
#  <chr>             <lgl>      
#1 Handwashing       FALSE      
#2 Lockdown Low      TRUE       
#3 Self Isolation    FALSE      
#4 Social Distancing TRUE       
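
If dplyr is not available, the same idea can be sketched in base R. This is only an illustration under a few assumptions: df1 is rebuilt here from the data in the question, and has_overlap is a hypothetical helper name, not a function from any package.

```r
# Data from the question, assigned to df1 (the name used in the answer).
df1 <- data.frame(
  intervention = c("Self Isolation", "Lockdown Low", "Lockdown Low",
                   "Self Isolation", "Social Distancing", "Lockdown Low",
                   "Social Distancing", "Handwashing"),
  date_start = as.Date(c(17897, 17957, 18444, 17987, 17897, 17532, 17942, 18018),
                       origin = "1970-01-01"),
  date_end = as.Date(c(17956, 18262, 18475, 18017, 17956, 18053, 18017, 18048),
                     origin = "1970-01-01"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if, after sorting by start date, any interval
# starts before the previous interval has ended.
has_overlap <- function(start, end) {
  o <- order(start, end)
  start <- start[o]
  end <- end[o]
  n <- length(start)
  if (n < 2) return(FALSE)      # a single interval cannot overlap anything
  any(start[-1] < end[-n])      # compare each start with the previous end
}

# Apply the helper to each intervention separately.
groups <- split(df1, df1$intervention)
overlapping <- vapply(groups,
                      function(g) has_overlap(g$date_start, g$date_end),
                      logical(1))

result <- data.frame(intervention = names(overlapping),
                     overlapping = unname(overlapping),
                     stringsAsFactors = FALSE)
print(result)
```

Comparing each row only with the previous one is enough for a yes/no answer per group: once the rows are sorted by start date, any overlapping pair implies that some adjacent pair overlaps too. The strict `<` mirrors the dplyr code above, so an interval that starts on the very day the previous one ends is not counted as overlapping; use `<=` if end dates should be treated as inclusive.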
akrun