I have a set of hospital admission and discharge dates, broken down by patient ID. There are multiple date ranges per ID and some of them overlap. I am trying to find a way to flag which rows contain overlapping dates, so that when I am calculating 'length of hospital stay' I do not double-count.
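So far, I have created an interval variable for each stay (running from admission date to discharge date) and used lubridate's int_overlaps to flag rows where the intervals overlap. This has worked okay, but as well as flagging genuine overlaps, it also flags consecutive stays, where one stay begins on the same date that the previous one ends.
For example, I want to flag: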
Stay A: 2001-10-03 / 2001-10-06
Stay B: 2001-10-04 / 2001-10-11
But I don't want to flag:
Stay A: 2001-10-03 / 2001-10-06
Stay B: 2001-10-06 / 2001-10-11
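To make the distinction concrete, here is a minimal base R sketch of the comparison I think I need. `strict_overlap` is a hypothetical helper name of my own, not a lubridate function. It assumes a stay only counts as overlapping when it shares more than a single boundary date with another stay; as far as I can tell, int_overlaps treats the endpoints as inclusive, which would explain why consecutive stays get flagged.

```r
# Hypothetical helper (not part of lubridate): TRUE only when two stays
# share more than a single boundary date.
strict_overlap <- function(adm_a, dis_a, adm_b, dis_b) {
  adm_a < dis_b & adm_b < dis_a
}

a_adm <- as.Date("2001-10-03")
a_dis <- as.Date("2001-10-06")

# Stay B starts before Stay A ends: a genuine overlap
overlapping <- strict_overlap(a_adm, a_dis,
                              as.Date("2001-10-04"), as.Date("2001-10-11"))

# Stay B starts on the day Stay A ends: consecutive, not overlapping
consecutive <- strict_overlap(a_adm, a_dis,
                              as.Date("2001-10-06"), as.Date("2001-10-11"))

print(c(overlapping = overlapping, consecutive = consecutive))
```

Using `<` rather than `<=` is the whole difference: with `<=` the second pair would also come out as overlapping.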
The code I used was copied from an answer elsewhere on this site, and I don't understand it well enough to modify it in the right way (I am an almost total novice at R!).
This is a simplified example of the df and code. If anyone can advise how I could change it to stop flagging the consecutive stays, I would really appreciate it!
library(lubridate)

ID <- c(1, 1, 2, 3, 3, 3, 4, 5, 5, 5, 5)
admdate <- c("2001-10-03", "2001-10-05", "2003-10-04", "2006-02-03", "2006-05-27", "2006-07-01", "2001-08-02", "2008-10-11", "2008-11-01", "2009-01-09", "2009-02-18")
dischdate <- c("2001-10-05", "2001-12-08", "2003-10-04", "2006-05-29", "2006-06-01", "2006-07-07", "2001-08-11", "2008-10-14", "2009-01-13", "2009-01-21", "2009-02-26")
HospAdms <- data.frame(ID, admdate, dischdate)
# Convert the character columns to dates (the result has to be assigned back)
HospAdms$admdate <- as_date(HospAdms$admdate)
HospAdms$dischdate <- as_date(HospAdms$dischdate)
# One interval per stay, running from admission to discharge
HospAdms$Int <- interval(start = HospAdms$admdate, end = HospAdms$dischdate)
# Within each ID, compare every interval with every other one; each interval
# overlaps itself, so a row is flagged when it has more than one match
HospAdms$overlap <- unlist(tapply(HospAdms$Int,
                                  HospAdms$ID,
                                  function(x) rowSums(outer(x, x, int_overlaps)) > 1))
In the df that this example code produces, the first two rows (both ID 1) are consecutive stays, since the first discharge date equals the second admission date, but they are flagged and I don't want them to be. I hope that makes sense!
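For reference, this is a rough base R sketch of the result I am after, not a tested solution. `flag_strict` is a hypothetical helper name of my own, and it assumes that two stays overlap only when each one is admitted strictly before the other is discharged, so stays that merely share a boundary date are left unflagged.

```r
ID <- c(1, 1, 2, 3, 3, 3, 4, 5, 5, 5, 5)
admdate <- as.Date(c("2001-10-03", "2001-10-05", "2003-10-04", "2006-02-03",
                     "2006-05-27", "2006-07-01", "2001-08-02", "2008-10-11",
                     "2008-11-01", "2009-01-09", "2009-02-18"))
dischdate <- as.Date(c("2001-10-05", "2001-12-08", "2003-10-04", "2006-05-29",
                       "2006-06-01", "2006-07-07", "2001-08-11", "2008-10-14",
                       "2009-01-13", "2009-01-21", "2009-02-26"))
HospAdms <- data.frame(ID, admdate, dischdate)

# Hypothetical helper: for one patient's stays, TRUE where a stay strictly
# overlaps at least one of that patient's other stays.
flag_strict <- function(adm, dis) {
  adm <- as.numeric(adm)
  dis <- as.numeric(dis)
  before <- outer(adm, dis, "<")   # before[i, j]: stay i admitted before stay j discharged
  hits <- before & t(before)       # both directions must hold for a strict overlap
  diag(hits) <- FALSE              # ignore each stay compared with itself
  rowSums(hits) > 0
}

# Fill in the flag one patient at a time, indexing by row so that the
# original row order is preserved
HospAdms$overlap <- FALSE
for (rows in split(seq_len(nrow(HospAdms)), HospAdms$ID)) {
  HospAdms$overlap[rows] <- flag_strict(HospAdms$admdate[rows],
                                        HospAdms$dischdate[rows])
}

print(HospAdms)
```

On this example data the two ID 1 rows stay unflagged, while the overlapping pairs for ID 3 and ID 5 are flagged. A stay admitted and discharged on the same day (like ID 2) can never be flagged under this rule, which may or may not be what is wanted.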