1

I'm looking at the effect the weekend may have on an individual's survival and so I'm trying to convert my data into a time-dependent structure with one row per interval. It will probably be a Cox PH model with DschDT (Discharge Date) as the censoring date. Either a patient is discharged alive (right-censored) or dies within hospital.

The data looks like this, where DIH is my censoring variable (0,1)

`structure(list(Age = c(28L, 77L, 92L, 28L, 59L, 7L), Sex = structure(c(1L, 
 2L, 1L, 1L, 2L, 2L), .Label = c("F", "M"), class = "factor"), 
Care.type = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Acute", 
"Organ.Procurement", "Geriatric.Eval.Mgt", "Psychogeriatric", 
"Maintenance", "Rehab", "Palliative"), class = "factor"), 
AdmDT = structure(c(1396282680, 1396311600, 1396329780, 1396331040, 
1396343940, 1396348080), class = c("POSIXct", "POSIXt"), tzone = ""), 
DschgDT = structure(c(1396288800, 1396335600, 1397721600, 
1396338600, 1396390200, 1396359120), class = c("POSIXct", 
"POSIXt"), tzone = ""), DIH = c(0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("Age", 
"Sex", "Care.type", "AdmDT", "DschgDT", "DIH"), row.names = c(1L, 
7L, 8L, 9L, 10L, 11L), class = "data.frame")`

For example, I might have a patient who is admitted to hospital on a Wednesday and is discharged alive the following Thursday. In this case there will be three rows for this patients episode. One for the Wednesday-Friday, one for Saturday-Sunday and one for Monday-Thursday, all inclusive.

I've managed to identify the weekends within a time period using this function.

  getDuration <- function(d1, d2,fmt="%Y-%m-%d %H%M") {  
                 myDays <- seq.Date(to   = as.Date(d2, format=fmt), 
                 from = as.Date(d1, format =fmt), 
                 by   = 1)
             myDays[which(is.weekend(myDays))]
             }

  dat<-mapply(getDuration,AdmDT,DschgDT)

> head(clip) ID StartDate EndDate Start Time Event WeekendStart1 WeekendEnd1 WeekendStart2 WeekendEnd2 1 1 9/08/2013 16/08/2013 0 7 0 1 3 0 0 2 2 9/12/2013 12/12/2013 0 3 0 0 0 0 0 3 3 9/01/2014 17/01/2014 0 8 1 2 4 0 0 After identifying where the weekends occur between dates I would like to split the time acording to weekends . For this example the resulting data would look like:

clip2 ID StartDate EndDate Start Time Event Weekend 1 1 9/08/2013 16/08/2013 0 1 0 0 2 1 9/08/2013 16/08/2013 1 3 0 1 3 1 9/08/2013 16/08/2013 3 7 0 0 4 2 9/12/2013 12/12/2013 0 3 0 0 5 3 9/01/2014 17/01/2014 0 2 0 0 6 3 9/01/2014 17/01/2014 2 4 0 1 7 3 9/01/2014 17/01/2014 4 8 1 0

However I can't seem to find a way to split the time intervals in an efficent way, survSplit and tmerge from the survival package don't seem to have functionality to do this. Can anyone give me some ideas, other than running a big ugly loop?

Update. Well I managed to do it after lots of head scratching. For those who are interested. This function finds what is considered a weekend by a hospital, ie. beginning on Friday night and ending early Monday morning. Of course you can edit to suit. This function returns Friday and Sunday, so that you can split on these days.

    is.weekend<-function (x) 
    {
     library(chron)
      if (!inherits(x, "dates")) 
    x<-as.chron(as.character(x))
    v <- month.day.year(x)
    h<-hours(x)
    out <- day.of.week(v$month, v$day, v$year) + 1
    # 1 is Sunday and 7 is Saturday, h is hours
  x<-((out == 6 & h >= 18) | out==7|out==1|(out == 2 & h < 6))
  return(x)
}

This is a simpler version of above to get the intervals

Basic function to identify the number of weekends starting on Saturday and ending on Sunday. d1 and d2 are admission and discharge date/times respectively.

getDuration <- function(d1, d2) {  
  myDays <- seq(d1,d2,by="hour")
  myDays[which(is.weekend(myDays))]
}

This function makes the time sequence for each record

survSeq.dh<-function(a,w){
  aa<-sort(c(a,as.POSIXct(w)))
  aa<-diff(aa)
  units(aa)<-"hours"
  aa<-as.numeric(aa)
  aa<-cumsum(aa)
  #Identify the start and end of weekends
  aa1<-which(diff(aa)!=1)
  aa1<-sort(c(aa1,aa1+1))
  aa1<-c(aa[1],aa[aa1],aa[length(aa)])/24
}

A bit of housekeeping

#Make a survSplit object 
#Create a start and stop time
dat$start<-0
dat$time<-as.numeric(dat$separation_datetime-dat$admission_datetime)/(60*24)
Event variable
dat$DIH<-dat$mode_of_separation=="Died in hospital"

The latest version of survival::survSplit creates a Surv object and this slows down the process considerably, so I use the old version.

New survSplit function in survival package 2.39-2 is too slow.

survSplit2<-function (data, cut, end, event, start, id = NULL, zero = 0, 
                      episode = NULL) 
{
  cut <- sort(cut)
  ntimes <- length(cut)
  n <- nrow(data)
  newdata <- lapply(data, rep, ntimes + 1)
  endtime <- rep(c(cut, Inf), each = n)
  eventtime <- newdata[[end]]
  if (start %in% names(data)) 
    starttime <- data[[start]]
  else starttime <- rep(zero, length.out = n)
  starttime <- c(starttime, pmax(starttime, rep(cut, each = n)))
  epi <- rep(0:ntimes, each = n)
  status <- ifelse(eventtime <= endtime & eventtime > starttime, 
                   newdata[[event]], 0)
  endtime <- pmin(endtime, eventtime)
  drop <- starttime >= endtime
  newdata <- do.call("data.frame", newdata)
  newdata[, start] <- starttime
  newdata[, end] <- endtime
  newdata[, event] <- status
  if (!is.null(id)) 
    newdata[, id] <- rep(rownames(data), ntimes + 1)
  if (!is.null(episode)) 
    newdata[, episode] <- epi
  newdata <- newdata[!drop, ]
  newdata
}

Then run in a script

Find weekend/after hours durations for each patient record

xx.s<-mapply(getDuration,dat$admission_datetime,dat$separation_datetime))

Define start and stop times for each weekend stay

xx.surv<-mapply(survSeq,dat$admission_datetime,xx.s)

Put the lot into a loop (sorry)

    lengthx<-dim(dat)[1]
    dat.l<-list()
   for(i in 1:lengthx){
      print(i)
    dat.l[[i]]<-survSplit2(dat[i,],cut=xx.surv[[i]],end="time",start="start",event="DIH")
    }
    library(data.table)
    dat.l<-data.frame(rbindlist(dat.l))

So now I have the basis of a way develop a time-dependent model that allows the person's hospital stay to switch between hazard functions as their stay alternates between the weekend and week day.

e.g. coxph(Surv(start,time,DIH)~DayOfWeek)

  • Can you talk a little bit more about what "split the time intervals" means? What is the input data, what is the mechanism and what is the expected result? Consider adding this information to your question. If you have problem coming up with an example, there's an [extensive thread](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) just for you. – Roman Luštrik Jun 20 '16 at 08:04
  • Thanks Roman, I've edited my question to reflect this. – user3602585 Jun 21 '16 at 22:06
  • I think it would be good to provide a reproducible example with this – Theodor Jun 24 '16 at 21:55
  • Thanks Theodor. I've addedd an example – user3602585 Jun 28 '16 at 04:19
  • Well I managed to do it after a bit of mucking around. For anyone else do variable time intervals. 'code'getDuration.dh <- function(d1, d2) { myDays <- seq(d1,d2,by="hour") myDays[which(is.weekend.dh(myDays))] } 'code' – user3602585 Aug 17 '16 at 02:24

0 Answers0