Time difference between dates of adjacent columns in R

Question

I'm trying to adapt the answer to my previous question (Difference between dates in many columns in R). I've realised I only want the time difference between a given column, and the column immediately to it's left. Example for clarification:

df <- data.frame(
  Group=c("A","B"),
  ID=c(1,2),
  Date1=as.POSIXct(c('2016-04-25 09:15:29','2016-04-25 09:15:29')),
  Date2=as.POSIXct(c('2016-04-25 14:01:19','2016-04-25 14:01:19')),
  Date3=as.POSIXct(c('2016-04-26 13:28:19','2016-04-26 13:28:19')),
  stringsAsFactors=F
)

My desired output is Date2-Date1 and Date3-Date2. And this of course would extend for many columns i.e. Date4-Date3 etc. But I do not need Date3-Date1. To clarify, how can I automate this for many columns

df$Date2_Date1 <- difftime(df$Date2,df$Date1, units = c("hours"))
df$Date3_Date2 <- difftime(df$Date3,df$Date2, units = c("hours"))

Thanks to @bgoldst for the original answer. I think I just need to adapt cmb below to have the correct sequence:

cmb <- combn(seq_len(ncol(df)-1L)+1L,2L);
res <- abs(apply(cmb,2L,function(x) difftime(df[[x[1L]]],df[[x[2L]]],units='hours')));
colnames(res) <- apply(cmb,2L,function(x,cns) paste0(cns[x[1L]],'_',cns[x[2L]]),names(df))

Thanks

score 2 · Accepted Answer · answered May 20 '16 at 08:31

Given your example, this should to the trick:

df <- data.frame(
  Group=c("A","B"),
  ID=c(1,2),
  Date1=as.POSIXct(c('2016-04-25 09:15:29','2016-04-25 09:15:29')),
  Date2=as.POSIXct(c('2016-04-25 14:01:19','2016-04-25 14:01:19')),
  Date3=as.POSIXct(c('2016-04-26 13:28:19','2016-04-26 13:28:19')),
  stringsAsFactors=F
)
mapply(difftime, df[, 4:5], df[, 3:4], units = "hours")

> Date2 Date3
> [1,] 4.763889 23.45
> [2,] 4.763889 23.45

In my call mapply applies function difftime to the two arrays provided, so it starts with df[, 4] - df[, 3], then df[, 5] - df[, 4]. You of course have to change this with the column numbers for your dates, and make sure they are ordered in the right way.

Good luck!

You can find the appropriate column with `which` and then calculate adjacent column by simply adding/subtracting up/down the data.frame. — Roman Luštrik, May 20 '16 at 08:58
Thanks Jasper, this worked very nicely. Roman, would you mind clarifying with a little bit of example code so I can try. — Pete900, May 20 '16 at 09:07

score 1 · Answer 2 · answered May 20 '16 at 08:28

You could use Non-Standard Evaluation:

First you create a character vector with the name of the columns containing the dates. So let' say all the columns starting with 'Date'
```
dates = names(df)[grepl("^Date", names(df))]
```

We create a list of formulas that dynamically calculate the difference between to adjacent columns:

all_operations = lapply(seq_len(length(dates) - 1), function(i){
    as.formula(paste("~difftime(", dates[i + 1], ",", dates[i],", units = c('hours'))"))
})

this will create the formulas:

[[1]]:  ~difftime(Date2, Date1, units = c("hours"))
[[2]]:  ~difftime(Date3, Date2, units = c("hours"))

Then you can use dplyr's NSE mutate_ to apply the dynamic formulas generated above:

df %>%
   mutate_(.dots = setNames(all_operations, paste0("Diff", seq_len(length(dates) - 1))))

Thanks Lorenzo. I wish I could accept two answers. This also worked very nicely! The only reason I accepted Jaspers is because I understood it more which is perhaps not a good reason. — Pete900, May 20 '16 at 09:13

Time difference between dates of adjacent columns in R

2 Answers2