
I would like to calculate time differences (delta time) in R. The timestamps are stored in a two-column data frame, with the time as a date-time (year-month-day hour:min:sec.msec). For example, the first three rows are:

c_id    c_time
6875    2012-08-15 00:00:40.169
6874    2012-08-15 00:01:40.055
6876    2012-08-15 00:02:40.542

I would like the output to be a column of differences, e.g.

c_diff
0
00:01:0.886
00:01:0.487

Can someone please tell me how to do this? If you have other or better suggestions for how to store the result, they would be greatly appreciated. Thanks so much in advance! Mishu

Ista
Mishu
  • I get "unexpected numeric constant" error with diff(c_time). Also, how would I perform subsequent differences through all rows? thanks! – Mishu Mar 19 '13 at 17:16
  • Paste the output from `dput( yourdataframe )` into the question above so we can use your actual data. – Simon O'Hanlon Mar 19 '13 at 17:19
  • 2012-09-06 11:41:38.978, "2012-09-06 11:42:38.514", "2012-09-06 11:43:39.001", 2012-09-06 11:44:38.656, "2012-09-06 11:45:38.923", "2012-09-06 11:46:38.999", 2012-09-06 11:47:38.375, "2012-09-06 11:48:38.091", "2012-09-06 11:49:37.646", 2012-09-06 11:50:37.272), class = "factor")), .Names = c("c_id", c_timestamp), class = "data.frame", row.names = c(NA, -32487L)) – Mishu Mar 19 '13 at 17:29
  • The above are the last few rows of the output from dput(mydataframe). Thanks! – Mishu Mar 19 '13 at 17:30

3 Answers


Try this. I am assuming that your data are in a data.frame called mydf and that you want the difference between the first timestamp and all subsequent timestamps:

c_time <- as.POSIXlt( mydf$c_time )
difftime( c_time[1] , c_time[2:length(c_time)] )
  #Time differences in secs
  #[1]  -59.886 -120.373
  #attr(,"tzone")
  #[1] ""

Edit

But if you want the difference between consecutive timestamps, note that the first way round gives time1 - time2, which is negative. One way to get positive values is to reverse your observations, which returns the differences in reverse order (last pair first):

c_time <- rev( c_time )
difftime(c_time[1:(length(c_time)-1)] , c_time[2:length(c_time)])
  #Time differences in secs
  #[1] 60.487 59.886
  #attr(,"tzone")
  #[1] ""
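If the differences are needed in the original row order, each timestamp can instead be subtracted from the one that follows it. The following is a minimal sketch using the three sample rows from the question; the `tz = "UTC"` setting, the explicit `units = "secs"`, and the leading 0 for the first row are assumptions made for illustration, not part of the answer above.

```r
# Sample rows from the question; mydf and its column names follow the thread.
mydf <- data.frame(
  c_id   = c(6875, 6874, 6876),
  c_time = c("2012-08-15 00:00:40.169",
             "2012-08-15 00:01:40.055",
             "2012-08-15 00:02:40.542"),
  stringsAsFactors = FALSE
)

# Parse to POSIXct; fractional seconds are kept. tz = "UTC" is an assumption.
c_time <- as.POSIXct(mydf$c_time, tz = "UTC")
n <- length(c_time)

# Later timestamp minus earlier one, so the values are positive.
# Fixing the units avoids difftime() switching between secs and mins.
delta <- difftime(c_time[-1], c_time[-n], units = "secs")

# Pad the first row with 0, as in the desired output of the question.
mydf$c_diff <- c(0, as.numeric(delta))
print(mydf)
```

Keeping `c_diff` as plain numeric seconds makes it easy to summarise or plot later; it can be formatted for display separately.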
Simon O'Hanlon

I will not give you the entire answer, but this will help you almost get there:

x <- "2012-07-11 04:22:40.169"
# strptime() converts the date string into a date-time value recognized by R;
# %OS (rather than %S) keeps the fractional seconds
datex <- strptime(x, format = "%Y-%m-%d %H:%M:%OS")

y <- "2012-08-15 08:32:40.169"
datey <- strptime(y, format = "%Y-%m-%d %H:%M:%OS")

time_diff <- as.numeric(difftime(datey, datex, units = "days")) # in decimal days
time_diff
#[1] 35.17361

From decimal days you can then convert it back to whichever time format you want, but depending on what you want to do with it, you may want to keep it in a numeric form (perhaps in decimal hours, by multiplying time_diff by 24)...
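As one possible way to convert a numeric difference back into a clock-style string like the one asked for in the question, the sketch below formats a number of seconds as HH:MM:SS.mmm. The helper name `format_hms` is hypothetical, and it assumes non-negative input in seconds (a value in decimal days would first be multiplied by 86400).

```r
# Hypothetical helper: format non-negative seconds as "HH:MM:SS.mmm".
format_hms <- function(secs) {
  secs <- round(secs, 3)            # round to milliseconds before splitting
  h <- secs %/% 3600                # whole hours
  m <- (secs %% 3600) %/% 60        # whole minutes within the hour
  s <- secs %% 60                   # remaining seconds, with fraction
  sprintf("%02d:%02d:%06.3f", as.integer(h), as.integer(m), s)
}

format_hms(c(0, 59.886, 60.487))
#[1] "00:00:00.000" "00:00:59.886" "00:01:00.487"
```

The formatted strings are for display only; for further arithmetic it is usually simpler to keep the numeric column.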

Lucas Fortini

One of the most versatile ways to create a new column that subtracts consecutive rows from another column is to combine dplyr::mutate and dplyr::lag.

library(dplyr)
df <- df %>%
  mutate(c_time = as.POSIXct(c_time), # convert from factor/character first
         c_diff = c_time - lag(c_time, 1))

I say versatile because this is not specific to times: it works with any variable that supports subtraction. For example, if you had position data along a highway, you could calculate the change in kilometers or miles using the same code. Note that the first row has no previous value, so its difference is NA. And if you use dplyr::group_by, you can do the task separately for different groups (e.g., restart the calculation for each trial or individual in a long dataset).
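The grouped variant can be sketched as follows, assuming dplyr is installed. The `trips` data frame and its `trial` and `km` columns are made up for illustration and do not come from the question.

```r
library(dplyr)

# Hypothetical position data: two trials measured along a highway.
trips <- data.frame(
  trial = c("a", "a", "a", "b", "b"),
  km    = c(0, 1.5, 4.0, 10, 12.5)
)

trips <- trips %>%
  group_by(trial) %>%                    # restart the lag within each trial
  mutate(km_diff = km - lag(km)) %>%     # first row of each trial becomes NA
  ungroup()

print(trips)
```

Because `lag()` is applied within each group, the first row of trial "b" is NA rather than the difference from the last row of trial "a".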

Tanner33