I have a dataset that looks like this:
id land datetime
pb1 0 2004-04-05 01:44:00
pb1 1 2004-04-05 02:00:00
pb1 1 2004-04-05 16:00:00
pb2 1 2004-04-05 18:01:00
pb2 1 2004-04-05 20:00:00
library(data.table)
DT = data.table(
id = c("pb1", "pb1", "pb1", "pb2", "pb2"),
land = c(0L, 1L, 1L, 1L, 1L),
datetime = sprintf("2004-04-05 %02d:%02d:00",
c(1, 2, 16, 18, 20),
c(44, 0, 0, 1, 0))
)
I would like to make a column that cumulatively adds time (in days) but ONLY if there is a '1' in the land
column. I also would like the count to reset when the id
changes.
I have tried a variety of methods using data.table
, rleid
, and even a nested for
loop with no success. I have gotten errors using code like this:
DT[, total :=land*diff(as.numeric(datetime)), .(id, rleid(land))]
I have tried variations of the solution here: Calculating cumulative time in R
I'm not sure the best way to calculate the time interval (no success with difftime
or lubridate
).
I want the end result to look like this:
id land datetime cumtime.land
pb1 0 2004-04-05 01:44:00 0
pb1 1 2004-04-05 02:00:00 0
pb1 1 2004-04-06 16:00:00 1.58333
pb2 1 2004-04-05 18:00:00 0
pb2 1 2004-04-05 20:00:00 0.08333