
Is there an easy way to calculate durations? I have a dataset in which a parameter, called m, switches between the values -1 and 1 over time. I want to calculate:

  1. The total duration (time in hours) of the cases where m=-1 and where m=1, respectively
  2. The length of each individual period during which m=-1 or m=1

    m<-c(1,1,1,1,-1,-1,-1,-1,1,1,1,1,1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1)

    Time <- seq.POSIXt(as.POSIXct(Sys.Date()), as.POSIXct(Sys.Date()+1), by = "1 hour")

user4631839
  • It is definitely possible. Can you give a sample of your data so we can help you with that? – Dominic Comtois Mar 12 '15 at 08:51
  • Have you converted your Time variable to proper format e.g. `df$Time <- as.POSIXct(df$Time)` ? When it's done - you can simply subtract time variables to get difftime. – statespace Mar 12 '15 at 08:52
  • Add [reproducible sample data](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) to your question. – zx8754 Mar 12 '15 at 09:01
  • I've tried to add a sample. Is it OK the way I did it? So much to learn... – user4631839 Mar 12 '15 at 23:43

1 Answer


I'd use package data.table for "split-apply-combine" and identify the runs using cumsum and diff:

DF <- read.table(text="Time,    m
2015-01-01 00:00,    -1
2015-01-01 01:00,    -1
2015-01-01 02:00,    -1
2015-01-01 03:00,    1
2015-01-01 04:00,    1
2015-01-01 05:00,    1
2015-01-01 06:00,    1
2015-01-01 07:00,    1
2015-01-01 08:00,    -1
2015-01-01 09:00,    -1
2015-01-01 10:00,    -1
2015-01-01 11:00,    -1
2015-01-01 12:00,    1
2015-01-01 13:00,    1
2015-01-01 14:00,    1
2015-01-01 15:00,    -1", header = TRUE, sep =",")

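library(data.table)
setDT(DF)  # convert the data.frame to a data.table by reference
# parse the timestamps
DF[, Time := as.POSIXct(Time, format = "%Y-%m-%d %H:%M", tz = "GMT")]
# run id: increases by 1 each time m changes value
DF[, run := cumsum(c(1, diff(m) != 0))]

# one row per run: its m value and the time from first to last observation
DF1 <- DF[, list(m = unique(m), 
                 duration = difftime(max(Time), min(Time), units = "mins")), 
          by = run]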
#   run  m duration
#1:   1 -1 120 mins
#2:   2  1 240 mins
#3:   3 -1 180 mins
#4:   4  1 120 mins
#5:   5 -1   0 mins

DF1[, sum(duration), by = m]
#    m  V1
#1: -1 300
#2:  1 360
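
To see how the run index is built, here is a minimal base R sketch applied to the `m` vector from the question. It is only an illustration, not part of the data.table solution above; the column name `n_obs` is made up for this example, and `rle()` is shown as a cross-check of the same runs.

```r
# Sample vector from the question
m <- c(1,1,1,1,-1,-1,-1,-1,1,1,1,1,1,-1,-1,-1,-1,-1,1,1,1,1,1,1,1)

# diff(m) != 0 is TRUE wherever the value changes;
# prepending 1 starts the first run, and cumsum turns changes into run ids
run <- cumsum(c(1, diff(m) != 0))

# rle() describes the same runs as (value, length) pairs
r <- rle(m)
runs <- data.frame(run = seq_along(r$values), m = r$values, n_obs = r$lengths)
runs
```

Note that `n_obs` counts observations, not elapsed time. With hourly samples, the `difftime(max(Time), min(Time))` approach used above yields `n_obs - 1` hours per run, which is why a run with a single observation has a duration of 0.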
Roland
  • Great! It took me some time to understand this. Now I have studied the data.table package more and I think I understand. However, I would also like to compute the cumulative time for m=1 and m=-1 respectively, but I can't figure out how to do it. Do you have any suggestions? – user4631839 Apr 30 '15 at 19:38
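
If "cumulative time" here means a running total of the run durations within each value of m, one possible sketch in base R follows. It assumes the per-run durations (in minutes) printed in the answer, rebuilt as a plain data frame so the example is self-contained; the column name `cum_duration` is hypothetical.

```r
# Per-run durations as printed in the answer (minutes)
DF1 <- data.frame(run = 1:5,
                  m = c(-1, 1, -1, 1, -1),
                  duration = c(120, 240, 180, 120, 0))

# Running total of duration within each value of m, in run order
DF1$cum_duration <- ave(DF1$duration, DF1$m, FUN = cumsum)
DF1
```

With the answer's data.table object, the equivalent should be `DF1[, cum_duration := cumsum(as.numeric(duration)), by = m]`, where `as.numeric()` drops the difftime class before summing.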