1

Following the pointers from this question.

I'd like to calculate the cumulative time for all the Cats, by considering their respective last toggle status.

EDIT: I'd also want to check if the FIRST Toggle status of a Cat is Off and if it is so, for that specific cat, the time from midnight 00:00:00 till this first FIRST Off time should be added to its total conditional cumulative ontime.

Sample data:

       Time Cat Toggle
1  05:12:09  36 On
2  05:12:12 26R Off # First Toggle of this Cat happens to be Off, Condition met
3  05:12:15 26R On
4  05:12:16 26R Off
5  05:12:18  99 Off # Condition met
6  05:12:18  99 On
7  05:12:24  36 Off
8  05:12:26  36 On
9  05:12:29  80 Off # Condition met
10 05:12:30  99 Off
11 05:12:31  95 Off # Condition met
12 05:12:32  36 Off

Desired sample output:

  Cat Time(Secs)
1 36  21
2 26R 18733 # (=1+18732), 18732 secs to be added = total Sec from midnight till 05:12:12
3 99  18750 # (=12+18738), 18738 secs to be added = total Sec from midnight till 05:12:18
4 ..  ..

Any sort of help is appreciated.

3 Answers3

2

One can use as.difftime function to convert time from H:M:S format to seconds. Then for each On statue find the lead record in order to calculate interval of time lapsed from On.

library(dplyr)

# Convert Time in seconds.
df %>% mutate(Time = as.difftime(Time, units = "secs")) %>%
  group_by(Cat) %>%
  mutate(TimeInterVal = ifelse(Toggle == "On", (lead(Time) - Time), 0)) %>%
  summarise(TimeInterVal = sum(TimeInterVal))


# # A tibble: 5 x 2
#   Cat   TimeInterVal
#   <chr>        <dbl>
# 1 26R           1.00
# 2 36           21.0 
# 3 80            0   
# 4 95            0   
# 5 99           12.0 

Note: On can consider arranging data on Time ensure rows are ordered on time.

Data:

df <- read.table(text ="
Time Cat Toggle
1  05:12:09  36 On
2  05:12:12 26R Off
3  05:12:15 26R On
4  05:12:16 26R Off
5  05:12:18  99 Off
6  05:12:18  99 On
7  05:12:24  36 Off
8  05:12:26  36 On
9  05:12:29  80 Off
10 05:12:30  99 Off
11 05:12:31  95 Off
12 05:12:32  36 Off",
header = TRUE, stringsAsFactors = FALSE)
MKR
  • 19,739
  • 4
  • 23
  • 33
  • Thanks MKR for quick solution, +1. Till the point of `TimeInterVal` creation, there weren't issues although in `summarise` it produced `NA`s too often, tried handling those but to avail hard luck – Keyshov Borate Jul 01 '18 at 08:45
  • @KeyshovBorate Just change the `summarise` as `summarise(TimeInterVal = sum(TimeInterVal, na.rm = TRUE))` – MKR Jul 01 '18 at 09:03
2

using base R:

df$Time=as.POSIXct(df$Time,,"%H:%M:%S")

stack(by(df,df$Cat,function(x)sum(c(0,diff(x$Time))*(x$Toggle=="Off"))))

  values ind
1      1 26R
2     21  36
3      0  80
4      0  95
5     12  99
Onyambu
  • 67,392
  • 3
  • 24
  • 53
1

A possible solution using :

# load the 'data.table'-package, convert 'df' to a 'data.table'
# and 'Time'-column to a time-format
library(data.table)
setDT(df)[, Time := as.ITime(Time)]

# calculate the time-difference
df[, .(time.diff = sum((shift(Time, type = 'lead') - Time) * (Toggle == 'On'), na.rm = TRUE))
   , by = Cat]

which gives:

   Cat time.diff
1:  36        21
2: 26R         1
3:  99        12
4:  80         0
5:  95         0

In respons to your question in the comments, you could do:

# create a new data.table with midnigth times for the categories where
# the first 'Toggle' is on "Off"
df0 <- df[, .I[first(Toggle) == "Off"], by = Cat
          ][, .(Time = as.ITime("00:00:00"), Cat = unique(Cat), Toggle = "On")]

# bind that to the original data.table; order on 'Cat' and 'Time'
# and then do the same calculation
rbind(df, df0)[order(Cat, Time)
               ][, .(time.diff = sum((shift(Time, type = 'lead') - Time) * (Toggle == 'On'), na.rm = TRUE))
                                 , by = Cat]

which gives:

   Cat time.diff
1: 26R     18733
2:  36        21
3:  80     18749
4:  95     18751
5:  99     18750

An alternative with base R (only original question):

df$Time <- as.POSIXct(df$Time, format = "%H:%M:%S")

stack(sapply(split(df, df$Cat),
             function(x) sum(diff(x[["Time"]]) * (head(x[["Toggle"]],-1) == 'On'))))

which gives:

  values ind
1      1 26R
2     21  36
3      0  80
4      0  95
5     12  99

Or with the (only original question):

library(dplyr)
library(lubridate)

df %>% 
  mutate(Time = lubridate::hms(Time)) %>% 
  group_by(Cat) %>% 
  summarise(time.diff = sum(diff(Time) * (head(Toggle, -1) == 'On'),
                            na.rm = TRUE))
Jaap
  • 81,064
  • 34
  • 182
  • 193
  • Thanks Jaap. While creating `time.diff`, how could I check if the First `Toggle` for the `Cat` is `Off` and if so how could I add 'time from midnight to its First Off time'. – Keyshov Borate Jul 02 '18 at 12:47
  • 1
    @KeyshovBorate Do you mean that for `Cat == 26R` the first `Toggle` has to be set to `On` and the corresponding `Time` to zero? Could you include the desired output based on above example data? – Jaap Jul 02 '18 at 13:43
  • Sure let me add. No. It's basically adding the time from midnight for all the `Cat`s which has its Very First toggle as `Off` - Because that's going to be the first time it has been off since midnight - the time which we must consider as its total ontime. – Keyshov Borate Jul 02 '18 at 14:37
  • Great to know Jaap :) – Keyshov Borate Jul 02 '18 at 16:54