1

I want to find the difference between two time stamps based on finding the time stamp in one column based on a condition, "Start", and then finding the time stamp for the first row that meets another condition in that same column, "Stop". Basically we used a program to "start" a behavior and "stop" a behavior so we could calculate the duration of the behavior.

I've tried adapting the code found in this post: subtract value from previous row by group

but I can't figure out how to get the lead to work on meeting a condition in upcoming rows for the same column. It's complicated by the fact that there can be "event" behaviors that have a "start" but no "stop". Example data frame.

Data
Behavior             Modifier_1           Time_relative_s              
BodyLength           Start                122.11      
Growl                Start                129.70
Body Length          Stop                 132.26      
Body Length          Start                157.79      
Body Length          Stop                 258.85      
Body Length          Start                270.12    
Bark                 Start                272.26
Growl                Start                275.68
Body Length          Stop                 295.37

and I want this:

Behavior             Modifier_1           Time_relative_s       diff             
BodyLength           Start                122.11                10.15
Growl                Start                129.70                 
Body Length          Stop                 132.26                
Body Length          Start                157.79                101.06  
Body Length          Stop                 258.85      
Body Length          Start                270.12                25.25    
Bark                 Start                272.26
Growl                Start                275.68
Body Length          Stop                 295.37

I've tried using dplyr pipes:

test<-u%>%
    filter(Modifier_1 %in% c("Start","Stop")) %>%
    arrange(Time_Relative_s) %>%
    mutate(diff = lead(Time_Relative_s, default = first(Time_Relative_s=="Stop")-Time-Relative_s)

But I must not be using lead right because this just returns the Time_Relative_s for me in the diff column. Any suggestions? Thanks for the help!

  • Could you please provide your date frame or some [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? – bbiasi May 02 '19 at 00:28
  • 1
    The example data was kindly provided by @akrun below. I don't really have a reproducible example of things going wrong other than the code I put above on what I've tried (but just doesn't return anything. I guess I was hoping to get some suggestions from people (using lag/lead or anything else). – Meghan Martin May 02 '19 at 15:37

1 Answers1

2

We may need to create a grouping variable based on the occurrence of 'stop' and then get the difference of 'Time_relative_s' that corresponds to the location of first 'Start', 'Stop' values in 'Modifier_1'

library(dplyr)
df1 %>% 
   group_by(grp = cumsum(lag(Modifier_1 == "Stop", default = FALSE))) %>% 
   mutate(diff = Time_relative_s[match("Stop", Modifier_1)] - 
                  Time_relative_s[match("Start", Modifier_1)], 
          diff = replace(diff, row_number() > 1, NA_real_)) %>%
   ungroup %>%
   select(-grp)
# A tibble: 9 x 4
#  Behavior    Modifier_1 Time_relative_s  diff
#  <chr>       <chr>                <dbl> <dbl>
#1 BodyLength  Start                 122.  10.1
#2 Growl       Start                 130.  NA  
#3 Body Length Stop                  132.  NA  
#4 Body Length Start                 158. 101. 
#5 Body Length Stop                  259.  NA  
#6 Body Length Start                 270.  25.2
#7 Bark        Start                 272.  NA  
#8 Growl       Start                 276.  NA  
#9 Body Length Stop                  295.  NA  

data

df1 <- structure(list(Behavior = c("BodyLength", "Growl", "Body Length", 
"Body Length", "Body Length", "Body Length", "Bark", "Growl", 
"Body Length"), Modifier_1 = c("Start", "Start", "Stop", "Start", 
"Stop", "Start", "Start", "Start", "Stop"), Time_relative_s = c(122.11, 
129.7, 132.26, 157.79, 258.85, 270.12, 272.26, 275.68, 295.37
)), row.names = c(NA, -9L), class = "data.frame")
akrun
  • 874,273
  • 37
  • 540
  • 662