I have a dataframe from a psychology experiment with the time since the beginning of the experiment for each subject, and I want to derive from it the time since the beginning of each trial for each subject. To do so I'm basically just subtracting the minimum time value for each trial/subject from all the values for that same trial/subject.

I'm currently doing it with two for loops, and I was wondering if there's a way to vectorise it. What I have at the moment:

for (s in 1:max(df$Subject)){
  subject <- df[df$Subject==s,]
  for (t in 1:max(subject$TrialId)){
    trial <- subject[subject$TrialId==t,]
    start_offset <- min(trial$timestamp)
    # Row mask for this subject/trial pair
    rows <- df$Subject==s & df$TrialId==t
    # Keep the subtraction on one line: a line that starts with `-`
    # is parsed by R as a separate expression
    df$timestamp[rows] <- df$timestamp[rows] - start_offset
  }
}

And what I would like is something like

df$timestamp <- df$timestamp - min_per_trial_per_subject(df$timestamp)
Arthur Spoon

1 Answer

With dplyr, group by subject and trial, then subtract each group's minimum:

library(dplyr)
df %>% group_by(Subject, TrialId) %>%
  mutate(modified_timestamp = timestamp - min(timestamp))

Should work. If it doesn't, please share a reproducible example so we can test.
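
For a base R route closer to the one-liner sketched in the question, `ave()` applies a function within groups and returns a vector aligned with the original rows. A minimal sketch, assuming a data frame with the question's `Subject`, `TrialId` and `timestamp` columns; the sample values and the `trial_time` column name are made up for illustration:

```r
# Toy data; the values are invented for illustration only
df <- data.frame(
  Subject   = c(1, 1, 1, 1, 2, 2),
  TrialId   = c(1, 1, 2, 2, 1, 1),
  timestamp = c(10, 12, 30, 35, 5, 9)
)

# ave() computes min() within each Subject/TrialId group and returns
# a vector the same length as df$timestamp, so the subtraction is
# done row by row without explicit loops
df$trial_time <- df$timestamp -
  ave(df$timestamp, df$Subject, df$TrialId, FUN = min)
```

Note that `FUN` has to be passed by name, since the unnamed arguments after the first are treated as grouping variables.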

Gregor Thomas
  • Works just fine! Although you still need to put `df <-` at the beginning of your line (not familiar with the `%>%` notation so didn't know that). Thanks a lot! – Arthur Spoon Nov 08 '17 at 16:17
  • I don't like to make assumptions about assignment - you might prefer `df_modified <- ...` – Gregor Thomas Nov 08 '17 at 16:20
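
To illustrate the point about assignment in the comments: a dplyr pipeline returns a new data frame and leaves the original untouched until the result is assigned to a name. A small sketch, using invented sample values and the `df_modified` name suggested above; the trailing `ungroup()` is an optional addition, not part of the original answer:

```r
library(dplyr)

# Toy data; the values are invented for illustration only
df <- data.frame(
  Subject   = c(1, 1, 2, 2),
  TrialId   = c(1, 1, 1, 1),
  timestamp = c(10, 12, 5, 9)
)

# The pipeline returns a new data frame; df itself is unchanged
# unless the result is assigned back to it
df_modified <- df %>%
  group_by(Subject, TrialId) %>%
  mutate(modified_timestamp = timestamp - min(timestamp)) %>%
  ungroup()  # drop the grouping so later operations are not per-trial
```

Assigning to `df` instead of `df_modified` would overwrite the original data frame in place of keeping both.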