
In R, I have a data frame with a time column that I want to reset to one (or zero) every time the text in another column changes. The text never returns to a previous value. The text changes 395 times across ~270,000 rows.

Table at the moment

Time Trial
11 A
12 A
13 B
14 B

Table wanted

Time Trial
1 A
2 A
1 B
2 B
Dylan Egan

2 Answers


1) A base solution would be:

transform(DF, Time = ave(Time, Trial, FUN = seq_along))

2) Another base solution would be:

transform(DF, Time = 1:nrow(DF) - match(Trial, Trial) + 1)

3) With dplyr we can write:

library(dplyr)
DF %>%
  group_by(Trial) %>%
  mutate(Time = 1:n()) %>%
  ungroup
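
Solution 2 is the least obvious of the three, so here is a small sketch of why it works. It rebuilds the input from the Note so that it runs on its own. The `recurring` vector and the `rle()`/`sequence()` variant are illustrative additions, not part of the answer above; they only show what happens if the "text never goes back" assumption from the question does not hold.

```r
# Input from the answer's Note, rebuilt so the block runs on its own.
DF <- data.frame(Time = 11:14, Trial = c("A", "A", "B", "B"))

# match(x, x) gives, for every row, the index of the first row with that label.
first_row <- match(DF$Trial, DF$Trial)                # 1 1 3 3

# Row number minus the group's first row, plus one, is a 1-based counter.
one_based <- seq_along(DF$Trial) - first_row + 1L     # 1 2 1 2

# Dropping the "+ 1" gives the zero-based variant the question also allows.
zero_based <- seq_along(DF$Trial) - first_row         # 0 1 0 1

# Hypothetical input in which a label recurs: the match trick does not reset
# on the second run of "A", because it still points at row 1.
recurring <- c("A", "A", "B", "B", "A")
by_match <- seq_along(recurring) - match(recurring, recurring) + 1L   # 1 2 1 2 5

# Counting within consecutive runs resets on every change of label instead.
by_run <- sequence(rle(recurring)$lengths)            # 1 2 1 2 1
```

For the data described in the question, where a label never recurs, all of these approaches should agree; the run-based form only matters if that guarantee is dropped.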

Benchmark

The base solutions are much faster on the small data in the question, but I suggest you repeat the benchmark with your own data or a subset of it.

library(dplyr)
library(microbenchmark)

microbenchmark(
  base1 = transform(DF, Time = ave(Time, Trial, FUN = seq_along)),
  base2 = transform(DF, Time = 1:nrow(DF) - match(Trial, Trial) + 1),
  dplyr = DF %>% group_by(Trial) %>% mutate(Time = 1:n()) %>% ungroup
)
## Unit: microseconds
##   expr     min      lq      mean   median       uq      max neval cld
##  base1   555.8   578.5   654.702   626.95   692.95   1345.7   100  a 
##  base2   308.5   330.2   415.610   391.75   410.30    950.6   100  a 
##  dplyr 11076.4 11354.9 13846.736 11543.80 11751.65 101861.9   100   b
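
To get a feel for timings at the scale mentioned in the question, one option is to build synthetic data of roughly that shape and time the two base solutions on it. The following is only a sketch: the group sizes, the label format, and the names `big`, `base1` and `base2` are assumptions made for illustration, and it uses `system.time()` from base R rather than microbenchmark so that it has no package dependencies. The dplyr pipeline can be timed the same way.

```r
# Synthetic data shaped like the question: 396 distinct labels (395 changes),
# roughly 270,000 rows, and no label ever recurring. Sizes are illustrative.
set.seed(1)
n_groups <- 396
sizes <- sample(600:760, n_groups, replace = TRUE)
big <- data.frame(
  Time  = seq_len(sum(sizes)) + 10L,
  Trial = rep(sprintf("T%03d", seq_len(n_groups)), times = sizes)
)

# The two base solutions from above, wrapped as functions.
base1 <- function(d) transform(d, Time = ave(Time, Trial, FUN = seq_along))
base2 <- function(d) transform(d, Time = seq_len(nrow(d)) - match(Trial, Trial) + 1L)

# Check that both give the same counter before comparing speed.
res1 <- base1(big)
res2 <- base2(big)
stopifnot(all(res1$Time == res2$Time))

# Elapsed times depend on the machine, so no expected output is shown.
system.time(base1(big))
system.time(base2(big))
```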

Note

Input in reproducible form.

DF <- structure(list(Time = 11:14, Trial = c("A", "A", "B", "B")), 
  class = "data.frame", row.names = c(NA, -4L))
G. Grothendieck

We could group by Trial and then use row_number():

library(dplyr)

df %>% 
  group_by(Trial) %>% 
  mutate(Time = row_number()) %>%
  ungroup()
##    Time Trial
##   <int> <chr>
## 1     1 A
## 2     2 A
## 3     1 B
## 4     2 B
TarJae