
I have a simple dataset with an id variable and a date variable, and would like to create a counter variable (counter) that increments whenever date changes within an id. Assume the data are sorted by id and date, and that a specific date may appear any number of times within an id. This is easily done in other languages (SAS with retain, or Stata with by: and _n/_N), but I haven't found an efficient way to do it in R.

Final data: (image not available)

Drew
C. Johnson
  • Do not post your data as an image; please learn how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610) – Jaap Dec 08 '15 at 19:27
  • Isn't this just `as.numeric(factor(df1$date, unique(df1$date)))` by id? – rawr Dec 08 '15 at 20:03
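
A minimal base R sketch of the idea in the last comment, assuming the same example values as the `df1` used in the answers. Applying it per id with `ave()` over row indices is my own choice rather than anything the commenter specified, and it matches a change counter only because the data are sorted, so a date never reappears later within an id.

```r
# Example data (same values as the df1 used in the answers)
df1 <- data.frame(id = rep(c(1, 2), c(4, 3)),
                  date = c('a', 'a', 'b', 'b', 'a', 'a', 'b'),
                  stringsAsFactors = FALSE)

# ave() is given row indices (not the character dates) so the result stays integer
df1$counter <- ave(seq_along(df1$date), df1$id, FUN = function(i) {
  d <- df1$date[i]
  # Level order follows first appearance, so the integer codes count distinct dates
  as.integer(factor(d, levels = unique(d)))
})

df1$counter
# [1] 1 1 2 2 1 1 2
```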

2 Answers


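We can group by id and take the cumulative sum of a logical vector that is TRUE wherever date differs from the previous row: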

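library(dplyr)
df1 %>%
    group_by(id) %>%
    # TRUE for the first row of each id and for every row whose date differs
    # from the previous row; cumsum() turns those flags into a counter
    mutate(counter = cumsum(c(TRUE, date[-1] != date[-n()])))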
#      id  date counter
#   (dbl) (chr)   (int)
#1     1     a       1
#2     1     a       1
#3     1     b       2
#4     1     b       2
#5     2     a       1
#6     2     a       1
#7     2     b       2

data

df1 <- data.frame(id= rep(c(1,2), c(4,3)), date= c('a', 'a', 
    'b', 'b', 'a', 'a', 'b'), stringsAsFactors=FALSE)
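
To see why the `cumsum()` expression works, here is a small base R sketch of the same steps on the dates of a single id. The names `changed` and `counter` are just illustrative, and `length(date)` stands in for dplyr's `n()`, which only works inside verbs such as `mutate()`.

```r
date <- c('a', 'a', 'b', 'b')   # dates for a single id

# Compare each element (from the 2nd on) with its predecessor;
# the first element always starts a new run, hence the leading TRUE
changed <- c(TRUE, date[-1] != date[-length(date)])
changed
# [1]  TRUE FALSE  TRUE FALSE

# Each TRUE adds 1, so the running total increments at every change
counter <- cumsum(changed)
counter
# [1] 1 1 2 2
```

With a single-row group, both `date[-1]` and `date[-length(date)]` are empty, so the result is just `1`.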
akrun

You could also use data.table and its `rleid` function, which numbers consecutive runs of identical values:

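library(data.table)

dat <- data.table(id = rep(c(1, 2), c(4, 3)),
                  date = c('a', 'a', 'b', 'b', 'a', 'a', 'b'))

# rleid() gives each run of identical consecutive dates its own id;
# by = id restarts the numbering for every id
dat[, counter := rleid(date), by = id]
dat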
   id date counter
1:  1    a       1
2:  1    a       1
3:  1    b       2
4:  1    b       2
5:  2    a       1
6:  2    a       1
7:  2    b       2
Heroka