Create a running counter variable within by group

Question

I have a simple dataset with an id variable and a date variable, and would like to create a counter variable (counter) that increments whenever date changes within an id. Assume the data are sorted by id and date, and that a given date may appear any number of times within an id. This is easy in other languages (SAS with retain, or Stata with by: and _n/_N), but I haven't found an efficient way to do it in R.

Final data:

Do not post your data as an image, please learn how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610) — Jaap, Dec 08 '15 at 19:27
Isn't this just `as.numeric(factor(df1$date, unique(df1$date)))` by id? — rawr, Dec 08 '15 at 20:03
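
The suggestion in the comment above can be applied per id in base R with ave(). This is only a minimal sketch: it rebuilds the df1 from the accepted answer so that it runs on its own, and the helper name counter_by_first_seen is hypothetical. It numbers dates by order of first appearance, so it matches a change counter only when all rows of a date are contiguous within an id, which holds for data sorted by id and date.

```r
df1 <- data.frame(id = rep(c(1, 2), c(4, 3)),
                  date = c("a", "a", "b", "b", "a", "a", "b"),
                  stringsAsFactors = FALSE)

# hypothetical helper: number each value by the order in which it first appears
counter_by_first_seen <- function(x) as.integer(factor(x, levels = unique(x)))

# ave() returns a character vector here because df1$date is character,
# so the result is converted back to integer
df1$counter <- as.integer(ave(df1$date, df1$id, FUN = counter_by_first_seen))
df1$counter
# [1] 1 1 2 2 1 1 2
```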

akrun · Accepted Answer · 2015-12-08T19:13:34.413

3

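One option is to flag each row whose date differs from the previous row within the id, then take the cumulative sum of those flags: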

library(dplyr)

# TRUE on the first row of each id and wherever date differs from the row before it
df1 %>%
    group_by(id) %>%
    mutate(counter = cumsum(c(TRUE, date[-1] != date[-n()])))
#      id  date counter
#   (dbl) (chr)   (int)
#1     1     a       1
#2     1     a       1
#3     1     b       2
#4     1     b       2
#5     2     a       1
#6     2     a       1
#7     2     b       2
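
To see why the cumulative sum works, it may help to run the same expression on a single group in base R. This is a sketch on a made-up vector; the name changed is only illustrative and length(date) stands in for n().

```r
date <- c("a", "a", "b", "b")

# compare every element with its predecessor; the first element always starts a run
changed <- c(TRUE, date[-1] != date[-length(date)])
changed
# [1]  TRUE FALSE  TRUE FALSE

# each TRUE adds one, so the running total is the counter
cumsum(changed)
# [1] 1 1 2 2
```

A group with a single row still works, because the comparison is then empty and only the leading TRUE remains.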

data

df1 <- data.frame(id= rep(c(1,2), c(4,3)), date= c('a', 'a', 
    'b', 'b', 'a', 'a', 'b'), stringsAsFactors=FALSE)

edited Dec 08 '15 at 19:13

answered Dec 08 '15 at 19:05


Exactly what I needed. Thanks! – C. Johnson Dec 08 '15 at 19:13

score 1 · Answer 2 · answered Dec 08 '15 at 19:23

You could also use data.table and its rleid() function, which numbers consecutive runs of identical values:

library(data.table)

dat <- data.table(id = rep(c(1, 2), c(4, 3)),
                  date = c('a', 'a', 'b', 'b', 'a', 'a', 'b'))

# number the runs of identical dates, restarting within each id
dat[, counter := rleid(date), by = id]
dat
#    id date counter
# 1:  1    a       1
# 2:  1    a       1
# 3:  1    b       2
# 4:  1    b       2
# 5:  2    a       1
# 6:  2    a       1
# 7:  2    b       2
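
For readers without data.table, a run id of this kind can be approximated for a single vector with base R's rle(). This is a rough sketch on a made-up vector, not a description of how rleid is implemented; the names x, runs and run_id are illustrative.

```r
x <- c("a", "a", "b", "b", "a")

# rle() records the length of each run of identical consecutive values
runs <- rle(x)

# repeat each run's index as many times as the run is long
run_id <- rep(seq_along(runs$lengths), runs$lengths)
run_id
# [1] 1 1 2 2 3
```

Unlike numbering by distinct value, the second run of "a" gets a new id, which is the behaviour wanted when the counter should increment on every change.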
