Sequentially numbering within many row blocks of unequal length

Question

My actual dataset is composed of repeated measurements for each id, where the number of measurements can vary across individuals. A simplified example is:

dat <- data.frame(id = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L))
dat
##    id
## 1   1
## 2   1
## 3   1
## 4   1
## 5   1
## 6   1
## 7   2
## 8   2
## 9   3
## 10  3
## 11  3

I am trying to number the rows of dat sequentially within each id, restarting at 1 whenever the id changes. The result should be:

dat
##    id s
## 1   1 1
## 2   1 2
## 3   1 3
## 4   1 4
## 5   1 5
## 6   1 6
## 7   2 1
## 8   2 2
## 9   3 1
## 10  3 2
## 11  3 3

How would you do that? I tried to select the last row of each id by using duplicated(), but this is probably not the way, since it works with the entire column.

This question seems pretty similar. In fact, I see now that is where I got my answer: http://stackoverflow.com/questions/8209015/observation-number-by-group — Mark Miller, Jan 12 '13 at 16:08

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer · 2017-12-30T11:37:35.903

Use ave(). The first argument is the vector the function is applied to; the remaining arguments are the grouping variables, and FUN is the function to apply within each group. See ?ave for more details.

transform(dat, s = ave(id, id, FUN = seq_along))
#    id s
# 1   1 1
# 2   1 2
# 3   1 3
# 4   1 4
# 5   1 5
# 6   1 6
# 7   2 1
# 8   2 2
# 9   3 1
# 10  3 2
# 11  3 3
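To make it a bit clearer what ave() is doing, here is a rough sketch that rebuilds the same column by hand with split() and unsplit(). The names s_ave, pieces, numbered and s_manual are only illustrative and are not part of the original answer.

```r
dat <- data.frame(id = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L))

# ave() splits its first argument by the grouping variable(s), applies FUN
# to each piece, and writes the results back in the original row order.
s_ave <- ave(dat$id, dat$id, FUN = seq_along)

# Roughly the same steps written out by hand.
pieces   <- split(dat$id, dat$id)       # one vector of ids per group
numbered <- lapply(pieces, seq_along)   # 1, 2, ..., n within each group
s_manual <- unsplit(numbered, dat$id)   # back to the original row order

# Both approaches should agree element by element.
all(s_ave == s_manual)
```

Because the results are put back by position, this works whether or not the rows are sorted by id.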

If you have a large dataset or are using the data.table package, you can make use of ".N" as follows:

library(data.table)
DT <- data.table(dat)
DT[, s := 1:.N, by = "id"]
## Or
## DT[, s := sequence(.N), id][]

Or, you can use rowid, like this:

library(data.table)
setDT(dat)[, s := rowid(id)][]
#     id s
#  1:  1 1
#  2:  1 2
#  3:  1 3
#  4:  1 4
#  5:  1 5
#  6:  1 6
#  7:  2 1
#  8:  2 2
#  9:  3 1
# 10:  3 2
# 11:  3 3

For completeness, here's the "tidyverse" approach:

library(tidyverse)
dat %>%
  group_by(id) %>%
  mutate(s = row_number())  # row_number() with no argument numbers rows within each group
## # A tibble: 11 x 2
## # Groups: id [3]
##       id     s
##    <int> <int>
##  1     1     1
##  2     1     2
##  3     1     3
##  4     1     4
##  5     1     5
##  6     1     6
##  7     2     1
##  8     2     2
##  9     3     1
## 10     3     2
## 11     3     3

score 3 · Answer 2 · answered Jan 12 '13 at 15:58

dat <- read.table(text = "
    id          
    1 
    1 
    1 
    1 
    1 
    1 
    2 
    2 
    3 
    3 
    3", 
header=TRUE)

data.frame(
    id = dat$id,
    s = sequence(rle(dat$id)$lengths) 
)

This gives the same id and s columns as the desired output shown in the question.
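One caveat worth keeping in mind with the rle() approach: rle() counts runs of consecutive equal values, so it numbers within each id only when all rows for an id sit next to each other, as they do in the question's data. A small sketch with a made-up, unsorted vector (ids_unsorted is hypothetical, not from the question) shows how it differs from ave():

```r
# Hypothetical ids where the rows for id 1 are not contiguous.
ids_unsorted <- c(1L, 1L, 2L, 1L, 2L)

# rle() restarts the count at every run, not at every id.
s_rle <- sequence(rle(ids_unsorted)$lengths)
s_rle
# [1] 1 2 1 1 1

# ave() groups by value, so the count continues across gaps.
s_ave <- ave(ids_unsorted, ids_unsorted, FUN = seq_along)
s_ave
# [1] 1 2 1 3 2
```

If the data may be unsorted, sorting by id first or using a grouping-based approach avoids the problem.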

score 1 · Answer 3 · answered Jan 12 '13 at 15:56

Using tapply(), though not as elegant as ave():

cbind(dat$id, unlist(tapply(dat$id, dat$id, seq_along)))
  [,1] [,2]
11    1    1
12    1    2
13    1    3
14    1    4
15    1    5
16    1    6
21    2    1
22    2    2
31    3    1
32    3    2
33    3    3
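A note on the tapply() version: unlist() concatenates the per-group results in group order, so the columns line up only when the data is already sorted by id (the row names 11, 12, ... above are the group name pasted onto the position within the group). The sketch below uses a made-up unsorted vector, ids_unsorted, to illustrate this, and shows unsplit() as one possible way to restore the original row order; all names here are hypothetical.

```r
# Hypothetical ids that are not sorted by group.
ids_unsorted <- c(2L, 1L, 2L, 1L, 1L)

# One sequence per id, stored in group order: `1` = 1:3, `2` = 1:2.
by_group <- tapply(ids_unsorted, ids_unsorted, seq_along)

# unlist() simply concatenates the groups, ignoring the original row order.
s_group_order <- unlist(by_group, use.names = FALSE)
s_group_order
# [1] 1 2 3 1 2

# unsplit() places each group's values back at that group's row positions.
s_row_order <- unsplit(by_group, ids_unsorted)
s_row_order
# [1] 1 1 2 2 3

cbind(id = ids_unsorted, s = s_row_order)
```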

agstudy


If you look at the function for `ave()`, you'll see that it contains [your question from earlier today](http://stackoverflow.com/q/14294052/1270695) ;) – A5C1D2H2I1M1N2O1R2T1 Jan 12 '13 at 15:59
@AnandaMahto Thanks, but I know that. You were faster than me with ave(); I changed mine at the last minute. – agstudy Jan 12 '13 at 18:53
