I am trying to generate a sequence number within each category.

My data are:

A B 
1 1 
1 2
1 2
1 3
1 3
1 3
1 4
1 4

and I want to add a variable "C" so that my data look like:

A B C
1 1 1
1 2 1
1 2 2
1 3 1
1 3 2
1 3 3
1 4 1
1 4 2
sgibb
ShrestR

1 Answer

Use ave with seq_along:

> mydf$C <- with(mydf, ave(A, A, B, FUN = seq_along))
> mydf
  A B C
1 1 1 1
2 1 2 1
3 1 2 2
4 1 3 1
5 1 3 2
6 1 3 3
7 1 4 1
8 1 4 2
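
To make the ave call less opaque, here is a minimal, self-contained sketch. It rebuilds the sample data from the question and then repeats the same computation by hand with split, lapply and unsplit. The names grp, pieces and manual are only illustrative, and the manual version is an approximation of what ave does, not its actual source:

```r
# Sample data from the question
mydf <- data.frame(A = rep(1, 8),
                   B = c(1, 2, 2, 3, 3, 3, 4, 4))

# ave() splits its first argument by the grouping variables (here A and B),
# applies FUN to each piece, and puts the results back in the original
# row positions.
mydf$C <- with(mydf, ave(A, A, B, FUN = seq_along))

# Roughly the same steps spelled out by hand (illustrative only):
grp    <- interaction(mydf$A, mydf$B, drop = TRUE)  # one level per A/B combination
pieces <- split(mydf$A, grp)                        # values of A, one vector per group
manual <- unsplit(lapply(pieces, seq_along), grp)   # per-group counters, back in row order

all(manual == mydf$C)
```

Only the length of each piece matters to seq_along, so the first argument (A here) acts as a placeholder for "one value per row".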

If your data are already ordered so that the rows of each group sit next to each other (as they do in this case), you can also use sequence with rle (mydf$C <- sequence(rle(do.call(paste, mydf))$lengths)). ave has no such requirement.
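
A small sketch of why the ordering matters for the rle approach, assuming the same sample data as in the question. The shuffled data frame and the names key, by_rle and by_ave are made up for illustration:

```r
# Sample data from the question (rows of each group are adjacent)
mydf <- data.frame(A = rep(1, 8),
                   B = c(1, 2, 2, 3, 3, 3, 4, 4))

# rle() counts runs of identical consecutive keys; sequence() then
# numbers the rows 1..n within each run.
key    <- do.call(paste, mydf)          # "1 1", "1 2", "1 2", ...
mydf$C <- sequence(rle(key)$lengths)

# Hypothetical reordering in which some groups are no longer contiguous
shuffled <- mydf[c(2, 4, 3, 5, 1, 6, 7, 8), c("A", "B")]

# rle() now sees several short runs, so the counter restarts too often
by_rle <- sequence(rle(do.call(paste, shuffled))$lengths)

# ave() groups by value rather than by position, so it still counts per group
by_ave <- with(shuffled, ave(A, A, B, FUN = seq_along))

identical(as.integer(by_rle), as.integer(by_ave))   # FALSE for the shuffled rows
```

On the original, ordered data the two approaches agree. They only diverge once rows from the same group are separated.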

If you're a data.table fan, you can make use of .N as follows:

library(data.table)
DT <- data.table(mydf)
DT[, C := sequence(.N), by = c("A", "B")]
DT
#    A B C
# 1: 1 1 1
# 2: 1 2 1
# 3: 1 2 2
# 4: 1 3 1
# 5: 1 3 2
# 6: 1 3 3
# 7: 1 4 1
# 8: 1 4 2
A5C1D2H2I1M1N2O1R2T1
  • I had a similar problem in my code and was able to solve it using data.table. I have seen a lot of other answers on Stack Overflow recommend ave(), but I'm not entirely sure how ave is supposed to work. Here is a link to the R documentation. ave reminds me a lot of lapply and similar functions. Could you expand on the first method? https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/ave – Eric Boorman Mar 29 '22 at 17:46