
I have a data frame in R with two columns. The first column contains the subject ID and the second column contains the ID of the trial that subject has done.

A given subject might have done the same trial more than once. I want to add a column with a counter that starts at 1 for each unique subject-trial combination and increments by 1 until it reaches the last row of that combination.

More precisely, I have this table:

ID T
A  1
A  1
A  2
A  2
B  1
B  1
B  1
B  1

and I want the following output:

ID  T  Index
A   1   1
A   1   2
A   2   1
A   2   2
B   1   1
B   1   2
B   1   3
B   1   4
joran
BICube
  • ...and many, many others. – joran Nov 07 '13 at 22:59
  • Hi, welcome to SO. Since you are quite new here, you might want to read the [**about**](http://stackoverflow.com/about) and [**FAQ**](http://stackoverflow.com/faq) sections of the website to help you get the most out of it. If an answer does solve your problem you may want to *consider* upvoting and/or marking it as accepted to show the question has been answered, by ticking the little green check mark next to the suitable answer. You are **not** obliged to do this, but it helps keep the site clean of unanswered questions and rewards those who take the time to solve your problem. – Simon O'Hanlon Nov 07 '13 at 23:08
  • thanks for letting me know. But I have been trying to up-vote your answer since yesterday. But it doesn't let me do this. I still have < 15 reputation score. – BICube Nov 08 '13 at 17:35

1 Answer


I really like the simple syntax of data.table for this (not to mention speed)...

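#  Load package
library( data.table )
#  Example data from the question (df stands for your original data.frame)
df <- data.frame( ID = c("A","A","A","A","B","B","B","B") , T = c(1,1,2,2,1,1,1,1) )
#  Turn data.frame into a data.table
dt <- data.table( df )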

#  Get running count by ID and T
dt[ , Index := 1:.N , by = c("ID" , "T") ]
#   ID T Index
#1:  A 1     1
#2:  A 1     2
#3:  A 2     1
#4:  A 2     2
#5:  B 1     1
#6:  B 1     2
#7:  B 1     3
#8:  B 1     4

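.N is an integer equal to the number of rows in the current group. The groups are defined by the columns named in the by argument, so 1:.N produces the sequence 1, 2, ..., .N within each group, i.e. one counter value per row. The := operator then adds that sequence to dt as the new Index column by reference, without copying the table.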
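To see what .N holds before it is expanded into a counter, here is a small sketch. The data mirror the table in the question; the names group_sizes and n are just illustrative choices, not anything from the original post.

```r
library(data.table)

# Example data mirroring the table in the question
dt <- data.table(ID = c("A", "A", "A", "A", "B", "B", "B", "B"),
                 T  = c(1, 1, 2, 2, 1, 1, 1, 1))

# .N is evaluated once per group: one row per (ID, T) combination
# (group_sizes and n are illustrative names)
group_sizes <- dt[, .(n = .N), by = c("ID", "T")]
print(group_sizes)

# 1:.N expands each group size into a running counter, one value per row
dt[, Index := 1:.N, by = c("ID", "T")]
print(dt)
```

The first call returns one row per group with its size; the second reuses the same grouping to write the counter into every row.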

Because data.table inherits from data.frame, most functions that take a data.frame as input will also accept a data.table, and you can easily convert back if you wish ( df <- data.frame( dt ) ).
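A quick way to check the inheritance and the round trip, as a rough sketch on a shortened version of the example data. It uses as.data.frame() for the conversion, which is my choice here; the data.frame( dt ) call mentioned above should give an equivalent result.

```r
library(data.table)

# Shortened example data (illustrative only)
dt <- data.table(ID = c("A", "A", "B"), T = c(1, 1, 1))
dt[, Index := 1:.N, by = c("ID", "T")]

# A data.table carries the data.frame class as well
print(class(dt))
is_df <- inherits(dt, "data.frame")

# Functions written for data.frames, such as nrow(), work directly
n_rows <- nrow(dt)

# Convert back to a plain data.frame; the Index column is kept
df <- as.data.frame(dt)
print(class(df))
```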

Simon O'Hanlon