1

I have a dataset that looks like this:

SID     group  timepoint 
 5402       A    0
 5402       B    0
 5402       C    1
 5403       A    0     
25403       B    1
25403       C    1

I want to count how many unique SID's are there per group x timepoint. So far I have been doing the combinations 'by hand', like this:

length(unique(subset(df, timepoint=='0' & group=='A')$SID))
length(unique(subset(df, timepoint=='0' & group=='B')$SID))
...

Is there a way to use a grouping function that gives me a table with all the counts? (like by, or tapply?) Thanks!

Sininho
  • 287
  • 5
  • 13
  • I wonder, did you try Googling the title of your own question? Just by copy/pasting your title into Google I'm getting so many good search results. – David Arenburg Mar 31 '16 at 09:30
  • @DavidArenburg, I did try to search for it! But I think I focused on trying to find out how use tapply or by and forgot to search for the obvious :-( – Sininho Mar 31 '16 at 09:48

1 Answers1

3

We can use uniqueN from data.table. Convert the 'data.frame' to 'data.table' (setDT(df1)), grouped by 'group' and 'timepoint', get the length of unique elements of 'SID' (uniqueN(SID)).

 library(data.table)
 setDT(df1)[, .(UnSID=uniqueN(SID)), .(group, timepoint)]
akrun
  • 874,273
  • 37
  • 540
  • 662