Count subgroups in group_by with dplyr

Question

I'm stuck trying to do some counting on a data frame. The gist is to group by one variable and then break each group into subgroups based on a second variable. From there I want to count the number of subgroups in each group. The sample code is this:

library(dplyr)

set.seed(123456)
df <- data.frame(User = c(rep("A", 5), rep("B", 4), rep("C", 6)), 
                 Rank = c(rpois(5,1), rpois(4,2), rpois(6,3)))

#This results in an error
df %>% group_by(User) %>% group_by(Rank) %>% summarize(Res = n_groups())

So what I want is for 'User A' to have 3, 'User B' to have 4, and 'User C' to have 5, i.e. the number of distinct Rank values per user. In other words, the data frame df would end up looking like:

   User Rank Result
1     A    2      3
2     A    2      3
3     A    1      3
4     A    0      3
5     A    0      3
6     B    1      4
7     B    2      4
8     B    0      4
9     B    6      4
10    C    1      5
11    C    4      5
12    C    3      5
13    C    5      5
14    C    5      5
15    C    8      5

I'm still learning dplyr, so I'm unsure how I should do it. How can this be achieved? Non-dplyr answers are also very welcome. Thanks in advance!

thc · Accepted Answer · 2017-04-29T01:40:08.313

6

Try this:

df %>% group_by(User) %>% mutate(Result=length(unique(Rank)))

Or (see comment below):

df %>% group_by(User) %>% mutate(Result=n_distinct(Rank))
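
For anyone who wants to check this against the numbers in the question, here is a minimal sketch. It assumes the Rank values shown in the question's expected-output table and types them in directly, so it does not depend on the random seed. The names `per_row` and `per_user` are made up for illustration. The `summarise()` variant is not part of the answer above; it is only there to show the same count collapsed to one row per user.

```r
library(dplyr)

# Rank values copied from the table in the question (assumption: that
# table reflects the actual data), so no random draw is needed.
df <- data.frame(
  User = c(rep("A", 5), rep("B", 4), rep("C", 6)),
  Rank = c(2, 2, 1, 0, 0, 1, 2, 0, 6, 1, 4, 3, 5, 5, 8)
)

# One row per input row: each row carries its user's distinct-Rank count.
per_row <- df %>%
  group_by(User) %>%
  mutate(Result = n_distinct(Rank)) %>%
  ungroup()

# One row per user: the same count, collapsed with summarise().
per_user <- df %>%
  group_by(User) %>%
  summarise(Result = n_distinct(Rank))
```

With this data, `per_user` should hold 3, 4 and 5 for users A, B and C. The `ungroup()` at the end of `per_row` just drops the grouping so later operations are not silently done per user.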

edited Apr 29 '17 at 01:40

answered Apr 29 '17 at 01:38

thc



There's `n_distinct` for that, fyi – Frank Apr 29 '17 at 01:39

Thanks, did not know! – thc Apr 29 '17 at 01:39
I feel like such a dope sometimes. I knew there had to be a function for this! Thanks to you both! – J. Paul Apr 29 '17 at 01:47

score 1 · Answer 2 · answered Apr 29 '17 at 06:36

A base R option is to use ave:

df$Result <- with(df, ave(Rank, User, FUN = function(x) length(unique(x))))
df$Result
#[1] 3 3 3 3 3 4 4 4 4 5 5 5 5 5 5

And a data.table option is:

library(data.table)
setDT(df)[, Result := uniqueN(Rank), by = User]
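
One thing to be aware of: `setDT()` converts `df` to a data.table by reference, and `:=` then adds the column in place. If you would rather leave the original data frame untouched, a sketch along these lines should work. It assumes the Rank values from the question's table, and `dt` and `per_user` are just illustrative names.

```r
library(data.table)

# Rank values copied from the table in the question (assumption: that
# table reflects the actual data).
df <- data.frame(
  User = c(rep("A", 5), rep("B", 4), rep("C", 6)),
  Rank = c(2, 2, 1, 0, 0, 1, 2, 0, 6, 1, 4, 3, 5, 5, 8)
)

# as.data.table() makes a copy, so df itself stays a plain data.frame.
dt <- as.data.table(df)

# Add the per-user distinct-Rank count to every row of the copy.
dt[, Result := uniqueN(Rank), by = User]

# Collapsed view: one row per user instead of one per input row.
per_user <- dt[, .(Result = uniqueN(Rank)), by = User]
```

Here `df` keeps its original two columns, while `dt` gets the extra `Result` column.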
