R: number of unique values of a factor per category

Question

I have the following dataset:

Names   Category
Jack    1
Jack    1
Jack    1
Tom     0
Tom     0
Sara    0
Sara    0

What I am looking for is the following:

Category Number
0        2
1        1

That is, the number of unique values in column Names within each category.

I can get the number of unique values in the first column:

length(unique(df$Names))

and the number of rows that belong to a given category in the second column:

length(which(df$Category == 1))

but this is not the result I am looking for.

score 1 · Accepted Answer · answered Dec 17 '16 at 18:13

Or aggregate in base R:

aggregate(Names ~ Category, data=df, FUN=function(x) length(unique(x)))
#   Category Names
# 1        0     2
# 2        1     1
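
As a rough sketch of what the call above is doing (split Names by Category, then count the distinct values in each group), the same result can be built by hand with tapply. The data frame is rebuilt from the question's sample; the names `counts` and `result` are only illustrative and not part of the original answer.

```r
# Sample data rebuilt from the question
df <- data.frame(Names = c(rep("Jack", 3), rep("Tom", 2), rep("Sara", 2)),
                 Category = c(1, 1, 1, 0, 0, 0, 0))

# Split Names by Category and count the distinct values in each group
counts <- tapply(df$Names, df$Category, function(x) length(unique(x)))

# Reshape the named vector into the Category/Number layout asked for
result <- data.frame(Category = as.numeric(names(counts)),
                     Number = as.vector(counts))
result
```

This is only meant to make the grouping step explicit; for real use the aggregate one-liner is shorter.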

lmo

score 0 · Answer 2 · answered Dec 17 '16 at 18:14

Using data.table

library(data.table)
setDT(df)[, .(Number = uniqueN(Names)), by = Category]
#    Category Number
#1:        1      1
#2:        0      2

akrun

smci · Answer 3 · 2016-12-26T20:46:57.663

Using dplyr. You don't even need to manually get the unique Names first:

df <- data.frame(Names=c(rep('Jack',3),rep('Tom',2),rep('Sara',2)),
                 Category=c(1,1,1,0,0,0,0))
library(dplyr)

df %>% group_by(Category) %>% summarize(Number = n_distinct(Names))

  Category Number
     <dbl>  <int>
1        0      2
2        1      1

# and you can use as.data.frame(...) on that if you like

UPDATED: it was not clear from the OP's original wording that they wanted to first group by Category, then count the number of distinct Names within each group.
