How to return the number of unique observations in each group of a data frame

Question

I have a data frame similar to this:

data <- data.frame(
  Location = rep(letters[1:10], each = 20),
  ID = rep(1:40, each = 5)
)

I want to return a table that contains each unique Location in one column and a count of the number of unique IDs in each Location in another column, so it will look like this:

Location   Count
   a         4
   b         4
   ...      ...

Note: in my actual data set there are different numbers of IDs in each Location, and there are other variables in other columns.

What is the best way to do this?

score 1 · Answer 1 · answered Apr 14 '20 at 19:09

1

The table class of objects has a as.data.frame method:

as.data.frame(table(data$Location))
   Var1 Freq
1     a   20
2     b   20
3     c   20
4     d   20
5     e   20
6     f   20
7     g   20
8     h   20
9     i   20
10    j   20

answered Apr 14 '20 at 19:09

IRTFM

258,963
21
364
487

score 0 · Accepted Answer · answered Apr 14 '20 at 18:56

We can use n_distinct on the 'ID' column after grouping by 'Location'. In the example, it is all 4

library(dplyr)
data %>% 
    group_by(Location) %>%
    summarise(Count = n_distinct(ID))

If we need to add a new column, use mutate instead of summarise

With data.table, this can be done with uniqueN

library(data.table)
setDT(data)[, .(Count = uniqueN(ID)), Location]

How to return the number of unique observations in each group of a data frame

2 Answers2