DF<-data.frame(id=c(1,1,2,3,3),code=c("A","A","A","E","E"))
> DF
  id code
1  1    A
2  1    A
3  2    A
4  3    E
5  3    E

Now I want to count the number of distinct ids that share the same code. Desired output:

# A tibble: 2 x 2

  code  count
1 A         2
2 E         1

I've been trying:

> DF%>%group_by(code)%>%summarize(count=n())
# A tibble: 2 x 2
  code  count
  <fct> <int>
1 A         3
2 E         2
> DF%>%group_by(code,id)%>%summarize(count=n())
# A tibble: 3 x 3
# Groups:   code [2]
  code     id count
  <fct> <dbl> <int>
1 A         1     2
2 A         2     1
3 E         3     2
> 

Neither attempt gives me the desired output: the first counts rows per code, and the second counts rows per code/id pair.

Best H

hklovs

3 Answers


Being pedantic, I'd rephrase your question as "count the number of distinct IDs per code". With that mindset, the answer becomes clearer.

DF %>% 
  group_by(code) %>%
  summarize(count = n_distinct(id))
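
To see why the "distinct" framing helps, here is a minimal sketch of an equivalent two-step route: first reduce the data to its unique (code, id) pairs, then count the rows per code. It assumes dplyr is installed and recent enough for count() to accept a name argument; the object names res and res_nd are only illustrative.

```r
library(dplyr)

# Same example data as in the question
DF <- data.frame(id = c(1, 1, 2, 3, 3), code = c("A", "A", "A", "E", "E"))

# Step 1: drop repeated (code, id) pairs.
# Step 2: count the remaining rows per code, i.e. the distinct ids.
res <- DF %>%
  distinct(code, id) %>%
  count(code, name = "count")

# The one-step n_distinct() version, for comparison
res_nd <- DF %>%
  group_by(code) %>%
  summarize(count = n_distinct(id))
```

Both objects should hold a count of 2 for code A and 1 for code E, matching the desired output in the question.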
Gregor Thomas

With data.table, you can use uniqueN (the counterpart of dplyr's n_distinct): convert the data frame to a data.table with setDT, then group by 'code'.

library(data.table)
setDT(DF)[, .(count = uniqueN(id)), code]
#   code count
#1:    A     2
#2:    E     1
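
Note that setDT converts DF to a data.table in place. If you would rather leave the original data frame untouched, a variant along these lines might work; it is a sketch that assumes data.table is installed, and DT and res are just illustrative names.

```r
library(data.table)

# Same example data as in the question
DF <- data.frame(id = c(1, 1, 2, 3, 3), code = c("A", "A", "A", "E", "E"))

# as.data.table() returns a converted copy, so DF stays a plain data.frame
DT <- as.data.table(DF)

# Count the unique ids within each code, naming the grouping argument explicitly
res <- DT[, .(count = uniqueN(id)), by = code]
```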
akrun

A simple base R solution also works:

# Data
DF <- data.frame(id = c(1, 1, 2, 3, 3), code = c("A", "A", "A", "E", "E"))
# Classic base R solution: count the unique ids within each code
aggregate(id ~ code, data = DF, FUN = function(x) length(unique(x)))

  code id
1    A  2
2    E  1
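
Note that aggregate keeps the column name id for the counts. If you want the column to be called count, as in the desired output, one possible base R alternative is to deduplicate the (code, id) pairs and tabulate them. This is only a sketch; pairs, counts, and res are illustrative names.

```r
# Same example data as in the question
DF <- data.frame(id = c(1, 1, 2, 3, 3), code = c("A", "A", "A", "E", "E"))

# Keep one row per (code, id) pair
pairs <- unique(DF[, c("code", "id")])

# Tabulate how many distinct ids remain for each code
counts <- table(pairs$code)

# Reshape the table into a data frame with the desired column names
res <- data.frame(code = names(counts), count = as.vector(counts))
```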
Duck