I want to get the number of unique values in one column, grouped by another column, using dplyr. Preferably in a function-friendly way, that is, something I can put inside a function and have it work easily.
For example, take the following data frame:
test = data.frame(one=rep(letters[1:5],each=2), two=c(rep("c", 3), rep("d", 2), rep("e", 4), "f") )
   one two
1    a   c
2    a   c
3    b   c
4    b   d
5    c   d
6    c   e
7    d   e
8    d   e
9    e   e
10   e   f
I want the number of unique values of column two for each value of column one.
Desired output:
  one n
1   a 1
2   b 2
3   c 2
4   d 1
5   e 2
In column one, a has 1 unique value ("c" only), b has 2 unique values ("c" and "d"), c has 2 unique values ("d" and "e"), d has 1 unique value ("e"), and e has 2 unique values ("e" and "f").
I managed to get something working by calling group_by() twice with summarize(). Is there a simpler way?
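The two-step approach mentioned above can be sketched roughly as follows. This is only one way to write it, assuming dplyr is loaded; the intermediate names `pairs`, `count`, and `result` are illustrative and not taken from any original code.

```r
library(dplyr)

# Example data from the question.
test <- data.frame(
  one = rep(letters[1:5], each = 2),
  two = c(rep("c", 3), rep("d", 2), rep("e", 4), "f")
)

# Step 1: collapse to one row per distinct (one, two) pair.
pairs <- test %>%
  group_by(one, two) %>%
  summarize(count = n())

# Step 2: count how many distinct pairs remain for each value of `one`.
result <- pairs %>%
  group_by(one) %>%
  summarize(n = n())

print(result)
```

The first summarize() leaves one row per pair, so the second count gives the number of unique values of `two` per group, but it takes two grouping passes to get there.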
Hope this is understandable.
Thanks