df %>%
group_by(variable1) %>%
summarise(length = length(levels(df$variable2)))
group_by does not seem to work: I get the same result for every level of variable1.
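We need to remove df$. Because levels(df$variable2) refers to the column of the original data frame, it returns the levels of the full dataset and ignores the grouping. In addition, for factor variables the unused levels remain within each group unless we drop them with droplevels.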
df %>%
group_by(variable1)%>%
summarise(length=length(levels(droplevels(variable2))))
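To see why droplevels is needed, here is a minimal base R sketch. The factor f and the subset sub are hypothetical toy objects, not taken from the question; they only illustrate that subsetting a factor keeps every original level.

```r
# Hypothetical toy factor (not from the original post)
f <- factor(c("a", "b", "c"))

# Subsetting keeps all three levels, even though "c" no longer occurs
sub <- f[f != "c"]

n_all  <- length(levels(sub))              # counts the unused level "c" too
n_used <- length(levels(droplevels(sub)))  # counts only levels that occur

print(c(n_all = n_all, n_used = n_used))
```

The same thing happens to each group's slice of variable2 inside summarise, which is why the levels are dropped before counting.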
Alternatively, instead of going the levels route, we can use n_distinct, which counts the unique values of variable2 within each group:
df %>%
group_by(variable1) %>%
summarise(length=n_distinct(variable2))
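# Example data
set.seed(24)
# stringsAsFactors = TRUE keeps the columns as factors on R >= 4.0, where the default is FALSE
df <- data.frame(variable1 = sample(letters[1:3], 10, replace = TRUE),
variable2 = sample(letters[1:5], 10, replace = TRUE),
stringsAsFactors = TRUE)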
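As a quick check of the approaches above, here is a small sketch on a deterministic data frame. The name toy and its values are assumptions made up for illustration, so the result does not depend on the random sample; it assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data (not from the original post):
# group "a" contains x and y, group "b" contains only z
toy <- data.frame(
  variable1 = c("a", "a", "a", "b", "b"),
  variable2 = factor(c("x", "y", "x", "z", "z"))
)

# Original attempt: toy$variable2 bypasses the grouping
wrong <- toy %>%
  group_by(variable1) %>%
  summarise(length = length(levels(toy$variable2)))

# Fix 1: use the grouped column and drop unused levels
fixed_levels <- toy %>%
  group_by(variable1) %>%
  summarise(length = length(levels(droplevels(variable2))))

# Fix 2: count distinct values per group
fixed_distinct <- toy %>%
  group_by(variable1) %>%
  summarise(length = n_distinct(variable2))

print(wrong)
print(fixed_levels)
print(fixed_distinct)
```

Here wrong reports 3 for both groups, while both fixes report 2 for group a and 1 for group b.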