df%>%
    group_by(variable1)%>%
    summarise(length=length(levels(df$variable2)))

group_by does not seem to have any effect: I get the same result for every level of variable1.

Spigonico

1 Answer


We need to remove df$. Inside summarise, levels(df$variable2) refers to the column of the full dataset and bypasses the grouping, so it returns the same levels for every group. In addition, for factor variables the unused levels remain unless we drop them with droplevels.

df %>%
   group_by(variable1)%>%
   summarise(length=length(levels(droplevels(variable2))))
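
To illustrate why the levels persist within a group, here is a minimal base R sketch. The factor f and its values are made up for illustration and are not taken from the question's data.

```r
# Hypothetical factor with three levels, for illustration only
f <- factor(c("a", "b", "c", "a"))

# Subsetting keeps the full set of levels, even those no longer present
sub <- f[f == "a"]
levels(sub)                      # still lists "a", "b" and "c"
length(levels(sub))              # 3

# droplevels() discards the levels that do not occur in the subset
length(levels(droplevels(sub)))  # 1
```

The same thing happens to each group's slice of variable2 inside summarise, which is why droplevels is needed there.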

Alternatively, instead of going through levels, we can use n_distinct, which counts the unique values of variable2 within each group:

 df %>% 
   group_by(variable1) %>% 
   summarise(length=n_distinct(variable2))
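
As a rough sketch of what n_distinct computes, it can be compared with length(unique(x)) from base R. The vector x below is hypothetical and only serves as an example; note that it is a character vector, so no factor levels are involved at all.

```r
library(dplyr)

# Hypothetical character vector, for illustration only
x <- c("a", "b", "a", "c", "b")

n_distinct(x)        # 3
length(unique(x))    # 3, the base R equivalent
```

Because it counts the values that are actually present, n_distinct is not affected by unused factor levels.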

data

set.seed(24)
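# stringsAsFactors = TRUE keeps both columns as factors, which the levels()
# and droplevels() approach requires (the default is FALSE since R 4.0.0)
df <- data.frame(variable1 = sample(letters[1:3], 10, replace = TRUE),
                 variable2 = sample(letters[1:5], 10, replace = TRUE),
                 stringsAsFactors = TRUE)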
akrun
    @DavidArenburg I think the question was straightforward when we consider the general behavior of `levels` in a `factor`. – akrun Jan 24 '16 at 13:19