> diamonds %>% group_by(color) %>% tally %>% arrange(desc(n))
# A tibble: 7 x 2
  color     n
  <ord> <int>
1 G     11292
2 E      9797
3 F      9542
4 H      8304
5 D      6775
6 I      5422
7 J      2808

I would like to filter diamonds to exclude any color group whose total count is less than 6K.

I was thinking I could group_by() and nest(), then unnest(), but wondered whether there is a more concise, elegant way that uses a window function in the filter. I was reading the docs here but could not immediately see a way to filter based on group counts.

How could I filter diamonds to exclude rows with color I or J, since each of those colors has a total count of less than 6K?

Doug Fir

1 Answer


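We can use filter() directly after the grouping step. Inside a grouped filter(), n() returns the number of rows in the current group, so the condition is evaluated once per color: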

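library(dplyr)
library(ggplot2)  # provides the diamonds dataset

diamonds %>% 
      group_by(color) %>% 
      filter(n() >= 6000) %>%   # keep colors with at least 6,000 rows
      ungroup()                 # drop the grouping so later steps are not per-color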
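
For anyone who would rather avoid grouping altogether, here is a minimal sketch of an alternative, assuming a dplyr version that provides add_count() and that diamonds comes from ggplot2. The name kept is only illustrative. It attaches each row's group size as a temporary column n, filters on that column, and then drops it:

```r
library(dplyr)
library(ggplot2)  # assumed source of the diamonds dataset

# add_count() appends a column n holding the size of each row's color group,
# so the data frame never has to be grouped or ungrouped
kept <- diamonds %>%
  add_count(color) %>%
  filter(n >= 6000) %>%
  select(-n)          # remove the helper column again

# Check which colors remain and how many rows each has
kept %>% count(color, sort = TRUE)
```

With the counts shown in the question, colors I and J should be the only ones dropped. The grouped filter() version is shorter; this one may be handy when the count column is also wanted for inspection before it is removed.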
akrun