dplyr chain filter based on frequency

Question

table(mtcars$cyl)

 4  6  8 
11  7 14

Suppose I want to filter out low-frequency values, in this case those that occur fewer than 10 times. Is there an elegant, dplyr-esque way to do this?

mtcars %>% group_by(cyl) %>% filter([???])

The result would be a data frame with 4 and 8 cyl only, since they both occur 10 or more times.

Possible duplicate of [Return df with a columns values that occur more than once](https://stackoverflow.com/questions/24503279/return-df-with-a-columns-values-that-occur-more-than-once). See also [Returning observations that only occur once in a group in R](https://stackoverflow.com/questions/36145061/returning-observations-that-only-occur-once-in-a-group-in-r) and [Subset data frame based on number of rows per group](https://stackoverflow.com/questions/20204257/subset-data-frame-based-on-number-of-rows-per-group). – Ronak Shah, Dec 19 '17 at 05:02
What's the protocol here? I would delete the question since people are downvoting it, but it has already been answered, so deleting it would be unfair to the answerer. Also, when I first tried to solve this I searched Google for the keyword "frequency", which did not return any of the questions above, so maybe this question will help people who search with that term. – Doug Fir, Dec 19 '17 at 06:50

neilfws · Accepted Answer · 2017-12-19T04:57:32.513

3

Group by cyl, count the rows, filter, optionally remove the freq column:

library(dplyr)
mtcars %>% 
  group_by(cyl) %>% 
  mutate(freq = n()) %>% 
  ungroup() %>% 
  filter(freq > 9) %>%
  select(-freq)


Thanks, what does ungroup() do? – Doug Fir Dec 19 '17 at 04:57
It removes the grouping by `cyl`. Generally best to `ungroup` once the procedure is complete as leaving it there can have unintended consequences later. – neilfws Dec 19 '17 at 04:58
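
To illustrate the point about `ungroup()`, here is a minimal sketch. The object names `grouped` and `ungrouped` and the `mean_mpg` column are made up for the illustration. It shows that the grouping added by `group_by(cyl)` survives `mutate()`, so a later `summarise()` still runs per group unless you ungroup first.

```r
library(dplyr)

# The grouping set by group_by() persists through mutate()
grouped <- mtcars %>%
  group_by(cyl) %>%
  mutate(freq = n())

# ungroup() drops the grouping but keeps all rows and columns
ungrouped <- grouped %>% ungroup()

group_vars(grouped)     # still grouped by "cyl"
group_vars(ungrouped)   # no grouping variables left

# Still grouped: summarise() returns one row per cyl value
grouped %>% summarise(mean_mpg = mean(mpg))

# Ungrouped: summarise() returns a single overall row
ungrouped %>% summarise(mean_mpg = mean(mpg))
```

The two `summarise()` calls are identical, yet the first returns three rows and the second returns one, which is the kind of unintended consequence a forgotten grouping can cause further down a pipeline.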

why not `mtcars %>% group_by(cyl) %>% filter(n() > 9)` ? – Ronak Shah Dec 19 '17 at 04:58
why not indeed :) I prefer explicit variables – neilfws Dec 19 '17 at 04:59
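
For comparison, here is a sketch that runs both variants from this thread side by side. The object names `explicit` and `direct` are hypothetical, and the trailing `ungroup()` on the shorter version is an addition following the advice above, not part of the original comment.

```r
library(dplyr)

# Variant from the answer: explicit freq column, filter, then drop it
explicit <- mtcars %>%
  group_by(cyl) %>%
  mutate(freq = n()) %>%
  ungroup() %>%
  filter(freq > 9) %>%
  select(-freq)

# Variant from the comment: filter directly on the group size
direct <- mtcars %>%
  group_by(cyl) %>%
  filter(n() > 9) %>%
  ungroup()

# Only cyl 4 and 8 should remain (11 and 14 rows in the table above)
table(direct$cyl)

# Compare the two results
all.equal(as.data.frame(explicit), as.data.frame(direct))
```

Both keep only the `cyl` groups with 10 or more rows. The shorter form avoids the temporary column, while the explicit form leaves `freq` available for inspection before it is dropped.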
