I have a dataset, espana2015, for one country, with schools, students… I want to drop the schools that have fewer than 20 students. The school identifier is the variable CNTSCHID.
dim(espana2015)
[1] 6736 106
The only way I have found is long, manual and inefficient: typing the school IDs one by one. Here there are only 13 schools with fewer than 20 students, but what if there were many more, e.g. more than 100 schools?
espana2015 %>% group_by(CNTSCHID) %>% summarise(students = n()) %>%
  filter(students < 20) %>% select(CNTSCHID) -> removeSch
removeSch
# A tibble: 13 x 1
   CNTSCHID
      <dbl>
 1 72400046
 2 72400113
 3 72400261
 4 72400314
 5 72400396
 6 72400472
 7 72400641
 8 72400700
 9 72400711
10 72400736
11 72400909
12 72400927
13 72400979
espana2015 %>% subset(!CNTSCHID %in% c(72400046, 72400113, 72400261,
                                       72400314, 72400396, 72400472,
                                       72400641, 72400700, 72400711,
                                       72400736, 72400909, 72400927,
                                       72400979)) -> new_espana2015
Please help me do this in a better way.
Walter
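One possible direction, as a minimal sketch only: it assumes dplyr is loaded and that each row of the data is one student, as the n() count above implies. The data frame toy and its score column are made-up stand-ins for espana2015, and kept, kept2 and small are illustrative names. The idea is to let the filter use the group size directly, or to reuse a table like removeSch instead of typing the IDs.

```r
library(dplyr)

# Hypothetical toy data standing in for espana2015:
# school 1 has 25 students, school 2 has 5, school 3 has 20.
toy <- data.frame(
  CNTSCHID = c(rep(1, 25), rep(2, 5), rep(3, 20)),
  score    = seq_len(50)
)

# Option 1: keep only the rows whose school has at least 20 rows.
kept <- toy %>%
  group_by(CNTSCHID) %>%
  filter(n() >= 20) %>%
  ungroup()

# Option 2: build the table of small schools (like removeSch) and
# exclude its IDs without writing them out by hand.
small <- toy %>%
  count(CNTSCHID, name = "students") %>%
  filter(students < 20)
kept2 <- toy %>% filter(!CNTSCHID %in% small$CNTSCHID)
```

On the real data, the same pattern would presumably be applied to espana2015 in place of toy; the threshold 20 is the only number that needs to be written down, however many schools fall below it.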