I am trying to write code that removes rows based on how many observations each level of a categorical variable has. For example, counting the observations per label with the table function gives the following. A: 500, B: 300, C: 90, D: 15, E: 200, F: 300
I would like to remove the rows whose label has fewer than 100 observations. In this case, I should remove the rows where the categorical variable is C or D.
I can do this semi-manually in two steps: 1. use the table function and check the counts. 2. data[!data$categorical %in% c("C", "D"), ]
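To make the question reproducible, here is a minimal sketch of those two manual steps on made-up data. The data frame `data`, its column `categorical`, and the extra `value` column are placeholders that only mimic the counts in my example; they are not my real dataset.

```r
# Toy data with the same label counts as in the example (made-up values)
counts <- c(A = 500, B = 300, C = 90, D = 15, E = 200, F = 300)
data <- data.frame(
  categorical = rep(names(counts), times = counts),
  stringsAsFactors = FALSE
)
data$value <- seq_len(nrow(data))  # hypothetical extra column

# Step 1: count the observations per label and inspect them by eye
table(data$categorical)

# Step 2: drop the labels found to be under 100, typed in by hand
filtered <- data[!data$categorical %in% c("C", "D"), ]
```

The hand-typed c("C", "D") in step 2 is the part I would like to avoid.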
However, this becomes tedious when the categorical variable has many more levels. Does anyone know how to do this in one step, without typing the labels by hand, so that I can apply it to a larger dataset? I would really appreciate it if you could show me how.
Take care