I have a data frame with 1000 IDs, each with more than 100 rows of data. I want to remove every ID that meets a criterion based on another column at least once.
As an example, with the dummy data below, I want to remove all IDs where var2 is < 20 at least once (that is, drop every row belonging to such an ID).
How do I do this without spelling out each individual ID to be dropped?
Dummy data of similar structure:
set.seed(1)  # fix the random draw so the example is reproducible
data <- data.frame(ID = rep(c('B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'B8', 'B9', 'B10'), each = 5),
                   var1 = rep(c('a', 'b', 'b', 'c', 'd', 'a', 'c', 'c', 'b', 'a'), times = 5),
                   var2 = sample(1:100, 50))
I have tried using the function droplevels(), but I do not want to spell out every individual ID to be dropped.
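To make the goal concrete, here is a minimal base R sketch of the kind of filtering I am after. It is one possible approach rather than a definitive solution, and the names bad_ids, kept and kept2 and the seed are my own assumptions for illustration: collect the IDs that have at least one var2 below 20, then keep only the rows whose ID is not in that set.

```r
# Dummy data with the same structure as in the question.
# The seed is an assumption, added only so the draw is reproducible.
set.seed(1)
data <- data.frame(
  ID   = rep(paste0("B", 1:10), each = 5),
  var1 = rep(c("a", "b", "b", "c", "d", "a", "c", "c", "b", "a"), times = 5),
  var2 = sample(1:100, 50)
)

# IDs that have at least one row with var2 < 20
bad_ids <- unique(data$ID[data$var2 < 20])

# Keep only the rows whose ID is not among those IDs
kept <- data[!data$ID %in% bad_ids, ]

# Alternative formulation: compute the per-ID minimum of var2 with ave()
# and keep an ID only if that minimum is at least 20
kept2 <- data[ave(data$var2, data$ID, FUN = min) >= 20, ]
```

Neither version needs the IDs to be listed by hand. If ID is stored as a factor, calling droplevels(kept) afterwards should remove the levels that no longer occur.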