To clarify the question, I'll briefly describe the data. Each row in the `data.frame` is an observation, and the columns represent variables pertinent to that observation, including which individual was observed, when it was observed, where it was observed, etc. I want to exclude (filter out) individuals for which there are fewer than 5 observations.

In other words, if there are fewer than 5 rows where individual = x, then I want to remove all rows that contain individual x and assign the result to a new `data.frame`. I'm aware of some brute-force techniques using something like `names == unique(df$individualname)` and then subsetting out those names individually and applying `nrow` to determine whether or not to exclude them, but there has to be a better way. Any help is appreciated; I'm still pretty new to R.
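
For concreteness, here is a minimal sketch of the brute-force approach I have in mind, run on made-up data. The column names (`individualname`, `when`), the toy values, and the helper names `min_obs`, `keep`, and `df_filtered` are assumptions for illustration only, not my real data.

```r
# Toy data: column names and values are made up for illustration.
df <- data.frame(
  individualname = c(rep("a", 6), rep("b", 2), rep("c", 5)),
  when = seq_len(13),
  stringsAsFactors = FALSE
)

min_obs <- 5  # threshold from the question: drop individuals with fewer rows

# Brute force: subset each individual separately and count its rows.
keep <- character(0)
for (name in unique(df$individualname)) {
  rows <- df[df$individualname == name, ]
  if (nrow(rows) >= min_obs) {
    keep <- c(keep, name)
  }
}

# Keep only the rows belonging to individuals that passed the check.
df_filtered <- df[df$individualname %in% keep, ]
```

In this toy case individual "b" has only 2 rows, so all of its rows should be dropped while "a" and "c" are kept. This works, but it loops over every individual by hand, which is the part I'd like to avoid.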