I have a large dataset (more than 500,000 rows) that I want to filter in R. I only want to keep the most relevant information, so I thought it would be a good idea to keep only the rows whose value in a given column occurs more than some number of times. For example, I have this data:
A B
2 5
4 7
2 8
3 7
2 9
4 2
1 0
I want to keep the rows whose value in column A occurs more than once. In this case the output would be:
A B
2 5
4 7
2 8
2 9
4 2
I know how to do this with for loops and rbind, but because the dataset is very large, that approach is far too slow. Is there a faster way to do it?
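One vectorized way to avoid growing a data frame row by row is to compute, for every row, how often its A value occurs, and then subset with a logical vector. The following is only a minimal sketch in base R under the assumptions of the example above: the names `df`, `min_count`, `counts`, and `result` are made up for illustration, and the threshold of 1 comes from the example.

```r
# Example data from the question
df <- data.frame(A = c(2, 4, 2, 3, 2, 4, 1),
                 B = c(5, 7, 8, 7, 9, 2, 0))

# Hypothetical threshold: keep values of A that occur more than this many times
min_count <- 1

# ave() applies length() within each group of equal A values and returns
# the group size aligned with the original rows
counts <- ave(seq_along(df$A), df$A, FUN = length)

# Keep only the rows whose A value occurs more than min_count times
result <- df[counts > min_count, ]
print(result)
```

This does a single grouped count and one subsetting step, so no intermediate data frames are built with rbind. The original row order is preserved.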