I want to keep only a sample of the rows in my dataset, with a limit of 3 rows for each distinct value in a given column.
For example, say I want to keep a maximum of 3 rows per colour:
            X1     X2
1    0.7091409    RED
2   -1.1334614   BLUE
3    2.3343391    RED
4   -0.9040278  GREEN
5    0.4180331    RED
6    0.7572246    RED
7   -0.8996483   BLUE
8   -1.0356774   BLUE
9   -0.3983045  GREEN
10  -0.9060305   BLUE
Here, in column X2, RED appears 4 times, BLUE appears 4 times, and GREEN appears 2 times. I want to trim the dataset so that each value in column X2 appears in at most 3 rows. Keeping the first 3 rows for each colour, the above dataset would become:
            X1     X2
1    0.7091409    RED
2   -1.1334614   BLUE
3    2.3343391    RED
4   -0.9040278  GREEN
5    0.4180331    RED
6   -0.8996483   BLUE
7   -1.0356774   BLUE
8   -0.3983045  GREEN
Any ideas on how to achieve this?
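To make the goal concrete, here is a minimal sketch in plain Python of the behaviour I am after. It assumes the rows are held as a list of (X1, X2) tuples and that "keep 3" means the first 3 rows encountered for each value. The function name cap_rows_per_value and its parameters are made up for illustration, not taken from any library.

```python
from collections import defaultdict

# The example data from above, as (X1, X2) tuples.
rows = [
    (0.7091409, "RED"),
    (-1.1334614, "BLUE"),
    (2.3343391, "RED"),
    (-0.9040278, "GREEN"),
    (0.4180331, "RED"),
    (0.7572246, "RED"),
    (-0.8996483, "BLUE"),
    (-1.0356774, "BLUE"),
    (-0.3983045, "GREEN"),
    (-0.9060305, "BLUE"),
]


def cap_rows_per_value(rows, key_index, limit):
    """Keep at most `limit` rows per distinct value found at `key_index`.

    Rows are scanned in order, so the first `limit` rows for each
    value are kept and the original ordering is preserved.
    (Hypothetical helper, written only for this illustration.)
    """
    seen = defaultdict(int)  # value -> number of rows kept so far
    kept = []
    for row in rows:
        value = row[key_index]
        if seen[value] < limit:
            kept.append(row)
            seen[value] += 1
    return kept


trimmed = cap_rows_per_value(rows, key_index=1, limit=3)

# Print with fresh row numbers, as in the desired output above.
for i, (x1, x2) in enumerate(trimmed, start=1):
    print(i, x1, x2)
```

This drops the fourth RED row and the fourth BLUE row and leaves GREEN untouched. If a random sample per value is wanted instead of the first 3, one option under the same assumptions would be to shuffle a copy of the list with random.shuffle before calling the function, at the cost of losing the original row order.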