I have a dataset like this:
year | age | value
----------------------
2005 | 8 | 10
2005 | 8 | 12
2005 | 8 | 30
2005 | 8 | 12
2006 | 5 | 10
2006 | 5 | 20
2006 | 5 | 15
2006 | 5 | 20
2007 | 8 | 16
2007 | 8 | 20
2007 | 8 | 18
2007 | 5 | 20
The full dataset has about 50,000 rows.
What I want is a subset containing the rows with the highest 30% of values within each subgroup, like this:
year | age | value
----------------------
2005 | 8 | 30
2005 | 8 | 30
2006 | 5 | 20
2006 | 5 | 20
2007 | 8 | 20
2007 | 5 | 20
That is, the rows with the highest 30% of values for each combination of year and age (year 2005, age 8; year 2005, age 5; year 2006, age 8; and so on).
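To make the goal concrete, here is a minimal sketch in plain Python. The question does not name a language or tool, so Python is an assumption. The sketch also assumes that "highest 30%" means "at or above the group's 70th percentile, computed with linear interpolation"; the helper names `quantile` and `top_fraction_by_group` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def quantile(sorted_vals, q):
    """Linearly interpolated quantile of an ascending list, 0 <= q <= 1."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def top_fraction_by_group(rows, fraction=0.3):
    """Keep rows whose value is at or above the (1 - fraction) quantile
    of their own (year, age) group. Input order is preserved."""
    # Collect the values of each (year, age) group.
    groups = defaultdict(list)
    for year, age, value in rows:
        groups[(year, age)].append(value)
    # One cutoff per group (assumption: the 70th percentile for fraction=0.3).
    cutoffs = {key: quantile(sorted(vals), 1 - fraction)
               for key, vals in groups.items()}
    return [row for row in rows if row[2] >= cutoffs[(row[0], row[1])]]

# The sample rows from the question, as (year, age, value) tuples.
rows = [
    (2005, 8, 10), (2005, 8, 12), (2005, 8, 30), (2005, 8, 12),
    (2006, 5, 10), (2006, 5, 20), (2006, 5, 15), (2006, 5, 20),
    (2007, 8, 16), (2007, 8, 20), (2007, 8, 18), (2007, 5, 20),
]

subset = top_fraction_by_group(rows)
for row in subset:
    print(row)
```

Under these assumptions the sample data gives one row for year 2005, age 8 (the value 30 occurs only once in the input), two rows for year 2006, age 5, and one row each for the two 2007 groups. Ties at the cutoff are all kept, so a group can return more than 30% of its rows.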