I have a df that contains daily product and volume data:
date product volume
20160101 A 10
20160101 B 5
...
20160102 A 20
...
...
20160328 B 20
20160328 C 100
...
20160330 D 20
I've grouped it up by month via
df['yearmonth'] = df.date.astype(str).str[:6]
grouped = df.groupby(['yearmonth','product'])['volume'].sum()
which gives me a Series of the form:
yearmonth  product
201601     A          100
           B           90
           C           90
           D           85
           E          180
           F           50
...
201602     A          200
           C          120
           F          220
           G           40
           I           50
...
201603     B          120
           C          110
           D          110
...
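To make the setup reproducible, here is a minimal sketch of the grouping step. It assumes only the six rows that are actually visible in the listing above (the elided rows are left out, so the totals differ from the Series shown) and that the column is named `volume` in lowercase:

```python
import pandas as pd

# Hypothetical sample: only the rows visible in the listing above.
df = pd.DataFrame({
    "date": [20160101, 20160101, 20160102, 20160328, 20160328, 20160330],
    "product": ["A", "B", "A", "B", "C", "D"],
    "volume": [10, 5, 20, 20, 100, 20],
})

# The first six characters of the date string are the year and month.
df["yearmonth"] = df["date"].astype(str).str[:6]

# Sum volume per (yearmonth, product); the result is a Series
# with a two-level MultiIndex.
grouped = df.groupby(["yearmonth", "product"])["volume"].sum()
print(grouped)
```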
I want to return the rows with the top n volume values within each month. For example, the top 3 values would return the following (B and C tie at 90 in 201601, so that month shows four rows):
201601     A          100
           B           90
           C           90
           E          180
201602     A          200
           C          120
           F          220
201603     B          120
           C          110
           D          110
I can find some answers using pd.IndexSlice and select, but they seem to act on the index alone. I can't figure out how to sort the individual group's values. Questions I've looked at:
- Pandas report top-n in group and pivot (which is Wes's example in "Python for Data Analysis" too)
- pandas multi index sort specific fields
- pandas: slice a MultiIndex by range of secondary index
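To pin down the behaviour I am after, here is a rough sketch of one possible approach, not necessarily the best one. It assumes the monthly totals shown above (elided rows left out), a hypothetical helper named `top_n`, and that ties at the cutoff should be kept, which is what `keep="all"` does in `Series.nlargest`:

```python
import pandas as pd

# Monthly totals copied from the Series above (elided rows are left out).
index = pd.MultiIndex.from_tuples(
    [("201601", p) for p in "ABCDEF"]
    + [("201602", p) for p in "ACFGI"]
    + [("201603", p) for p in "BCD"],
    names=["yearmonth", "product"],
)
grouped = pd.Series(
    [100, 90, 90, 85, 180, 50, 200, 120, 220, 40, 50, 120, 110, 110],
    index=index,
    name="volume",
)

def top_n(s, n=3):
    # Hypothetical helper: the n largest values of one month's sub-Series.
    # keep="all" also keeps every row tied with the n-th largest value.
    return s.nlargest(n, keep="all")

# group_keys=False avoids prepending a second copy of the yearmonth level;
# sort_index restores the product order within each month.
top = (
    grouped.groupby(level="yearmonth", group_keys=False)
    .apply(top_n)
    .sort_index()
)
print(top)
```

On this sample the result has the same rows as the desired output above. Dropping `keep="all"` would cut each month to exactly n rows, which is the part I am unsure about.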