I have a dataframe with an abbreviated month name, day_of_month, and some additional data. I am trying to group by the month and then sort by the month, but the result comes out in alphabetical order (Apr, Aug, Feb, ...) instead of calendar order (Jan, Feb, Mar, ...).
Sample code:
import numpy as np
import pandas as pd

np.random.seed(1)
n = 30
df = pd.DataFrame({
    'months': np.random.choice(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'], size=n),
    'data': np.random.randint(1000, size=n),
    'day_of_month': np.random.randint(1, 31, size=n),
    'filename': np.random.choice(['f1', 'blah', 'foo', 'bar', 'meh'], size=n),
})
# Count rows per (month, day_of_month) pair; the month keys come back in alphabetical order
df.groupby(["months", "day_of_month"]).count()
Here is the sample output:
                     data  filename
months day_of_month
Aug    6                1         1
       24               1         1
       26               1         1
       30               1         1
Dec    10               1         1
       17               1         1
       23               1         1
Feb    5                1         1
       28               1         1
Jan    1                1         1
       16               1         1
       26               1         1
Jul    16               1         1
       20               1         1
Jun    19               1         1
       21               1         1
       27               1         1
How do I ensure that the dataframe is grouped by month and day_of_month, and then sorted by the months in proper time order?
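One possible approach, sketched here under the assumption that the months column only ever contains the twelve abbreviations above: convert it to an ordered pandas Categorical before grouping, so that groupby sorts the keys by category order rather than alphabetically. The names month_order and result are made up for this illustration, and observed=True is used so that months absent from the data do not show up as empty groups.

```python
import numpy as np
import pandas as pd

# Calendar order of the abbreviations (illustrative name, not from the question)
month_order = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

# Same sample data as in the question
np.random.seed(1)
n = 30
df = pd.DataFrame({
    'months': np.random.choice(month_order, size=n),
    'data': np.random.randint(1000, size=n),
    'day_of_month': np.random.randint(1, 31, size=n),
    'filename': np.random.choice(['f1', 'blah', 'foo', 'bar', 'meh'], size=n),
})

# An ordered categorical sorts by position in `categories`, not alphabetically
df['months'] = pd.Categorical(df['months'], categories=month_order, ordered=True)

# groupby sorts group keys by default; observed=True keeps only the
# (month, day) combinations that actually occur in the data
result = df.groupby(['months', 'day_of_month'], observed=True).count()
print(result)
```

With this, the months level of the resulting index should follow Jan, Feb, Mar, ... and day_of_month stays sorted numerically within each month.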