1

Dataframe without using the .sort_values()

Please how can I sort this dataframe, such that for each year, the data is sorted in descending order. I tried using the .sort_values() I got this dataframe below and which is not what I want.

Dataframe when I use .sort_values()

A simple Dataframe that can be used to represent the given dataframe in the picture.

myDic = {'Custmers':['A1','A1','A1','B4','A7','B4'],'Month':['May','June','June','June','July','August'],'Type':['Bag','Bag','Food','Drink', 'Cow','Cup'],'Year':[2004,2004,2005,2004, 2005, 2004]}

df = pd.DataFrame(myDic)
df.groupby(['Year','Month']).count()

How do I sort this given dataframe such that the june row in 2004 comes before august?

Achebe Peter
  • 131
  • 8
  • Please provide minimal reproducible data in text form. It will make easier for people to help you – Hamza Nov 27 '20 at 23:25
  • 1
    Does this answer your question? [How to sort a Pandas DataFrame by index?](https://stackoverflow.com/questions/22211737/how-to-sort-a-pandas-dataframe-by-index) – cronoik Nov 27 '20 at 23:29
  • Hy Hamza, I have added a sample dataframe to represent what I wish to achieve. – Achebe Peter Nov 27 '20 at 23:54

2 Answers2

0

Simplest way to add a new column which gets numeric form of each month and sort by it. Which you can drop after sorting if you do not want it in result:

from time import strptime
df['Month_Num'] = [strptime(month,'%B').tm_mon for month in df.index.get_level_values('Month')]
df.sort_values(by = ['Year','Month_Num']).drop('Month_Num', axis =1)

P.S. Used your df.groupby result in same df variable. Result looks like:

                   Custmers  Type
Year    Month       
2004    May                1    1
        June               2    2
        August             1    1
2005    June               1    1
        July               1    1

If you want descending sorting for any of the axes, just pass a list of bools ro the sort_values() function as ascending = [False, True]

Hamza
  • 5,373
  • 3
  • 28
  • 43
0

This line of code seem to do the trick for me.

merged_df.groupby(['Year','Month_Name']).count().sort_values(['Year','Customer'], ascending = [True, False])
Achebe Peter
  • 131
  • 8