I have the following DataFrame:

import pandas as pd

df = pd.DataFrame(
    {
        'OrderDayName':['Friday', 'Monday', 'Saturday', 'Sunday', 'Thursday', 'Tuesday', 'Wednesday'],
        'ItemTotal':[4073.4199999999996, 6787.059999999996, 2965.2599999999984, 4416.439999999998, 4260.839999999998, 4378.229999999999, 3476.1600000000008]
    }
)

DataFrame1

I want to order the DataFrame based on the day of the week, meaning it should start with Monday and end with Sunday.

I tried copying the code from the following posts:

pandas dataframe group and sort by weekday

Error: astype() got an unexpected keyword argument 'categories'

but the code below didn't work for me:

from pandas.api.types import CategoricalDtype
cats = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
cat_type = CategoricalDtype(categories=cats, ordered=True)
df['OrderDayName'] = df['OrderDayName'].astype(cat_type)

The DataFrame doesn't change - it is exactly the same as the one pictured above.

Can someone show me what I've done wrong?

Thanks

Edit: I think I got it. The following works:

cats = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
df = df.groupby(['OrderDayName']).sum().reindex(cats)
df = df.reset_index(drop=False)
IamWarmduscher
  • Your problem is that you only converted `OrderDayName` to a CategoricalDtype but never sorted on that column. Use `df.sort_values('OrderDayName')` to sort on the `OrderDayName` column. – Ynjxsjmh Apr 04 '22 at 06:10
  • You can also do something like: `df.set_index('OrderDayName').loc[cats].reset_index()`. This is similar to this question: https://stackoverflow.com/questions/30009948/how-to-reorder-indexed-rows-based-on-a-list-in-pandas-data-frame/71714574 –  Apr 04 '22 at 06:39
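
To make the two suggestions in the comments concrete, here is a minimal sketch using the data from the question (totals rounded for readability). The names `by_loc` and `by_sort` are illustrative only and do not appear in the original posts.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

cats = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
df = pd.DataFrame(
    {
        'OrderDayName': ['Friday', 'Monday', 'Saturday', 'Sunday', 'Thursday', 'Tuesday', 'Wednesday'],
        'ItemTotal': [4073, 6787, 2965, 4416, 4260, 4378, 3476],
    }
)

# Suggestion from the second comment: index by day name, then select
# the rows in the desired order (works on the plain string column).
by_loc = df.set_index('OrderDayName').loc[cats].reset_index()

# Converting the dtype only records the category order;
# the rows themselves stay where they are.
df['OrderDayName'] = df['OrderDayName'].astype(CategoricalDtype(categories=cats, ordered=True))

# Suggestion from the first comment: sort explicitly on the categorical column.
by_sort = df.sort_values('OrderDayName').reset_index(drop=True)

print(by_sort)
```

Both results start with Monday and end with Sunday; the difference is only whether the column ends up as a categorical or stays a plain string column.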

1 Answer

Pandas 0.15 introduced the Categorical data type. With it, you can define the order used for sorting, but you still have to sort on the column afterwards:

import pandas as pd

df = pd.DataFrame(
    {
        'OrderDayName': ['Friday', 'Monday', 'Saturday', 'Sunday', 'Thursday', 'Tuesday', 'Wednesday'],
        'ItemTotal': [4073, 6787, 2965, 4416, 4260, 4378, 3476],
    }
)
sorted_weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
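# Attach the weekday order to the column; this alone does not move any rows.
df['OrderDayName'] = pd.Categorical(df['OrderDayName'], categories=sorted_weekdays, ordered=True)
# Sorting on the categorical column follows the category order, not alphabetical order.
print(df.sort_values("OrderDayName"))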
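
Since the question's edit goes through `groupby(...).sum().reindex(cats)`, it may be worth noting that a categorical column also carries its order through a groupby, so the extra `reindex` should not be needed. The following is a small sketch under that assumption, reusing the data above; the name `totals` is illustrative, and `observed=True` is passed explicitly because the default for categorical groupers differs between pandas versions.

```python
import pandas as pd

sorted_weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
df = pd.DataFrame(
    {
        'OrderDayName': ['Friday', 'Monday', 'Saturday', 'Sunday', 'Thursday', 'Tuesday', 'Wednesday'],
        'ItemTotal': [4073, 6787, 2965, 4416, 4260, 4378, 3476],
    }
)
df['OrderDayName'] = pd.Categorical(df['OrderDayName'], categories=sorted_weekdays, ordered=True)

# With the default sort=True, groups come back in category order,
# so the result already runs Monday through Sunday.
totals = (
    df.groupby('OrderDayName', observed=True)['ItemTotal']
    .sum()
    .reset_index()
)
print(totals)
```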
Christian Karcher