I want to group my dataframe by two columns (Orig and Dest) to calculate the total Qty per Orig-Dest pair, but I don't want the groupby to change the row order. I need the results of the groupby to be ordered by ascending Drivetime within each Orig.
In the example below, the Drivetime from A to B1 is 1 and from A to A1 is 4, so I need the groupby to return A to B1 as the first result and A to A1 as the second. Right now, the results default to alphabetical order. I need it this way because in the next step I want to calculate the cumulative percentage of Qty per Orig, so the order matters. Can someone please help?
import pandas as pd
# Create dataframe
df = pd.DataFrame({'Orig': ['A', 'A', 'A', 'B', 'C'],
                   'Dest': ['A1', 'A1', 'B1', 'B2', 'C1'],
                   'Qty': [100, 200, 300, 400, 500],
                   'Drivetime': [4, 4, 1, 3, 5]})
# Sort the rows so that, within each Orig, Drivetime is increasing
df.sort_values(by =['Orig', 'Drivetime'], inplace=True)
df
# Calculate the total quantity per Orig-Dest pair
total_qty_per_orig = df.groupby(['Orig', 'Dest']).agg({'Qty': 'sum'})
total_qty_per_orig
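For reference, here is a minimal sketch of the result I am after. It assumes that passing sort=False to groupby keeps the groups in order of first appearance, so sorting beforehand would carry through, and that each Orig-Dest pair has a single Drivetime. The names sorted_df, totals, and CumPct are only illustrative and not part of my original code.

```python
import pandas as pd

df = pd.DataFrame({'Orig': ['A', 'A', 'A', 'B', 'C'],
                   'Dest': ['A1', 'A1', 'B1', 'B2', 'C1'],
                   'Qty': [100, 200, 300, 400, 500],
                   'Drivetime': [4, 4, 1, 3, 5]})

# Sort first so that, within each Orig, rows are in increasing Drivetime
sorted_df = df.sort_values(by=['Orig', 'Drivetime'])

# sort=False keeps groups in order of first appearance instead of
# re-sorting them by the group keys (the default behaviour)
totals = (sorted_df
          .groupby(['Orig', 'Dest'], sort=False)
          .agg({'Qty': 'sum'})
          .reset_index())

# Cumulative percentage of Qty within each Orig, in the preserved order
orig_groups = totals.groupby('Orig', sort=False)['Qty']
totals['CumPct'] = orig_groups.cumsum() / orig_groups.transform('sum') * 100

print(totals)
```

With the sample data, A to B1 comes before A to A1, and the cumulative percentages for Orig A are 50 and 100.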