0

Still new into pandas but is there a way to sort df by subtotal of each group.

 Area   Unit   Count
  A     A1      5
  A     A2      2
  B     B1      10
  B     B2      1 
  B     B3      3
  C     C1      10

So I want to sort them by subtotal of each Area which results to A subtotal = 7, B subtotal=14, C subtotal = 10 The sort should be like

 Area   Unit   Count
  B     B1      10
  B     B2      1 
  B     B3      3
  C     C1      10
  A     A1      5
  A     A2      2

*Note that despite the value of B3 > B1 it should not be affected by the sort.

  • 1
    possible duplicate of [Pandas group-by and sum](https://stackoverflow.com/questions/39922986/pandas-group-by-and-sum) – tidakdiinginkan Apr 21 '20 at 03:34
  • It is slightly different as I needed to sort them just by the group that has a higher value. Thanks anyway the answer is below.. – Mohamed Afiq Apr 21 '20 at 08:04

1 Answers1

1

create a helper column 'sorter', which is the sum of the count variable, and sort ur dataframe with it

df['sorter'] = df.groupby("Area").Count.transform('sum')

df.sort_values('sorter',ascending=False).reset_index(drop=True).drop('sorter',axis=1)


  Area  Unit    Count
0   B   B1       10
1   B   B2      1
2   B   B3      3
3   C   C1      10
4   A   A1      5
5   A   A2      2
sammywemmy
  • 27,093
  • 4
  • 17
  • 31