12

Here is my dataframe:

import pandas as pd
df = pd.DataFrame({'A': ['one', 'one', 'two', 'two', 'one'],
                   'B': ['Ar', 'Br', 'Cr', 'Ar', 'Ar'],
                   'C': ['12/15/2011', '11/11/2001', '08/30/2015', '07/3/1999', '03/03/2000'],
                   'D': [1, 7, 3, 4, 5]})

My goal is to group by column A and sort within grouped results by column B.

Here is what I came up with:

sort_group = df.sort_values('B').groupby('A')

I was hoping the grouping operation would preserve the sort order, but it does not work, and it returns a groupby object rather than a dataframe:

<pandas.core.groupby.DataFrameGroupBy object at 0x0000000008B190B8>

Any suggestions?

Cleb
user1700890
  • Possible duplicate of [pandas groupby sort within groups](https://stackoverflow.com/questions/27842613/pandas-groupby-sort-within-groups) – Sean.H Jan 13 '19 at 15:25

2 Answers

25

You cannot call sort_values directly on a groupby object; you need to go through apply instead:

df.groupby('A').apply(lambda x: x.sort_values('B'))

gives you the desired output:

         A   B           C  D
A                            
one 0  one  Ar  12/15/2011  1
    4  one  Ar  03/03/2000  5
    1  one  Br  11/11/2001  7
two 3  two  Ar   07/3/1999  4
    2  two  Cr  08/30/2015  3
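
If the extra outer index level is not wanted, passing group_keys=False to groupby should keep the original flat index. This is a minimal sketch using the dataframe from the question; kind='stable' is added here only so that ties in B keep their original relative order, and the exact handling of the grouping column inside apply differs between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'A': ['one', 'one', 'two', 'two', 'one'],
                   'B': ['Ar', 'Br', 'Cr', 'Ar', 'Ar'],
                   'C': ['12/15/2011', '11/11/2001', '08/30/2015', '07/3/1999', '03/03/2000'],
                   'D': [1, 7, 3, 4, 5]})

# group_keys=False: do not prepend the group label 'A' as an outer index level.
# kind='stable' (an addition, not part of the original answer) keeps tied rows
# in their original relative order.
result = df.groupby('A', group_keys=False).apply(
    lambda g: g.sort_values('B', kind='stable'))

print(result)
```

The rows come back in the same order as above, but indexed by the original row labels only.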
Cleb
1

I usually just use sort_values on both columns, which implicitly groups the rows by column A and sorts them by column B within each group:

sort_group = df.sort_values(['A', 'B'])

which will give you this:

     A   B           C  D
0  one  Ar  12/15/2011  1
4  one  Ar  03/03/2000  5
1  one  Br  11/11/2001  7
3  two  Ar   07/3/1999  4
2  two  Cr  08/30/2015  3

This returns a normal DataFrame, so you can continue your analysis on it directly.
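
The same call also accepts one sort direction per column, which can help when the order within groups should differ from the order of the groups themselves. This is a small sketch on the dataframe from the question; sorting B in descending order is only an illustration and not something the question asks for.

```python
import pandas as pd

df = pd.DataFrame({'A': ['one', 'one', 'two', 'two', 'one'],
                   'B': ['Ar', 'Br', 'Cr', 'Ar', 'Ar'],
                   'C': ['12/15/2011', '11/11/2001', '08/30/2015', '07/3/1999', '03/03/2000'],
                   'D': [1, 7, 3, 4, 5]})

# One flag per sort column: A ascending (group order), B descending (within each group).
sorted_desc = df.sort_values(['A', 'B'], ascending=[True, False])

# The result is an ordinary DataFrame, so it can be grouped again if needed,
# for example to count the rows per value of A.
counts = sorted_desc.groupby('A').size()

print(sorted_desc)
print(counts)
```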

Veronica