I would like to sort the following dataframe:

Region  LSE          North       South
0       Cn       33.330367    9.178917
1       Develd  -36.157025  -27.669988
2       Wetnds  -38.480206  -46.089908
3       Oands   -47.986764  -32.324991
4       Otherg  323.209834   28.486310
5       Soys     34.936147    4.072872
6       Wht       0.983977  -14.972555

I would like to sort it so the LSE column is reordered based on the list:

lst = ['Oands','Wetnds','Develd','Cn','Soys','Otherg','Wht']

Of course, the other columns will need to be reordered accordingly as well. Is there a way to do this in pandas?

user308827
  • This [question](http://stackoverflow.com/questions/13838405/custom-sorting-in-pandas-dataframe) might help. – YS-L Nov 03 '14 at 03:22

1 Answer

The improved support for Categoricals in pandas version 0.15 allows you to do this easily:

df['LSE_cat'] = pd.Categorical(
    df['LSE'], 
    categories=['Oands','Wetnds','Develd','Cn','Soys','Otherg','Wht'], 
    ordered=True
)
df.sort('LSE_cat')
Out[5]: 
   Region     LSE       North      South LSE_cat
3       3   Oands  -47.986764 -32.324991   Oands
2       2  Wetnds  -38.480206 -46.089908  Wetnds
1       1  Develd  -36.157025 -27.669988  Develd
0       0      Cn   33.330367   9.178917      Cn
5       5    Soys   34.936147   4.072872    Soys
4       4  Otherg  323.209834  28.486310  Otherg
6       6     Wht    0.983977 -14.972555     Wht

If this is only a one-off ordering, keeping a Categorical version of the LSE column may not be what you want. But if you expect to reuse this ordering several times in different contexts, Categoricals are a great solution.
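For the temporary case, one possible alternative is to sort with a lookup of list positions instead of storing a Categorical column. This is only a sketch, and it assumes pandas 1.1 or later, where `DataFrame.sort_values` accepts a `key` argument. The `rank` dictionary and the `temp_sorted` name are illustrative and not part of the original answer, and only two columns of the question's data are reproduced to keep it short.

```python
import pandas as pd

# Two columns from the question's data, enough to show the idea.
df = pd.DataFrame({
    "LSE": ["Cn", "Develd", "Wetnds", "Oands", "Otherg", "Soys", "Wht"],
    "North": [33.330367, -36.157025, -38.480206, -47.986764,
              323.209834, 34.936147, 0.983977],
})

lst = ["Oands", "Wetnds", "Develd", "Cn", "Soys", "Otherg", "Wht"]

# Hypothetical helper: map each label to its position in the desired order.
rank = {name: position for position, name in enumerate(lst)}

# Assumption: pandas >= 1.1, where sort_values accepts `key`.
# The key function receives the whole column as a Series and must
# return a same-length Series of sortable values.
temp_sorted = df.sort_values("LSE", key=lambda s: s.map(rank))
print(temp_sorted)
```

Nothing is added to `df` here, so the custom ordering exists only for this one call.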


In later versions of pandas, `sort` has been replaced with `sort_values`, so you would instead need:

df.sort_values('LSE_cat')
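
Putting the pieces together, a self-contained version of the approach might look like the sketch below. The DataFrame is rebuilt by hand from the values in the question, and the `sorted_df` name is an illustrative choice rather than something from the original post.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "Region": [0, 1, 2, 3, 4, 5, 6],
    "LSE": ["Cn", "Develd", "Wetnds", "Oands", "Otherg", "Soys", "Wht"],
    "North": [33.330367, -36.157025, -38.480206, -47.986764,
              323.209834, 34.936147, 0.983977],
    "South": [9.178917, -27.669988, -46.089908, -32.324991,
              28.486310, 4.072872, -14.972555],
})

lst = ["Oands", "Wetnds", "Develd", "Cn", "Soys", "Otherg", "Wht"]

# An ordered Categorical sorts by the position of each label in `categories`,
# not alphabetically.
df["LSE_cat"] = pd.Categorical(df["LSE"], categories=lst, ordered=True)

# sort_values returns a new DataFrame; df itself keeps its original row order.
sorted_df = df.sort_values("LSE_cat")
print(sorted_df)
```

Because `sort_values` returns a copy by default, the result has to be assigned to a name (or `inplace=True` passed) for the new order to persist.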
user3483203
Marius
  • This is an old post, but as google sent me here, it's worth adding that for pandas version 0.23.1 (and likely earlier versions), `.sort` has been replaced so you need: `df.sort_values('LSE_cat', inplace=True)` – doctorer Jul 24 '18 at 08:35
  • This is useful! Is there also a way to fill in with NaN if a category does not appear? E.g. if Oands did not exist in the initial dataframe, you would still want the row to appear with NaNs at 'North' and 'South' columns. How to do this? – Newbielp Jul 07 '21 at 08:50