1

There is a similar question here but not exactly what I'm looking for.

I want to sort a dataframe based on a dictionary that specifies the column(s) to sort by as well as the order for each column.

Example:

df =
+-------+-------+-----------+------+
| Index | Time  |   Month   | Year |
+-------+-------+-----------+------+
|     0 | 13:00 | January   | 2018 |
|     1 | 14:30 | March     | 2015 |
|     2 | 12:00 | November  | 2003 |
|     3 | 10:15 | September | 2012 |
|     4 | 13:30 | October   | 2012 |
|     5 | 06:25 | June      | 2012 |
|     6 | 07:50 | August    | 2019 |
|     7 | 09:20 | May       | 2015 |
|     8 | 22:30 | July      | 2016 |
|     9 | 23:05 | April     | 2013 |
|    10 | 21:10 | April     | 2008 |
+-------+-------+-----------+------+

sort_dict = {'Month': 'Ascending', 'Year': 'Descending', 'Time': 'Ascending'}

df.sort_values(by=sort_dict)

df = 
+-------+-------+-----------+------+
| Index | Time  |   Month   | Year |
+-------+-------+-----------+------+
|     0 | 13:00 | January   | 2018 |
|     1 | 14:30 | March     | 2015 |
|     9 | 23:05 | April     | 2013 |
|    10 | 21:10 | April     | 2008 |
|     7 | 09:20 | May       | 2015 |
|     5 | 06:25 | June      | 2012 |
|     8 | 22:30 | July      | 2016 |
|     6 | 07:50 | August    | 2019 |
|     3 | 10:15 | September | 2012 |
|     4 | 13:30 | October   | 2012 |
|     2 | 12:00 | November  | 2003 |
+-------+-------+-----------+------+

Any help is appreciated thanks!

Column index would also be fine:

sort_dict = {2: 'Ascending', 3: 'Descending', 1: 'Ascending'}
Izak Joubert
  • 906
  • 11
  • 29

1 Answers1

2

EDIT: (thanks @Jon Clements)

In python 3.6 declaring sort_dict the key order will be as specified, however, prior to 3.6, dict literals won't necessarily preserve order. eg. in 3.5, declaring sort_dict ends up being {'Month': 'Ascending', 'Time': 'Ascending', 'Year': 'Descending'}... which is going to be a different iteration order - thus different sort results.

If need always same order is possible use OrderedDict or Series by constructor - there order not depends of version of python.

One possible solution is create helper Series, then convert index to list and pass also parameter ascending filled boolean list:

s = pd.Series(sort_dict)
print (s)
Month     Ascending
Year     Descending
Time      Ascending
dtype: object

df = df.sort_values(by=s.index.tolist(), ascending = (s == 'Ascending'))
print (df)
        Time      Month  Year
Index                        
9      23:05      April  2013
10     21:10      April  2008
6      07:50     August  2019
0      13:00    January  2018
8      22:30       July  2016
5      06:25       June  2012
1      14:30      March  2015
7      09:20        May  2015
2      12:00   November  2003
4      13:30    October  2012
3      10:15  September  2012
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • 3
    Should probably warn the OP here that starting with a `dict` is going to potentially cause issues... In 3.6... declaring `sort_dict` the key order will be as specified, however, prior to 3.6, dict literals won't necessarily preserve order... eg... in 3.5, declaring `sort_dict` ends up being `{'Month': 'Ascending', 'Time': 'Ascending', 'Year': 'Descending'}`... which is going to be a different iteration order - thus different sort results. – Jon Clements May 06 '19 at 12:33
  • You also don't really need to make that Series... assuming the order of the dict is correct... `df.sort_values(list(sort_dict), ascending=[s == 'Ascending' for s in sort_dict.values()])` is fine – Jon Clements May 06 '19 at 12:35
  • Thanks for the warning @JonClements. How would you allow the user to specify order without having to specify in `.sort_values` call? – Izak Joubert May 06 '19 at 12:36
  • @Zack You could use an `OrderedDict` so that iterating over it preserves insertion order, but building that the user is likely to have to use 2-tuples anyway... so I'm not quite sure what advantage there is here your general approach. – Jon Clements May 06 '19 at 12:38
  • Is there no way to lock the dictionary in it's current order? – Izak Joubert May 06 '19 at 12:41
  • @ZackJoubert - If need version independent solution with `dict`, then not. But `OrderedDict` or `Series` working nice. – jezrael May 06 '19 at 12:42
  • @Zack yes - using an `OrderedDict`... but there's no way to "lock" a dictionary in it's "current order" - because it *doesn't have order*... :) – Jon Clements May 06 '19 at 12:42
  • These dictionaries are anarchists :D. Thanks guys, I'll implement [this](https://stackoverflow.com/questions/15711755/converting-dict-to-ordereddict) solution on the user end and convert it to an ordered dictionary in the code – Izak Joubert May 06 '19 at 12:45