65

I'm looking for a simple way to sort a pandas dataframe by the absolute value of a particular column, but without actually changing the values within the dataframe. Something similar to sorted(df, key=abs). So if I had a dataframe like:

    a   b
0   1   -3
1   2   5 
2   3   -1
3   4   2
4   5   -9

The resultant sorted data when sorting on 'b' would look like:

    a   b
2   3   -1
3   4   2
0   1   -3
1   2   5 
4   5   -9
cs95
  • 379,657
  • 97
  • 704
  • 746
afinit
  • 985
  • 1
  • 9
  • 16

4 Answers4

69

UPDATE

Since 0.17.0 order and sort have been deprecated (thanks @Ruggero Turra), you can use sort_values to achieve this now:

In[16]:

df.reindex(df.b.abs().sort_values().index)
Out[16]: 
   a  b
2  3 -1
3  4  2
0  1 -3
1  2  5
4  5 -9
cs95
  • 379,657
  • 97
  • 704
  • 746
EdChum
  • 376,765
  • 198
  • 813
  • 562
  • I'd love to see a version of this that can do this across multiple indices, so that, e.g., if index1 can be either "Fruit" or "Vegetables", and index2 is any fruit or vegetable, and the values are the total spent on that item, it would be possible to sort the data frame by absolute value of the cost of the type of food (index1), and *then*, internally, by the cost of each type of fruit or vegetable. – HaPsantran Jul 26 '16 at 20:12
  • 10
    just for everyone who is here and is searching for the values with the biggest abs listet first, change the code to: `df.reindex(df.b.abs().sort_values(ascending=False).index)` – PV8 Apr 16 '20 at 10:37
  • An add-on question: what if I want to sort by multiple columns and only use abs() for one of them for sorting. Can I modify this code to make it work? – Bowen Liu Jan 24 '23 at 23:38
69

Perfect Simple Solution with Pandas >= V_1.1.0:

Use the parameter key in the sort_values() function:

import pandas as pd
df = pd.DataFrame({'A': ['a', 'b', 'c', 'd', 'e', 'f'], 'B': [-3, -2, -1, 0, 1, 2]})

df.sort_values(by='B', key=abs)

will yield:

    A   B
3   d   0
2   c   -1
4   e   1
1   b   -2
5   f   2
0   a   -3
Lucecpkn
  • 971
  • 6
  • 9
  • 2
    Had not noticed the `key` kwarg until seeing this answer! – Shan Dou Apr 23 '21 at 06:19
  • `ttt.sort_values(by='B', key=pd.Series.abs)` also works, perhaps as well with older versions of Pandas? – Cactus Philosopher Jul 14 '21 at 17:59
  • @CactusPhilosopher: Not sure about the older versions, because according to [the doc](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html), the `key` arg is "new in version 1.1.0.", [released on July 28, 2020](https://pandas.pydata.org/pandas-docs/dev/whatsnew/v1.1.0.html). – Lucecpkn Jul 15 '21 at 16:10
  • An add-on question: what if I want to sort by multiple columns and only use abs() for one of them for sorting. Can I modify this code to make it work? – Bowen Liu Jan 24 '23 at 23:38
  • @BowenLiu : I can only think of another way: ```df['B_abs'] = df['B'].abs()``` and then ```df.sort_values(['B_abs', 'A'])```. – Lucecpkn Feb 02 '23 at 16:08
18

Towards more idiomatic pandas: Use argsort

A cleaner approach would be to call Series.argsort on the absolute values, and then index:

df.iloc[df['b'].abs().argsort()]

   a  b
2  3 -1
3  4  2
0  1 -3
1  2  5
4  5 -9

If you need to reset the index, use Series.reset_index,

df.iloc[df['b'].abs().argsort()].reset_index(drop=True)

   a  b
0  3 -1
1  4  2
2  1 -3
3  2  5
4  5 -9

Lastly, since argsort does not have an ascending parameter to specify ascending/descending order, you will need to negate df['b'].abs() to sort by descending order.

df.iloc[(-df['b'].abs()).argsort()]

   a  b
4  5 -9
1  2  5
0  1 -3
3  4  2
2  3 -1

You can do this with NumPy as well—use np.abs and ndarray.argsort.

df.iloc[np.abs(df['b'].values).argsort()]

   a  b
2  3 -1
3  4  2
0  1 -3
1  2  5
4  5 -9

Or, for descending order,

df.iloc[(-np.abs(df['b'].values)).argsort()]

   a  b
4  5 -9
1  2  5
0  1 -3
3  4  2
2  3 -1
cs95
  • 379,657
  • 97
  • 704
  • 746
0

For series: series_name.sort_values(key=abs)

Then incorporating this with the dataframe : df.reindex(series_name.sort_values(key=abs).index)

abir
  • 1
  • 2
  • how does this add to the existing answer? – C8H10N4O2 Oct 27 '22 at 11:33
  • @C8H10N4O2 instead of thinking of the whole dataframe, just sort the target series / column of the dataframe and then re-index the rest of the columns accordingly. – abir Oct 28 '22 at 16:15