1

I have a dataframe:

data = {
    'group': ['2', '1', '2', '2', '2', '1'],
    'interval': ['20-30', '20-30', '30-40', '10-20', '10-20', '0-10'],
    'count': [3, 4, 2, 7, 5, 1],
}
df = pd.DataFrame(data)
group   interval    count
2          20-30    3
1          20-30    4
2          30-40    2
2          10-20    7
2          10-20    5
1          0-10     1

I want to sort group first then interval simultaneously in ascending order which will look like this:

group   interval    count
1       00-10   1
1       20-30   4
2       10-20   5
2       10-20   7
2       20-30   3
2       30-40   2

I know how to do this separately but how to do it simultaneously?

(
    df
    .sort_values(by = ['group'], key = lambda s: s.str[0:].astype(int))
    .sort_values(by = ['interval'], key = lambda s: s.str[:2].astype(int))
    .reset_index(drop=True)
)
Barmar
  • 741,623
  • 53
  • 500
  • 612
shsh
  • 684
  • 1
  • 7
  • 18

1 Answers1

2

If you want to use different custom keys in the same sort_values call, you could use a dictionary:

keys = {'group': lambda s: s.str[0:].astype(int),
        'interval': lambda s: s.str[:2].astype(int)
       }

df.sort_values(by=['group', 'interval'],
               key=lambda x: keys[x.name](x),
               ignore_index=True)

Or with a custom function:

def cust_sort(s):
    match s.name:
        case 'group':
            return s.str[0:].astype(int)
        case 'interval':
            return s.str[:2].astype(int)
        case _:
            return s

df.sort_values(by=['group', 'interval'],
               key=cust_sort,
               ignore_index=True)

Output:

  group interval  count
0     1    00-10      1
1     1    20-30      4
2     2    10-20      7
3     2    10-20      5
4     2    20-30      3
5     2    30-40      2
mozway
  • 194,879
  • 13
  • 39
  • 75