How to sort column by value counts in pandas

Question

I want to sort a dataframe in pandas. I wanna do it by sorting 2 columns by value counts. One depends on the other. As seen in the image, I have achieved categorical sorting. However, I want the column 'category' to be sorted by value counts. And then the dataframe is to be sorted again based on 'beneficiary_name' under the same category.

This is the code I have written to achieve this till now.

data_category = data_category.sort_values(by=['category','beneficiary_name'], ascending=False)

Please help me figure this out. Thanks.

Could you please provide a dataframe in your question? See [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) for help — Stef, Dec 08 '21 at 15:42
assign the counts with `df['counts'] = df.groupby('category')['data'].transform('size')`. Then you can sort on `counts` and `beneficiary_name`. — Quang Hoang, Dec 08 '21 at 15:45
@QuangHoang This works! but how do I sort it ascending only for beneficiary_name. When I try ascending true, it does it for counts as well. — saucypanda, Dec 08 '21 at 16:01
thank you so much! Do you wanna post this ans? So I can accept it as the correct one. — saucypanda, Dec 08 '21 at 16:04
@saucypanda Hello, my answer below was downvoted without explication, which leaves me to wonder whether I've misunderstood the question? Could you please clarify? — Stef, Dec 15 '21 at 11:40

score -1 · Answer 1 · answered Dec 08 '21 at 15:48

Inspired by this related question:

Create column of value_counts in Pandas dataframe

import pandas as pd

df = pd.DataFrame({'id': range(9), 'cat': list('ababaacdc'), 'benef': list('uuuuiiiii')})

print(df)
#    id cat benef
# 0   0   a     u
# 1   1   b     u
# 2   2   a     u
# 3   3   b     u
# 4   4   a     i
# 5   5   a     i
# 6   6   c     i
# 7   7   d     i
# 8   8   c     i

df['cat_count'] = df.groupby(['cat'])['id'].transform('count')

print(df)
#    id cat benef  cat_count
# 0   0   a     u          4
# 1   1   b     u          2
# 2   2   a     u          4
# 3   3   b     u          2
# 4   4   a     i          4
# 5   5   a     i          4
# 6   6   c     i          2
# 7   7   d     i          1
# 8   8   c     i          2

df = df.sort_values(by=['cat_count', 'cat', 'benef'], ascending=False)

print(df)
#    id cat benef  cat_count
# 0   0   a     u          4
# 2   2   a     u          4
# 4   4   a     i          4
# 5   5   a     i          4
# 6   6   c     i          2
# 8   8   c     i          2
# 1   1   b     u          2
# 3   3   b     u          2
# 7   7   d     i          1

How to sort column by value counts in pandas

1 Answers1