Groupby and count() with alias and 'normal' dataframe: python pandas versus mssql

Question

Coming from a SQL environment, I am learning some things in Python Pandas. I have a question regarding grouping and aggregates.

Say I group a dataset by Age Category and count the different categories. In MSSQL I would write this:

SELECT AgeCategory, COUNT(*) AS Cnt
FROM TableA
GROUP BY AgeCategory
ORDER BY 1

The result set is a 'normal' table with two columns, the second column I named Count.

When I want to do the equivalent in Pandas, the groupby object is different in format. So now I have to reset the index and rename the column in a following line. My code would look like this:

grouped = df.groupby('AgeCategory')['ColA'].count().reset_index()
grouped.columns = ['AgeCategory', 'Count']
grouped

My question is if this can be accomplished in one go. Seems like I am over-doing it, but I lack experience.

Thanks for any advise.

Regards, M.

Possible duplicate of [Pandas create new column with count from groupby](https://stackoverflow.com/questions/29836477/pandas-create-new-column-with-count-from-groupby) — AJS, Apr 05 '19 at 09:02
Related: https://stackoverflow.com/a/55435185, titled "**Count ALL Rows per Group (Series/DataFrame): GroupBy.size**" (getting count in groupby, count v/s size... scroll down) — cs95, Apr 05 '19 at 09:44

jezrael · Accepted Answer · 2019-04-05T08:58:06.277

2

Use parameter name in DataFrame.reset_index:

grouped = df.groupby('AgeCategory')['ColA'].count().reset_index(name='Count')

Or:

grouped = df.groupby('AgeCategory').size().reset_index(name='Count')

Difference is GroupBy.count exclude missing values, GroupBy.size not.

More information about aggregation in pandas.

edited Apr 05 '19 at 08:58

answered Apr 05 '19 at 08:54

jezrael

822,522
95
1,334
1,252

Groupby and count() with alias and 'normal' dataframe: python pandas versus mssql

1 Answers1