
I can successfully fill my new column with group counts, but I suspect there is a simpler way:

# How do I simplify this?

def f(gr):
    return pd.Series([gr['class_name'].count()] * gr.shape[0], index=gr.index)

df['class_size'] = df.groupby("class_name").apply(f).reset_index(level=0, drop=True)
column_list = ['class_name', 'class_size']
df[column_list].head(5)

Gets:

This is just the first few rows of data - see how the same class name has the same class count?

Dave Babbitt

2 Answers


I think you need transform:

df['class_size'] = df.groupby('class_name')['class_name'].transform('size')

Or:

df['class_size'] = df.groupby('class_name')['class_name'].transform('count')

See also: What is the difference between size and count in pandas? In short, size counts every row in the group, while count skips missing values (NaN).
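To make the size/count distinction concrete, here is a minimal sketch. The sample data and the `score` column are made up for illustration; only `class_name` and `class_size` come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "score" is an invented column with some NaNs.
df = pd.DataFrame({
    "class_name": ["a", "a", "b", "a", "b", "c"],
    "score": [1.0, np.nan, 3.0, 4.0, np.nan, 6.0],
})

# transform returns a result aligned with the original rows,
# so it can be assigned straight back as a new column.
df["class_size"] = df.groupby("class_name")["class_name"].transform("size")

# "count" ignores NaN, so on a column with missing values it can be
# smaller than the group size.
df["score_count"] = df.groupby("class_name")["score"].transform("count")

print(df)
```

If the column being counted has no missing values, as is presumably the case for `class_name` here, both variants should give the same result.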

Graham
jezrael

Depending on your DataFrame shape you can also just do a count on the groupby:

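import pandas as pd
# Example frame: one row per letter, plus a constant helper column to count
df = pd.DataFrame({'class names': list('abracadabra'), 'class count': 1})
# Returns one row per class with its row count (not one value per original row)
df.groupby('class names').count().reset_index()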
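
Note that this gives one row per class rather than a value for every original row. If a per-row column is what you want, one possible way (a sketch using the same toy data, not necessarily the best approach for your frame) is to map the aggregated counts back onto the class column:

```python
import pandas as pd

# Same toy data as above, without the helper column.
df = pd.DataFrame({"class names": list("abracadabra")})

# Series of group sizes, indexed by class name.
counts = df.groupby("class names").size()

# Look up each row's class in that Series to get a per-row count.
df["class count"] = df["class names"].map(counts)

print(df)
```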
Sebastiaan