Pandas: Create a new column filled with the number of observations in another column

Question

I have a DataFrame object df. One of the columns in df is ID. There are many rows with the same ID.

I want to create a new column, num_totals, that counts the number of observations for each ID. For example, something like this:

ID | Num Totals
1  |    3
1  |    3
1  |    3
2  |    2
2  |    2
3  |    3
3  |    3
3  |    3
4  |    1

What's the fastest way to do this in Pandas?

score 5 · Accepted Answer · answered Oct 09 '13 at 07:52

A simple groupby+transform would work:

df['num_totals'] = df.groupby('ID')['ID'].transform('count')
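
As a minimal sketch of how this plays out, assuming a frame that holds only the sample IDs from the question (the data below is illustrative, not the asker's real data), and selecting the ID column explicitly so that transform returns a single Series rather than a DataFrame:

```python
import pandas as pd

# Illustrative data mirroring the ID column in the question's example.
df = pd.DataFrame({"ID": [1, 1, 1, 2, 2, 3, 3, 3, 4]})

# Count the rows in each ID group and broadcast that count
# back onto every row belonging to the group.
df["num_totals"] = df.groupby("ID")["ID"].transform("count")

print(df)
```

Note that 'count' skips missing values in the selected column; if every row should be counted regardless of NaN, transform('size') is probably the safer choice.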

Rutger Kassies

Got it. Brand new to PANDAS, and I wasn't sure if that would add the totals to each row in ID or just one of them. – Parseltongue Oct 09 '13 at 07:53
If you used `.count()` or `.agg('count')` instead of `transform()`, the result would indeed collapse to one row per unique ID. `transform`, by contrast, broadcasts the result back to the original dimensions. – Rutger Kassies Oct 09 '13 at 07:56
Got it, then transform is what I need. Thanks! – Parseltongue Oct 09 '13 at 07:57
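
To make the difference discussed in these comments concrete, here is a small sketch comparing the two calls. The frame and its `value` column are hypothetical, introduced only for illustration:

```python
import pandas as pd

# Hypothetical data: three rows, two distinct IDs.
df = pd.DataFrame({"ID": [1, 1, 2], "value": [10.0, 20.0, 30.0]})

# Aggregation: one entry per unique ID, indexed by ID.
collapsed = df.groupby("ID")["value"].count()

# Transformation: one entry per original row, aligned with df's index.
broadcast = df.groupby("ID")["value"].transform("count")

print(collapsed)
print(broadcast)
```

Because the transformed result keeps the original index, it can be assigned straight back as a new column, whereas the aggregated result would first have to be mapped or merged back onto the rows.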