Adding a series filled with Rolling count, sum or mean

Question

Inspired by another post, here is my problem. I have this df:

   col
0  B
1  B
2  A
3  A
4  A
5  B

and I need this output:

   col col_frequencies
0  B   1
1  B   2
2  A   1 
3  A   2
4  A   3
5  B   3

# The value in row 5 continues the count from row 1. I do not want the frequency counter to be reset.

Something like COUNTIF in Excel.

Thanks in advance from a total beginner, G.

score 0 · Answer 1 · answered May 13 '20 at 18:05

You can use the value_counts() method of pandas to get the frequency of each value.
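
For comparison, here is a minimal sketch of what value_counts() returns on the question's data, assuming the column is named col as in the question (the col_total column name is only for illustration). It yields one total per distinct value, which is not the same as the running count shown in the desired output.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'col': ['B', 'B', 'A', 'A', 'A', 'B']})

# Total number of occurrences of each distinct value
totals = df['col'].value_counts()
print(totals)

# Mapping the totals back onto the rows gives every occurrence of a value
# the same number, not a counter that grows row by row
df['col_total'] = df['col'].map(totals)
print(df)
```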

Shubh Patni

score 0 · Accepted Answer · answered May 13 '20 at 18:05

You can do this in two stages:

Group all rows with the same col value. This can be done using groupby().
Get the position of each row within its group. You do this with cumcount() (which starts from zero, so you need to add 1 to it).

All in one:

df['col_frequencies'] = df.groupby(['col']).cumcount() + 1
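
The two stages can also be sketched separately on the question's data. This is only a minimal illustration, assuming the column is named col; the intermediate column zero_based is a hypothetical name added to make the zero-based numbering visible.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'col': ['B', 'B', 'A', 'A', 'A', 'B']})

# Stage 1: group rows by their value in 'col'
grouped = df.groupby('col')

# Stage 2: number the rows inside each group, starting at 0
df['zero_based'] = grouped.cumcount()

# Add 1 so that the first occurrence of a value counts as 1
df['col_frequencies'] = df['zero_based'] + 1
print(df)
```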

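For example (apologies for the lazy column names):

import pandas as pd

# A single unnamed column, which pandas labels 0
df = pd.DataFrame(['B', 'B', 'A', 'A', 'A', 'B'])

# Running count of each value, starting at 1
df['Col'] = df.groupby([0]).cumcount() + 1
print(df)

Output:

   0  Col
0  B    1
1  B    2
2  A    1
3  A    2
4  A    3
5  B    3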

score 0 · Answer 3 · answered May 13 '20 at 18:12

This should solve your problem:

Let's say your data frame is named df.

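counts = {}   # occurrences seen so far for each value
running = []  # running count for each row, in row order
for value in df['col']:
    # Increment the counter for this value (starting from 0 if unseen)
    counts[value] = counts.get(value, 0) + 1
    running.append(counts[value])

df['col_frequencies'] = running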

The output will be:

   col col_frequencies
0  B   1
1  B   2
2  A   1
3  A   2
4  A   3
5  B   3
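
As a sanity check, the loop-based counter can be compared with the groupby()/cumcount() approach from the accepted answer. This is a minimal sketch on the question's data; the helper name running_count is hypothetical and not part of pandas.

```python
import pandas as pd

def running_count(values):
    """Return, for each item, how many times it has appeared so far."""
    seen = {}
    out = []
    for v in values:
        seen[v] = seen.get(v, 0) + 1
        out.append(seen[v])
    return out

# Sample data from the question
df = pd.DataFrame({'col': ['B', 'B', 'A', 'A', 'A', 'B']})

# Plain-Python loop versus the vectorized pandas expression
loop_result = running_count(df['col'])
vectorized = (df.groupby('col').cumcount() + 1).tolist()

print(loop_result == vectorized)
```

Both give the same counts here. On large frames the vectorized version is generally the faster of the two, since it avoids iterating row by row in Python.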
