I am trying to return a cumulative count
of the amount of times there's a change of values in a column
.
So for the df
below, I want to return a running count of the times that 'Home'
changes to 'Away'
and vice versa. I don't want to return the amount of times a value is displayed.
import pandas as pd
d = ({
'Who' : ['Home','Away','','','Home','Away','Home','Home','Home','','Away','Home'],
})
df = pd.DataFrame(data = d)
I tried this.
df['Home_count'] = (df['Who'] == 'Home').cumsum()
df['Away_count'] = (df['Who'] == 'Away').cumsum()
Which returns:
Who Home_count Away_count
0 Home 1 0
1 Away 1 1
2 1 1
3 1 1
4 Home 2 1
5 Away 2 2
6 Home 3 2
7 Home 4 2
8 Home 5 2
9 5 2
10 Away 5 3
11 Home 6 3
But I'm trying to count the amount of times it changes. Not the total count of each value. So if it reads Home, Home, Home, Away, there should only be a count next to Away. Not 1,2,3 against Home.
Home 1 #Theres a change so provide a count
Home #No change so no count
Home #No change so no count
Away 1 #Theres a change so provide a count
Home 2 #Theres a change so provide a count
Please refer to intended output:
Intended output:
Count_Away Count_Home Who
0 1 Home
1 1 Away
2
3
4 2 Home
5 2 Away
6 3 Home
7 Home
8 Home
9
10 3 Away
11 4 Home