Say I have a pandas DataFrame in Python:
import pandas as pd

df = pd.DataFrame(
    {
        "A": [1, 2, 2, 1, 3, 1],
        "B": ["a", "b", "c", "d", "e", "f"],
    }
)
I'd like to get a series, Count, in which the m-th element is the number of times the m-th element of df["A"] has appeared in df["A"][0:m]. Equivalently, a series with the number of times that each value has appeared before. So, in our example, the desired result would be:
0 0
1 0
2 1
3 1
4 0
5 2
Name: Count, dtype: float64
One way to achieve this is to use a while loop that builds the series one element at a time, using (df.A[0:i] == df.iloc[i]["A"]).sum() for an index i going from 0 up to (but not including) the length of df.
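For concreteness, the loop I have in mind could be sketched roughly as follows. This is only a minimal sketch: the names counts and count are placeholders made up for illustration, and it uses .iloc for positional slicing, which should behave the same as df.A[0:i] here because the index is the default 0..5.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": [1, 2, 2, 1, 3, 1],
        "B": ["a", "b", "c", "d", "e", "f"],
    }
)

counts = []  # illustrative name: one "seen before" count per row
i = 0
while i < len(df):
    # Compare the first i values of A (positions 0..i-1) with the i-th value
    # and count how many of them match.
    counts.append(int((df["A"].iloc[:i] == df["A"].iloc[i]).sum()))
    i += 1

# Wrap the list in a Series aligned with df's index.
count = pd.Series(counts, name="Count", index=df.index)
print(count.tolist())  # [0, 0, 1, 1, 0, 2]
```

It works, but it rescans the earlier rows for every element, so the cost grows quadratically with the length of df.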
The thing is that I don't know whether this feature already comes with pandas DataFrames. I know about df['Count'] = df.groupby('A')['A'].transform('count'), after which df["Count"] outputs
0 3
1 2
2 2
3 3
4 1
5 3
Name: Count, dtype: float64
that is, the total number of times each element appears in the whole series: a result similar to what I want.
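As a self-contained version of that total-count approach, here is a short sketch on the same example data. It only uses groupby and transform('count') as described above; nothing else is assumed.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": [1, 2, 2, 1, 3, 1],
        "B": ["a", "b", "c", "d", "e", "f"],
    }
)

# For every row, the size of the group its A value belongs to,
# i.e. how often that value occurs in the whole column.
df["Count"] = df.groupby("A")["A"].transform("count")
print(df["Count"].tolist())  # [3, 2, 2, 3, 1, 3]
```

Every row with the same value of A gets the same number, so this gives totals per value rather than the running count I am after.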
So, my question is: are there ways of getting what I want that are simpler than the while method mentioned above and that, perhaps, resemble the latter method for counting the total number of appearances?