How does one count the number of previous rows in a dataframe that contain the same value as a cell in that row?
Given a dataframe, e.g.:
In [1]: df1 = pd.DataFrame({'lkey': ['foo', 'bar', 'baz', 'foo', 'foo', 'bar', 'baz', 'foo', 'foo']})
In [2]: df1
Out[2]:
  lkey
0  foo
1  bar
2  baz
3  foo
4  foo
5  bar
6  baz
7  foo
8  foo
I would like to add a column that contains, for each row, the number of times that row's lkey value appears in the lkey column in all previous rows of the dataframe.
My dataframe has roughly 100000 rows and 15 columns, and my attempt at a for loop was useless.
The desired output would be:
In [3]: df1['lkeyCount'] = (number of times lkey appears in previous rows in lkey column)
In [4]: df1
Out[4]:
  lkey  lkeyCount
0  foo          0
1  bar          0
2  baz          0
3  foo          1
4  foo          2
5  bar          1
6  baz          1
7  foo          3
8  foo          4
Thanks in advance!
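For reference, the count described above looks like a running count within each group of equal lkey values. A minimal sketch, assuming pandas' `groupby(...).cumcount()` is acceptable here (it is one possible approach, not necessarily the only or fastest one); the column name lkeyCount is taken from the desired output above:

```python
import pandas as pd

df1 = pd.DataFrame({'lkey': ['foo', 'bar', 'baz', 'foo', 'foo', 'bar', 'baz', 'foo', 'foo']})

# cumcount() numbers the rows inside each lkey group 0, 1, 2, ... in their
# original order, so each row gets the number of earlier rows with the same value.
df1['lkeyCount'] = df1.groupby('lkey').cumcount()

print(df1)
```

Because this avoids an explicit Python loop over rows, it should scale much better to a frame of about 100000 rows, though that is worth timing on the real data.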