Replace each element in pandas Series by how often it previously occurred

Asked Sep 15 '20 at 10:01

Active Sep 15 '20 at 14:01


Assume there is a pandas Series containing values reflecting categories ('a','b','c' or 1,2,3):

import pandas as pd
pds = pd.Series(['a','a','b','c','c','c','a'])

I would like to generate a new Series that indicates how often each element has already occurred, i.e. the expected output would be:

pds_result = pd.Series([0,1,0,0,1,2,2])
                   #    ^ no 'a' prior to this position in pds
                   #      ^ one 'a' prior to this position in pds
                   #        ^ no 'b' prior to this position in pds
                   #                ^ two 'a' prior to this position in pds

How can this be achieved in a concise manner?

Arco Bast


`I am stuck with pandas 19.2` ? Are you sure? [link](https://pandas.pydata.org/pandas-docs/version/0.19.0/generated/pandas.core.groupby.GroupBy.cumcount.html) – jezrael Sep 15 '20 at 10:11
https://pandas.pydata.org/pandas-docs/version/0.15.0/generated/pandas.core.groupby.GroupBy.cumcount.html – jezrael Sep 15 '20 at 10:12

you are right, ofc. So I would do `pds.groupby(pds).cumcount()` right? ... a bit weird but works. – Arco Bast Sep 15 '20 at 10:17
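
For reference, the approach from the comments can be sketched as follows. This is a minimal example, assuming a pandas version that provides `GroupBy.cumcount` (the docs linked above list it for 0.15.0 and 0.19.0); the spaced-out list literal and the `print` call are only there for illustration.

```python
import pandas as pd

pds = pd.Series(['a', 'a', 'b', 'c', 'c', 'c', 'a'])

# Group the Series by its own values. cumcount() numbers the rows of each
# group 0, 1, 2, ... in order of appearance, which equals the number of
# earlier occurrences of the same value.
pds_result = pds.groupby(pds).cumcount()

print(pds_result.tolist())  # [0, 1, 0, 0, 1, 2, 2]
```

The result keeps the index of `pds`, so it can be aligned with the original Series directly. The same call should work for numeric categories such as 1, 2, 3.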
