Python Pandas : Conditional rolling count

Question

Here is my input dataframe:

type
a   
a   
a   
a   
a   
b   
b   
a   
a   
a

This is my expected output:

type,   id
a   ,   1
a   ,   2
a   ,   3
a   ,   4
a   ,   5
b   ,   5
b   ,   5
a   ,   6
a   ,   7
a   ,   8

I need to generate the `id` column from the `type` column. There are two types, 'a' and 'b'. As long as the type is 'a', the ID should increment; for 'b', it should keep the ID of the previous 'a'. How can I do this in a Pandas dataframe?

Possible duplicate: https://stackoverflow.com/questions/25119524/pandas-conditional-rolling-count — petezurich, Dec 04 '18 at 12:23
@petezurich, There should be a duplicate somewhere, but I don't think that's the one. — jpp, Dec 04 '18 at 12:32
I'm wondering how this question falls under the `too broad` category? – Mohamed Thasin ah, Dec 04 '18 at 12:40
@MohamedThasinah, I think it's groupthink (the first guy chose it, others follow). This is *probably* a duplicate, I just couldn't find it. It could be no MCVE too, but not too broad. — jpp, Dec 04 '18 at 12:42

score 6 · Accepted Answer · answered Dec 04 '18 at 12:13

You can count the cumulative sum of a Boolean series indicating when your series equals a value:

df['id'] = df['type'].eq('a').cumsum()
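
A minimal, self-contained sketch of the line above. It assumes the question's data sits in a DataFrame named `df` with a single `type` column; the construction of `df` is an assumption, since the question shows only the values.

```python
import pandas as pd

# Rebuild the question's input (the construction itself is assumed).
df = pd.DataFrame({"type": list("aaaaabbaaa")})

# eq('a') gives True for 'a' rows and False otherwise;
# cumsum treats True as 1 and False as 0, so only 'a' rows advance the count.
df["id"] = df["type"].eq("a").cumsum()

print(df)
```

Because the mask is Boolean, the resulting `id` column is integer-typed rather than float.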

answered Dec 04 '18 at 12:13

jpp


What happens to type 'b'? – JackJack Dec 04 '18 at 12:31
Where you have type `b`, `df['type'].eq('a')` gives `False`, which is equal to `0`, so `cumsum` leaves the running total unchanged on those rows. – jpp Dec 04 '18 at 12:31
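
To make the comment above concrete, here is a small sketch that lays the intermediate mask next to the running total. The names `s`, `is_a` and `steps` are only illustrative and do not come from the thread.

```python
import pandas as pd

s = pd.Series(list("aaaaabbaaa"), name="type")

# Boolean mask: True for 'a', False for 'b'.
is_a = s.eq("a")

# Show the mask as 0/1 next to its cumulative sum.
steps = pd.DataFrame({
    "type": s,
    "is_a": is_a.astype(int),
    "id": is_a.cumsum(),
})
print(steps)
```

The `b` rows contribute 0, so `id` stays at the value reached by the last `a`. One consequence is that a series starting with `b` would get `id` 0 on those leading rows.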

score 3 · Answer 2 · answered Dec 04 '18 at 12:39

I tried it this way. @jpp's answer is clearly the neatest one, but I took this approach just to give another idea.

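import numpy as np
import pandas as pd
df = pd.DataFrame({'col1': ['a','a','a','a','a','b','b','a','a','a']})
# Running count within each group, stored as float so the column can hold NaN
df['type'] = (df.groupby('col1').cumcount() + 1).astype(float)
# Blank out the 'b' rows, then carry the last 'a' count forward
df.loc[df['col1'] == 'b', 'type'] = np.nan
df['type'] = df['type'].ffill()
print(df)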

Output:

  col1  type
0    a   1.0
1    a   2.0
2    a   3.0
3    a   4.0
4    a   5.0
5    b   5.0
6    b   5.0
7    a   6.0
8    a   7.0
9    a   8.0
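
The output above is float because the column has to hold NaN before the forward fill. A sketch of the same groupby idea that ends with integers, assuming a new `id` column name and an ID of 0 for any leading non-'a' rows (neither is specified in the thread):

```python
import pandas as pd

df = pd.DataFrame({"col1": list("aaaaabbaaa")})

is_a = df["col1"].eq("a")

# Number rows within each group, then keep the numbers only on 'a' rows;
# the other rows become missing values.
counts = df.groupby("col1").cumcount().add(1).where(is_a)

# Carry the last 'a' count forward. Leading non-'a' rows (none in this data)
# would have nothing to carry, so they are filled with 0 by assumption.
df["id"] = counts.ffill().fillna(0).astype(int)

print(df)
```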

score 0 · Answer 3 · answered May 06 '19 at 18:04

If your DataFrame is df:

df[df=='a'].expanding().count()
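
The one-liner above applies the expanding window to a column of strings, which may not aggregate cleanly on every pandas version (this is an assumption about version behavior, not something stated in the thread). A hedged variant of the same expanding-window idea that runs on a numeric 0/1 mask instead:

```python
import pandas as pd

df = pd.DataFrame({"type": list("aaaaabbaaa")})

# 1 for 'a' rows, 0 otherwise, so the window only ever sees numbers.
is_a = df["type"].eq("a").astype(int)

# The expanding sum counts the 'a' rows seen so far; it returns floats,
# so cast back to int for an ID column.
df["id"] = is_a.expanding().sum().astype(int)

print(df)
```

This gives the same IDs as the cumulative-sum approach; `expanding().sum()` is mainly useful if other expanding statistics are wanted alongside it.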

answered May 06 '19 at 18:04

r_hudson
