Python Pandas Conditional Calculation row by row

Question

I have a dataset with a column A that holds a score. I want to compute a column B that counts for how many consecutive rows the score has been equal to or higher than the score in the previous row. If the score is lower than in the previous row, B should be reset to 0 for that row.

I tried the following:

import pandas as pd

df = pd.DataFrame({'A': [140, 145, 148, 150, 100, 105, 106]})

df['B'] = 0
df.loc[df['A'].diff() >= 0, 'B'] = df['B'].shift() + 1

The result is the following:

    A       B
0   140     0.0
1   145     1.0
2   148     1.0
3   150     1.0
4   100     0.0
5   105     1.0
6   106     1.0

I understand that the condition is checked for each row, but the calculation appears to be done for all rows at once. B is therefore not incremented as expected: when the right-hand side is evaluated, the shifted value of B (the previous row) is still 0 for every row.
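
A minimal sketch of what I think is happening, using the same data as above. The name `rhs` is only introduced here for illustration. It shows that the right-hand side is a complete Series computed once from the all-zero column, before anything is assigned:

```python
import pandas as pd

df = pd.DataFrame({'A': [140, 145, 148, 150, 100, 105, 106]})
df['B'] = 0

# The right-hand side is built once, from B as it is right now (all zeros).
rhs = df['B'].shift() + 1
print(rhs.tolist())  # [nan, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

# The assignment then only copies those precomputed values into the masked rows,
# so B can never become larger than 1.
mask = df['A'].diff() >= 0
df.loc[mask, 'B'] = rhs
print(df['B'].max())
```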

How can I get the following result?

    A       B
0   140     0
1   145     1
2   148     2
3   150     3
4   100     0
5   105     1
6   106     2

Does this answer your question? [Python pandas cumsum with reset everytime there is a 0](https://stackoverflow.com/questions/45964740/python-pandas-cumsum-with-reset-everytime-there-is-a-0) — G. Anderson, Feb 28 '20 at 16:59
I came up with solution similar to @G.Anderson's link. So, this question qualifies for a dup: `m = df.A.diff().ge(0); df['B'] = m.cumsum() - m.cumsum().where(~m).ffill()` — Andy L., Feb 28 '20 at 17:56
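
The one-liner in the comment above can be unpacked as follows. This is only a sketch on the question's sample data; the names `running` and `baseline` are introduced here for readability and are not part of the original comment. The idea is to take a running count of non-decreasing rows and subtract the value that count had at the most recent reset row:

```python
import pandas as pd

df = pd.DataFrame({'A': [140, 145, 148, 150, 100, 105, 106]})

m = df['A'].diff().ge(0)               # True where A did not decrease
running = m.cumsum()                   # running count of True rows over the whole column
baseline = running.where(~m).ffill()   # value of that count at the latest reset row

# The subtraction yields floats because where() introduces NaN, so cast back to int.
df['B'] = (running - baseline).astype(int)
print(df)
```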

score 3 · Accepted Answer · answered Feb 28 '20 at 16:56

You need cumsum to identify the blocks, and groupby().cumcount() to enumerate the rows within each block:

s = df['A'].diff().ge(0)
df['B'] = s.groupby((~s).cumsum()).cumcount()

Output:

    A       B
0   140     0
1   145     1
2   148     2
3   150     3
4   100     0
5   105     1
6   106     2
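
To see why this works, it may help to look at the intermediate Series side by side. The sketch below uses the question's sample data; `block_id` and `steps` are names introduced here for illustration only:

```python
import pandas as pd

df = pd.DataFrame({'A': [140, 145, 148, 150, 100, 105, 106]})

s = df['A'].diff().ge(0)     # True where A is equal to or higher than the previous row
block_id = (~s).cumsum()     # goes up by one at every reset row, so each run shares an id

steps = pd.DataFrame({
    'A': df['A'],
    's': s,
    'block_id': block_id,
    'B': s.groupby(block_id).cumcount(),  # position of each row within its block, from 0
})
print(steps)
```

Every block starts at a reset row (including the first row, where diff() is NaN and the comparison is therefore False), so that row gets count 0 and each following row in the block counts one higher.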

Quang Hoang

Thank you so much! This is exactly what I was looking for! – fleshstorm Feb 29 '20 at 09:45
