2

I have a dataframe df that looks like this:

ID Months Borrow_Rank
1 0 1
1 1 1
1 2 1
2 0 1
2 1 1
2 2 1
3 0 2
3 1 2
4 0 1
4 1 1

I want to create a new variable Months_Adjusted that starts counting from 0 for as long as Borrow_Rank remains the same.

ID Months Borrow_Rank Months_Adjusted
1 0 1 0
1 1 1 1
1 2 1 2
2 0 1 3
2 1 1 4
2 2 1 5
3 0 2 0
3 1 2 1
4 0 1 0
4 1 1 1

Thank you all and I apologise if I could have written the question better. This is my first post.

2 Answers2

1
import pandas as pd

df = pd.DataFrame({'Borrow_Rank':[1,1,1,1,1,1,1,2,2,2,2,2,3,3,3,1,1,1]})
selector = (df['Borrow_Rank'] != df['Borrow_Rank'].shift()).cumsum()
df['Months_Adjusted'] = df.groupby(selector).cumcount()
Vitalizzare
  • 4,496
  • 7
  • 13
  • 32
0

Here is a solution using itertools.groupby -

from itertools import groupby
df['Months_Adjusted'] = pd.concat(
          [pd.Series(range(len(list(g)))) 
           for k, g in groupby(df['Borrow_Rank'])],
ignore_index=True)

Output

   ID  Months  Borrow_Rank  Months_Adjusted
0   1       0            1                0
1   1       1            1                1
2   1       2            1                2
3   2       0            1                3
4   2       1            1                4
5   2       2            1                5
6   3       0            2                0
7   3       1            2                1
8   4       0            1                0
9   4       1            1                1
Mortz
  • 4,654
  • 1
  • 19
  • 35