I´m working an oil wells database and I need to get a column counting months since first production.
Note that I need for every natural month the counting value (for instance to compare every well production level in its second month in production) so the solutions with groupby/agg are not useful in this case.
I ´ve doubts on how to achieve the desired month_count column
id | year | month | month_count |
---|---|---|---|
A | 2022 | 6 | 1 |
A | 2022 | 7 | 2 |
A | 2022 | 8 | 3 |
A | 2022 | 9 | 4 |
B | 2022 | 8 | 1 |
B | 2022 | 9 | 2 |
C | 2022 | 5 | 1 |
C | 2022 | 6 | 2 |
C | 2022 | 7 | 3 |
C | 2022 | 8 | 4 |
C | 2022 | 9 | 5 |
Here´s the code, including the desired output (month_count) column
import pandas as pd
data = {
'id' : ['A','A','A','A','B','B','C','C','C','C','C'],
'year' : [2022,2022,2022,2022,2022,2022,2022,2022,2022,2022,2022],
'month' : [6,7,8,9,8,9,5,6,7,8,9],
'month_count' : [1,2,3,4,1,2,1,2,3,4,5],
}
df = pd.DataFrame(data)
Which solution would be the best in your opinion?