So it seems to me that you're really trying to create a Pandas: conditional rolling count.
Except when the rain
value is 1
you want zeros and when the rain
value is 0
you want the rolling count.
To do this, you can get the rolling total for all groups in rain
, then go back through the DataFrame and set the hours_since_rainfall
values to 0
where rain
is 1
.
import pandas as pd
df = pd.DataFrame({
'date': {
0: '2005-10-15 16:00:00', 1: '2005-10-15 17:00:00',
2: '2005-10-15 18:00:00', 3: '2005-10-15 19:00:00',
4: '2005-10-15 20:00:00', 5: '2005-10-15 21:00:00',
6: '2005-10-15 22:00:00', 7: '2005-10-15 23:00:00',
8: '2005-10-16 00:00:00', 9: '2005-10-16 01:00:00'
},
'rain': {
0: 1, 1: 1, 2: 0, 3: 0, 4: 0, 5: 0, 6: 1, 7: 1, 8: 0, 9: 0
}
})
df['hours_since_rainfall'] = df.groupby([(df['rain'] != df['rain'].shift()).cumsum()]) \
.cumcount() + 1
df.loc[df['rain'] == 1, 'hours_since_rainfall'] = 0
print(df)
Alternatively, you can conditionally transform
based on the rain
value.
If the first value in the group is a 0 (all of the values in the group are 0), you can replace the series with the range
from 1
to len + 1
.
If the first value in the group is a 1 (all the values in the group are 1), you can replace it with [0] * len
or, basically, an equal length series of zeros.
import pandas as pd
df = pd.DataFrame({
'date': {
0: '2005-10-15 16:00:00', 1: '2005-10-15 17:00:00',
2: '2005-10-15 18:00:00', 3: '2005-10-15 19:00:00',
4: '2005-10-15 20:00:00', 5: '2005-10-15 21:00:00',
6: '2005-10-15 22:00:00', 7: '2005-10-15 23:00:00',
8: '2005-10-16 00:00:00', 9: '2005-10-16 01:00:00'
},
'rain': {
0: 1, 1: 1, 2: 0, 3: 0, 4: 0, 5: 0, 6: 1, 7: 1, 8: 0, 9: 0
}
})
df['hours_since_rainfall'] = df \
.groupby([(df['rain'] != df['rain'].shift()).cumsum()])['rain'] \
.transform(lambda x: range(1, len(x) + 1) if x.iloc[0] == 0 else [0] * len(x))
print(df)
Both Produce:
date rain hours_since_rainfall
0 2005-10-15 16:00:00 1 0
1 2005-10-15 17:00:00 1 0
2 2005-10-15 18:00:00 0 1
3 2005-10-15 19:00:00 0 2
4 2005-10-15 20:00:00 0 3
5 2005-10-15 21:00:00 0 4
6 2005-10-15 22:00:00 1 0
7 2005-10-15 23:00:00 1 0
8 2005-10-16 00:00:00 0 1
9 2005-10-16 01:00:00 0 2
As Dict:
{'date': {0: '2005-10-15 16:00:00', 1: '2005-10-15 17:00:00',
2: '2005-10-15 18:00:00', 3: '2005-10-15 19:00:00',
4: '2005-10-15 20:00:00', 5: '2005-10-15 21:00:00',
6: '2005-10-15 22:00:00', 7: '2005-10-15 23:00:00',
8: '2005-10-16 00:00:00', 9: '2005-10-16 01:00:00'},
'rain': {0: 1, 1: 1, 2: 0, 3: 0, 4: 0,
5: 0, 6: 1, 7: 1, 8: 0, 9: 0},
'hours_since_rainfall': {0: 0, 1: 0, 2: 1, 3: 2, 4: 3,
5: 4, 6: 0, 7: 0, 8: 1, 9: 2}}