Let us suppose I have the following dataframe:
data = {'measure_id': ['0', '1', '2', '3', '4'],
'measure_time': ['2019-11-22 22:30:08.559000', '2019-11-22 22:36:09.149000', '2019-11-22 22:36:09.149000', '2019-11-22 22:40:09.261000', '2019-11-22 22:46:10.011000'],
'pressure': [10, 9, 3, 11, 12]}
df = pd.DataFrame(data, columns=['measure_id', 'measure_time', 'pressure'])
df.set_index('measure_id', inplace=True)
print('df:\n', df)
I need to compute for each row the time rolling sum of all precedent rows, that is where the difference:
measure_time of current row - measure_time of precedent row
is inferior to some value (let us say 2 minutes for example), the current row included in the sum. This would give here the column rolling_sum:
data = {'measure_id': ['0', '1', '2', '3', '4'],
'measure_time': ['2019-11-22 22:30:08.559000', '2019-11-22 22:36:09.149000', '2019-11-22 22:36:09.149000', '2019-11-22 22:40:09.261000', '2019-11-22 22:46:10.011000'],
'pressure': [10, 9, 3, 11, 12],
'rolling_sum':[10, 9, 12, 11, 12]}
df = pd.DataFrame(data, columns=['measure_id', 'measure_time', 'pressure', 'rolling_sum'])
df.set_index('measure_id', inplace=True)
print('df:\n', df)
It seems now Pandas can handle rolling with offset Pandas: rolling mean by time interval , but I have not been able to make this solution work here (probably because of the non-time index).