This is a follow-up question to this other question: Causal resampling: Sum over the last X <time_unit>
Say I have the following time series:
money_spent
timestamp
2014-10-06 18:00:40.063000-04:00 0.568000
2014-10-06 18:00:41.361000-04:00 3.014770
2014-10-06 18:00:42.896000-04:00 0.878154
2014-10-06 18:00:43.040000-04:00 0.723077
2014-10-06 18:00:44.791000-04:00 0.723077
2014-10-06 18:00:45.496000-04:00 0.309539
2014-10-06 18:00:45.799000-04:00 3.032000
2014-10-06 18:00:47.470000-04:00 3.014770
2014-10-06 18:00:48.092000-04:00 1.584616
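To make the example reproducible, the series above could be rebuilt roughly as follows. The variable name `df` and the choice to parse via UTC and then convert to US/Eastern are assumptions made for illustration; the values themselves are copied from the table.

```python
import pandas as pd

# Timestamps and values copied from the table above
timestamps = [
    "2014-10-06 18:00:40.063000-04:00",
    "2014-10-06 18:00:41.361000-04:00",
    "2014-10-06 18:00:42.896000-04:00",
    "2014-10-06 18:00:43.040000-04:00",
    "2014-10-06 18:00:44.791000-04:00",
    "2014-10-06 18:00:45.496000-04:00",
    "2014-10-06 18:00:45.799000-04:00",
    "2014-10-06 18:00:47.470000-04:00",
    "2014-10-06 18:00:48.092000-04:00",
]
money_spent = [0.568000, 3.014770, 0.878154, 0.723077, 0.723077,
               0.309539, 3.032000, 3.014770, 1.584616]

# Parse as UTC first, then convert so the index is tz-aware in US/Eastern
index = pd.to_datetime(timestamps, utc=True).tz_convert("US/Eastern")

# `df` is a hypothetical name for the dataframe in the question
df = pd.DataFrame({"money_spent": money_spent}, index=index)
df.index.name = "timestamp"
```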
I would like to sample it:
- At pre-defined time points (e.g. a range of timestamps every 2.5 seconds starting from 18:00 until 19:00)
- For every sample, get the sum of spend within the interval.
Update with example
For example, assuming that I generate a set of pre-defined timestamps as follows:
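import datetime
import pandas as pd
import pytz

# Start at 18:00 US/Eastern. Use localize(): passing a pytz timezone as tzinfo
# to the datetime constructor applies a local-mean-time offset (-04:56), not EDT.
eastern = pytz.timezone('US/Eastern')
start_time = eastern.localize(datetime.datetime(year=2014,
                                                month=10,
                                                day=6,
                                                hour=18))
# Finish 400 seconds later
end_time = start_time + datetime.timedelta(seconds=400)
my_new_timestamps = pd.date_range(start=start_time,
                                  end=end_time,
                                  freq='2.5s')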
I would like to resample my original dataframe (at the top of the post) at the locations defined by my_new_timestamps, taking the sum of money_spent for each interval.
Note that the original dataframe only covers from ~18:00:40 until ~18:00:48, so if I do:
resample('2.5S', how='sum', label='right')
the command above will only return samples on the time window between these two times, and not between the start and end times defined by my_new_timestamps. It would also sample on 2.5s intervals that are different from the ones I want (those defined by my_new_timestamps).
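To make the desired output concrete, here is a minimal sketch of one possible way to compute it. It is an illustration, not necessarily the best approach. It assumes each pre-defined timestamp should carry the sum over the 2.5 s interval that ends at it (right-labelled and right-closed, matching label='right'), and that empty intervals should be 0. The names `df`, `positions` and `resampled` are hypothetical, and the dataframe is rebuilt here only so the block runs on its own.

```python
import pandas as pd

# Stand-in for the dataframe at the top of the post (the name `df` is assumed)
df = pd.DataFrame(
    {"money_spent": [0.568000, 3.014770, 0.878154, 0.723077, 0.723077,
                     0.309539, 3.032000, 3.014770, 1.584616]},
    index=pd.to_datetime([
        "2014-10-06 18:00:40.063000-04:00", "2014-10-06 18:00:41.361000-04:00",
        "2014-10-06 18:00:42.896000-04:00", "2014-10-06 18:00:43.040000-04:00",
        "2014-10-06 18:00:44.791000-04:00", "2014-10-06 18:00:45.496000-04:00",
        "2014-10-06 18:00:45.799000-04:00", "2014-10-06 18:00:47.470000-04:00",
        "2014-10-06 18:00:48.092000-04:00",
    ], utc=True).tz_convert("US/Eastern"),
)

# Pre-defined sampling points: every 2.5 s for 400 s, starting at 18:00
start_time = pd.Timestamp("2014-10-06 18:00:00", tz="US/Eastern")
end_time = start_time + pd.Timedelta(seconds=400)
my_new_timestamps = pd.date_range(start=start_time, end=end_time, freq="2500ms")

# Keep only rows that fall inside (first timestamp, last timestamp]
in_range = (df.index > my_new_timestamps[0]) & (df.index <= my_new_timestamps[-1])
spend = df.loc[in_range, "money_spent"]

# For each row, find the first pre-defined timestamp that is >= the row's time,
# i.e. the right edge t[i] of the interval (t[i-1], t[i]] containing the row.
positions = my_new_timestamps.searchsorted(spend.index, side="left")

# Sum per interval, then expand to every pre-defined timestamp (0 where empty)
sums = spend.groupby(positions).sum()
resampled = sums.reindex(range(len(my_new_timestamps)), fill_value=0.0)
resampled.index = my_new_timestamps
```

With this, `resampled` has one value per entry of my_new_timestamps, including the stretches before 18:00:40 and after 18:00:50 where the original data has no rows.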