
Say I have a dataframe with several timestamps and values. I would like to measure Δ values / Δt every 2.5 seconds. Does Pandas provide any utilities for time differentiation?

                              time_stamp   values
19492   2014-10-06 17:59:40.016000-04:00  1832128                                
167106  2014-10-06 17:59:41.771000-04:00  2671048                                
202511  2014-10-06 17:59:43.001000-04:00  2019434                                
161457  2014-10-06 17:59:44.792000-04:00  1294051                                
203944  2014-10-06 17:59:48.741000-04:00   867856
Amelio Vazquez-Reina

2 Answers


It most certainly does. First, convert your timestamps into a pandas DatetimeIndex, then use the offset-based resampling available to series/dataframes indexed that way. The pandas time-series documentation covers this, including the offset aliases.

This code should resample your data to 2.5 s intervals:

import pandas as pd

# df is your dataframe
index = pd.DatetimeIndex(df['time_stamp'])
values = pd.Series(df['values'].values, index=index)

# See the offset aliases in the pandas docs; '2500ms' is 2.5 seconds
# resample() needs an aggregation; mean() averages the samples in each bin
resampled_values = values.resample('2500ms').mean()

# Difference between consecutive bins; divide by 2.5 for a per-second rate
resampled_values.diff()

That should do it.
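
For reference, here is a minimal self-contained sketch of the same idea using the five rows from the question. Averaging each bin with `mean()` and the `BIN_SECONDS` constant are assumptions made for illustration; `last()` or another aggregation may suit your data better.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "time_stamp": pd.to_datetime([
        "2014-10-06 17:59:40.016000-04:00",
        "2014-10-06 17:59:41.771000-04:00",
        "2014-10-06 17:59:43.001000-04:00",
        "2014-10-06 17:59:44.792000-04:00",
        "2014-10-06 17:59:48.741000-04:00",
    ]),
    "values": [1832128, 2671048, 2019434, 1294051, 867856],
})

BIN_SECONDS = 2.5  # bin width asked for in the question (illustrative name)

series = df.set_index("time_stamp")["values"]

# Average the samples that fall into each 2.5 s bin; empty bins become NaN.
binned = series.resample("2500ms").mean()

# Change between consecutive bins, expressed per second.
rate = binned.diff() / BIN_SECONDS
print(rate)
```

Note that a bin with no samples yields NaN, and so does any difference that touches it, so you may want to interpolate or forward-fill before calling `diff()`.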

tyleha
  • Thanks @tyleha. I am hoping to use this precise solution with the dataframe above. Here is my attempt: http://stackoverflow.com/questions/26247301/pandas-resampling-values-within-time-window-until-now I need the sampling to be causal, but it isn't. – Amelio Vazquez-Reina Oct 08 '14 at 00:28
  • Is `df.diff().diff()` equivalent to finding the 2nd derivative (approximately)? – LondonRob Mar 24 '16 at 15:06
  • Sure. But you'd better know your Δt to keep your units useful. 2.5^2 s^2 would be whack. – tyleha Mar 24 '16 at 21:04
  • Does this apply when there are multiple columns in your data frames? All column data was sampled under the same rate. – bud Oct 21 '16 at 16:48

If you really want the time derivative, then you also need to divide by the time difference (delta time, dt) since the last sample.

An example:

dti = pd.DatetimeIndex([
    '2018-01-01 00:00:00',
    '2018-01-01 00:00:02',
    '2018-01-01 00:00:03'])

X = pd.DataFrame({'data': [1,3,4]}, index=dti)

X.head()
                    data
2018-01-01 00:00:00 1
2018-01-01 00:00:02 3
2018-01-01 00:00:03 4

You can find the time delta by wrapping the DatetimeIndex in a Series and calling diff() on it. This gives you a series of time deltas. You only need the values in seconds, though:

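# total_seconds() keeps fractional seconds; .dt.seconds would truncate them
dt = pd.Series(X.index).diff().dt.total_seconds().values

# Divide each row's change in value by the elapsed seconds
dXdt = X.diff().div(dt, axis=0)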

dXdt.head()
                    data
2018-01-01 00:00:00 NaN
2018-01-01 00:00:02 1.0
2018-01-01 00:00:03 1.0

As you can see, this approach takes into account that there are two seconds between the first two values, and only one between the last two values. :)
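
The same calculation can be sketched for samples that are less than a second apart. The timestamps and values below are made up for illustration. `.dt.total_seconds()` is used here because `.dt.seconds` returns only the whole-seconds component of each time delta, which would be 0 for a 0.5 s gap.

```python
import pandas as pd

# Hypothetical, irregularly spaced samples with millisecond timestamps.
dti = pd.DatetimeIndex([
    "2018-01-01 00:00:00.000",
    "2018-01-01 00:00:00.500",
    "2018-01-01 00:00:02.000",
])
X = pd.DataFrame({"data": [1.0, 2.0, 5.0]}, index=dti)

# Elapsed time since the previous sample, in fractional seconds.
dt = X.index.to_series().diff().dt.total_seconds()

# Finite-difference derivative: change in value divided by elapsed seconds.
dXdt = X.diff().div(dt, axis=0)
print(dXdt)
```

Because `dt` shares the DataFrame's index, `div(..., axis=0)` aligns row by row, and the same line works unchanged when `X` has several columns.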

ViggoTW
  • I think it should be `pd.Series(df.index).diff().dt.total_seconds()` (try with time with millisecond resolution) – scls Jun 04 '19 at 20:32