You can fill in your missing times using pandas.DataFrame.resample
. The caveat is that the dataframe needs to have a pandas.DateTimeIndex
. In your case the time is likely stored as a float in seconds since epoch, and this needs to be converted prior to re-sampling. Here is a function which will perform that operation.
Code:
import datetime as dt
import pandas as pd
def resample(dataframe, time_column, sample_period):
# make a copy of the dataframe
dataframe = dataframe.copy()
# convert epoch times to datetime
dataframe.time = dataframe.time.apply(
lambda ts: dt.datetime.fromtimestamp(ts))
# make the datetimes into an index
dataframe.set_index(time_column, inplace=True)
# resample to desired period
dataframe = dataframe.resample(sample_period).asfreq().reset_index()
# convert datetimes back to epoch
epoch = dt.datetime.fromtimestamp(0)
dataframe.time = dataframe.time.apply(
lambda ts: (ts - epoch).total_seconds())
return dataframe
Test Code:
values = [
(1488771900.10, 'a'),
(1488771900.20, 'b'),
(1488771900.30, 'c'),
(1488771900.60, 'f'),
]
columns = ['time', 'value']
df = pd.DataFrame(values, columns=columns)
print(df)
new_df = resample(df, 'time', '100ms')
print(new_df)
Results:
time value
0 1.488772e+09 a
1 1.488772e+09 b
2 1.488772e+09 c
3 1.488772e+09 f
time value
0 1.488772e+09 a
1 1.488772e+09 b
2 1.488772e+09 c
3 1.488772e+09 NaN
4 1.488772e+09 NaN
5 1.488772e+09 f