Let's say I have a dataframe df with a column called time that holds a timestamp in seconds (plus some other columns). It basically represents a time series, but with irregular spacing between samples. I'd like to extract rows such that consecutive selected rows are at least 5 seconds apart, as in my example below. I was wondering whether there is a more vectorized way to do this.
Is there a more elegant way that works without resorting to this rather verbose loop?
It doesn't matter if there is an offset at the start, and the 5 seconds are just an arbitrary number.
import pandas as pd
import numpy as np
N = 100
time = np.arange(0, N, 2)
time = time + np.random.random(len(time))
df = pd.DataFrame(time, columns=('time',))  # assume df has more than one column
print(df)
last = 0
mask = []
for i in range(len(df)):
    if df['time'][i] > last + 5:  # find first entry after at least 5 seconds
        last = df['time'][i]
        mask.append(True)
    else:
        mask.append(False)
print(df.loc[mask])
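
For comparison, here is one possible sketch of a shorter variant. Each kept row depends on the previously kept one, so it still loops, but only once per kept row: np.searchsorted jumps straight to the next candidate. It assumes the time column is sorted in ascending order, and min_spacing_mask is a hypothetical helper name used only for illustration, not a pandas or NumPy function.

```python
import numpy as np
import pandas as pd


def min_spacing_mask(times, gap, start=0.0):
    """Return a boolean mask selecting rows spaced more than `gap` apart.

    Hypothetical helper; assumes `times` is sorted in ascending order.
    """
    if gap < 0:
        raise ValueError("gap must be non-negative")
    times = np.asarray(times, dtype=float)
    mask = np.zeros(len(times), dtype=bool)
    threshold = start + gap
    # side="right" gives the first index whose value is strictly greater
    # than the threshold, matching `time > last + gap` in the loop version.
    i = np.searchsorted(times, threshold, side="right")
    while i < len(times):
        mask[i] = True
        threshold = times[i] + gap
        i = np.searchsorted(times, threshold, side="right")
    return mask


# Small made-up example with an extra column
df = pd.DataFrame({
    "time": [0.5, 2.3, 4.1, 6.2, 8.0, 10.9, 12.1, 14.5, 16.7],
    "value": range(9),
})
print(df.loc[min_spacing_mask(df["time"].to_numpy(), gap=5)])
```

With this made-up data the rows at 6.2 and 12.1 seconds are kept. The number of Python-level iterations equals the number of kept rows rather than the number of rows in df, which may help when the spacing is large relative to the sampling interval.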