I have a dataframe with a datetime column whose values are irregularly spaced. They are usually about 1 minute apart, but sometimes there is a gap of a day or more (with no empty rows in between). Wherever the diff of the datetime column is larger than a certain value, I need to add the missing rows at a frequency of 1 minute.

I can only find answers rewriting the index (with reindex), but I can't do that as the points aren't evenly spaced.

simonblaha
  • Did you check `resample`? – Atanas Atanasov Feb 25 '23 at 20:20
  • "where the diff of the datetime column is larger than a certain value." - this is not very clear. Please [edit] your post and include a small sample of your dataframe (you can run something like `df.head(10).to_dict()` and paste the results; also see [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391)), as well as the desired output and your attempted solution. Btw, welcome to Stack Overflow. Please take the [tour](https://stackoverflow.com/tour) and check out [how to ask good questions](https://stackoverflow.com/help/how-to-ask). – AlexK Feb 25 '23 at 22:19

1 Answer

Let's say you have this dataframe:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'datetime': pd.date_range(start='2022-01-01', end='2022-01-03', freq='1min')
})

which covers two days, with each row one minute apart. Now let's introduce some gaps by inserting rows that are one hour apart:

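# Hourly rows from 07:00 to 12:00 on 2022-01-03 (.iloc[1:] drops the 06:00 row)
gap_rows = pd.DataFrame({
    'datetime': pd.date_range(start='2022-01-03 06:00:00', end='2022-01-03 12:00:00', freq='H')
}).iloc[1:]
df = pd.concat([df.iloc[:3], gap_rows, df.iloc[3:]])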

To fill these gaps, resample at the frequency you are after; `ffill()` carries the previous row's values forward into the new rows:

df = df.set_index('datetime').resample('1min').ffill().reset_index()
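
Note that `resample` puts every row on a regular one-minute grid, so the original irregular timestamps are not kept. If you want to keep them and only add rows inside gaps above a threshold, as the question describes, something along these lines might work. This is only a sketch: the function name `fill_large_gaps`, the `threshold` and `freq` parameters, and the sample data are made up for illustration, and the inserted rows get NaN in the other columns (you could forward-fill them afterwards if that suits your data).

```python
import pandas as pd

def fill_large_gaps(df, col="datetime", threshold="5min", freq="1min"):
    """Insert rows every `freq` inside gaps longer than `threshold`.

    Hypothetical helper: existing timestamps are left untouched, and the
    inserted rows have NaN in every column other than `col`.
    """
    df = df.sort_values(col).reset_index(drop=True)
    step = pd.Timedelta(freq)

    # Time elapsed since the previous row; the first row has NaT.
    gap = df[col].diff()
    # Positions of the rows that come right after a large gap.
    gap_ends = df.index[gap > pd.Timedelta(threshold)]

    new_times = []
    for i in gap_ends:
        start, end = df.at[i - 1, col], df.at[i, col]
        # Timestamps strictly between the two existing rows, counted from `start`.
        rng = pd.date_range(start + step, end, freq=freq)
        new_times.extend(rng[rng < end])

    if not new_times:
        return df

    filler = pd.DataFrame({col: new_times})
    out = pd.concat([df, filler], ignore_index=True)
    return out.sort_values(col).reset_index(drop=True)


# Made-up sample: roughly one-minute spacing with a single 5-minute gap.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2022-01-01 00:00:10",
        "2022-01-01 00:01:05",
        "2022-01-01 00:02:20",
        "2022-01-01 00:07:20",  # 5 minutes after the previous row
        "2022-01-01 00:08:00",
    ]),
    "value": [1.0, 2.0, 3.0, 4.0, 5.0],
})

filled = fill_large_gaps(df, threshold="2min")
print(filled)
```

The new rows are counted from the timestamp just before each gap, so they will not necessarily fall on whole minutes.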