I have a python dataframe with hourly values for Jan 2015 except some hours are missing the index and values both. Ideally the dataframe with columns named "dates" and "values" should have 744 rows in it. However, it has randomly missing 10 hours and hence has only 734 rows. I want to interpolate for missing hours in the month to create the desired dataframe with 744 "dates" and 744 "values".
Edit:
I am new to python so I am struggling with implementing this idea:
- Create a dataframe with first column as all hours in Jan 2015
- Create the second column of same size as first of NANs
- Fill the second column with available values hence the missing hours have NANs in them
- Use the panda interpolate funtion
Edit2:
I was looking for hint for code snippets. Based on suggestion below I was able to create the following code but it fails to fill in the values which are zeros at the start of the month i.e. for hours 1 through 5 on Jan 1.
import panda as pd
st_dt = '2015-01-01'
en_dt = '2015-01-31'
DateTimeHour = pd.date_range( pd.Timestamp( st_dt ).date(), pd.Timestamp(
en_dt ).date(), freq='H')
Pwr.index = pd.DatetimeIndex(Pwr.index) #Pwr is the original dataframe
Pwr = Pwr.reindex( DateTimeHour, fill_value = 0 )
Pwr2 = pd.Series( Pwr.values )
Pwr2.interpolate( imit_direction='both' )