I cannot find a solution for this problem. I would like to add future dates to a datetime-indexed pandas DataFrame for model prediction purposes.

Here is where I am right now:

new_datetime = df2.index[-1:] # current end of datetime index
increment = '1 days' # string for increment - eventually will be in a for loop to add add'l days
new_datetime = new_datetime+pd.Timedelta(increment)

And this is where I am stuck. The append examples online always seem to use ignore_index=True, and in my case I want to keep the proper datetime index.

Clay
  • could be worth a look: [add-one-row-to-pandas-dataframe](https://stackoverflow.com/questions/10715965/add-one-row-to-pandas-dataframe) – FObersteiner Nov 30 '20 at 15:27

3 Answers

1

Suppose you have this df:

                 date  value
0  2020-01-31 00:00:00      1
1  2020-02-01 00:00:00      2
2  2020-02-02 00:00:00      3

then an alternative for adding future days is

df.append(pd.DataFrame({'date': pd.date_range(start=df.date.iloc[-1], periods=6, freq='D', closed='right')}))

which returns

                 date  value
0  2020-01-31 00:00:00    1.0
1  2020-02-01 00:00:00    2.0
2  2020-02-02 00:00:00    3.0
0  2020-02-03 00:00:00    NaN
1  2020-02-04 00:00:00    NaN
2  2020-02-05 00:00:00    NaN
3  2020-02-06 00:00:00    NaN
4  2020-02-07 00:00:00    NaN

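where freq='D' means a daily frequency and periods=6 generates six dates starting from the last existing one. Because closed='right' drops that start date, five new rows are added.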
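DataFrame.append was removed in pandas 2.0, and the closed argument of date_range was replaced by inclusive in pandas 1.4. On a recent pandas the same idea can be sketched roughly as below; this assumes pandas 1.4 or later, and the small frame is rebuilt here only so the snippet runs on its own.

```python
import pandas as pd

# Example frame from the answer above, rebuilt so the snippet is self-contained.
df = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-31", "2020-02-01", "2020-02-02"]),
    "value": [1, 2, 3],
})

# Six daily dates starting at the last existing date; inclusive="right"
# drops that start date, leaving five future dates.
future = pd.DataFrame({
    "date": pd.date_range(start=df["date"].iloc[-1], periods=6,
                          freq="D", inclusive="right")
})

# pd.concat plays the role of the removed DataFrame.append.
# The new rows have no 'value', so that column is filled with NaN.
extended = pd.concat([df, future])
print(extended)
```

As in the output above, the new rows keep their own 0..4 labels, so the integer index contains duplicates.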

  • This works fine as long as I use a datetime field as a separate column from the index. I was hoping to use the datetime index format *as* my index for forecasting. The example works, but uses the integer instead of date indexing for the rows. – Clay Nov 30 '20 at 18:35
0

I think I was making this more difficult than necessary because I was using a datetime index instead of the typical integer index. By leaving the 'date' field as a regular column instead of an index, adding the rows is straightforward.

One thing I did do was add a reset_index call so I did not end up with wonky duplicate index values:

df = df.append(pd.DataFrame({'date': pd.date_range(start=df.date.iloc[-1], periods=21, freq='D', closed='right')}))
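df = df.reset_index(drop=True) # renumber rows from 0; drop=True discards the old duplicated labels instead of keeping them as an 'index' column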
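For anyone on pandas 2.x, where DataFrame.append no longer exists, a possible one-step variant is to let pd.concat renumber the rows via ignore_index=True. This is only a sketch: the three-row frame is a stand-in for the real data, and the start date is shifted by one day so that periods=20 yields the same 20 future dates as above.

```python
import pandas as pd

# Stand-in data with a plain integer index and a 'date' column.
df = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-31", "2020-02-01", "2020-02-02"]),
    "value": [1, 2, 3],
})

# 20 future days, starting the day after the last existing date.
future_dates = pd.date_range(
    start=df["date"].iloc[-1] + pd.Timedelta(days=1), periods=20, freq="D"
)
future = pd.DataFrame({"date": future_dates})

# ignore_index=True gives a fresh 0..n-1 index, so no separate reset is needed.
df = pd.concat([df, future], ignore_index=True)
print(df.tail())
```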
Clay
0

I also needed this, and I solved it by merging the code you shared with the code from another answer ("add to a dataframe as I go with datetime index"). I ended up with the following code, which works for me:

data=raw.copy()
new_datetime = data.index[-1:] # current end of datetime index
increment = '1 days' # string for increment - eventually will be in a for loop to add add'l days
new_datetime = new_datetime+pd.Timedelta(increment)
today_df = pd.DataFrame({'value': 301.124},index=new_datetime)
data = data.append(today_df)
data.tail()

Here, 'value' is the name of the column in your own DataFrame.
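If several future days are needed at once and the DatetimeIndex should be kept, one option that avoids the removed DataFrame.append is to reindex onto an extended index. A minimal sketch, assuming a daily index; the sample frame and the horizon of 5 days are made up for illustration, while 301.124 is the value used above.

```python
import pandas as pd

# Hypothetical frame with a daily DatetimeIndex and one 'value' column.
data = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0]},
    index=pd.date_range("2020-01-31", periods=3, freq="D"),
)

horizon = 5  # assumed number of future days to add

# Future dates start one day after the current end of the index.
future_index = pd.date_range(
    start=data.index[-1] + pd.Timedelta(days=1), periods=horizon, freq="D"
)

# reindex keeps the existing rows and adds the new dates with NaN values.
extended = data.reindex(data.index.union(future_index))

# A prediction can then be written to a future date by label.
extended.loc[future_index[0], "value"] = 301.124
print(extended.tail())
```

The result is still indexed by date, so label-based selection such as extended.loc["2020-02-03"] keeps working.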