I'm building a linear regression model for a time series, using the lagged variables (x_t-1, x_t-2, x_t-3, ..., x_t-k) as features, where k is a parameter I can set.
So the target variable is x_t (or x_t-0) and the predictors are the k lagged variables. I have trained a regression model and checked its predictions against a test set.
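For concreteness, here is a minimal sketch of how a lagged dataframe like this can be built with pandas `shift`. The function name `make_lagged`, the column names (`x_t`, `x_t-1`, ...) and the toy series are illustrative assumptions, not the actual data.

```python
import numpy as np
import pandas as pd


def make_lagged(series: pd.Series, k: int) -> pd.DataFrame:
    """Return a frame with the target x_t and its k lags x_t-1 ... x_t-k."""
    cols = {"x_t": series}
    for lag in range(1, k + 1):
        # shift(lag) moves values down, so each row sees the value from `lag` steps earlier
        cols[f"x_t-{lag}"] = series.shift(lag)
    # The first k rows have incomplete lags, so drop them.
    return pd.DataFrame(cols).dropna()


# Toy series on a daily index (made-up values for illustration only).
idx = pd.date_range("2024-01-01", periods=10, freq="D")
series = pd.Series(np.arange(10, dtype=float), index=idx)
lagged = make_lagged(series, k=3)
```

Each row of `lagged` holds the target in the first column and the k predictors, newest lag first, in the remaining columns.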
Now I wish to make forecasts for data that are not in my time series, starting from the last date of this time series until a given date.
To achieve this, I thought of building a new data frame where the first row is the last row of my lagged dataframe, and then I need to add new rows following this logic:
In row 1: x(t) is the new predicted value; x(t-1), ..., x(t-k) are the same as in the last row of the lagged dataframe.
In row 2: x(t) is the new predicted value using row 1; x(t-1) is x(t) from row 1, x(t-2) is x(t-1) from row 1, etc.
In row 3: x(t) is the new predicted value using row 2; x(t-1) is x(t) from row 2, x(t-2) is x(t-1) from row 2, etc.
This process repeats for each of the n_days to forecast, so at the end the new dataframe has n_days rows.
I couldn't come up with pandas code to do this: I always end up repeating the same values in each row. I would really appreciate some help.
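The row-by-row logic described above can be sketched as a loop that keeps a rolling window of the k most recent values and feeds each prediction back in as the newest lag. This is only a sketch under stated assumptions: `recursive_forecast`, its arguments, and the fixed-weight `predict` stand-in are hypothetical names introduced for illustration, the data is daily, and the first forecast row uses the k most recent observed values as its lags (newest first), which is consistent with how rows 2 and 3 are described. With a fitted scikit-learn style model, `model.predict` could be passed in place of the stand-in.

```python
import numpy as np
import pandas as pd


def recursive_forecast(predict, last_values, n_days, last_date):
    """Forecast n_days ahead, feeding each prediction back in as a lag.

    predict     : callable taking a (1, k) array [x_t-1, ..., x_t-k]
    last_values : the k most recent observations, newest first
    last_date   : last date of the observed series (daily data assumed)
    """
    k = len(last_values)
    window = [float(v) for v in last_values]
    rows = []
    for _ in range(n_days):
        lags = np.array(window).reshape(1, -1)
        y_hat = float(np.ravel(predict(lags))[0])
        # Build a fresh list per row; reusing one mutable row object is a
        # common way to end up with identical values in every row.
        rows.append([y_hat] + window)
        # Shift the window: the prediction becomes x_t-1, the oldest lag drops out.
        window = [y_hat] + window[:-1]
    cols = ["x_t"] + [f"x_t-{i}" for i in range(1, k + 1)]
    # Forecast dates start the day after last_date.
    index = pd.date_range(last_date, periods=n_days + 1, freq="D")[1:]
    return pd.DataFrame(rows, columns=cols, index=index)


# Stand-in for a trained model: fixed weights, purely illustrative.
weights = np.array([0.5, 0.3, 0.2])
predict = lambda X: X @ weights

forecast = recursive_forecast(predict, last_values=[3.0, 2.0, 1.0],
                              n_days=3, last_date="2024-01-10")
```

In the resulting `forecast` frame, the `x_t-1` value of each row equals the `x_t` value of the row before it, which is the behaviour described for rows 2 and 3.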