I'm building a linear regression model for a time series, using the lagged values x(t-1), x(t-2), x(t-3), ..., x(t-k) as features, where k is a parameter I can set.

So the target variable is x(t) (or x(t-0)) and the predictors are the k lagged variables. I have trained a regression model and checked its predictions against a test set.
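For concreteness, here is a minimal sketch of how such a lagged frame could be built with pandas. The helper name `make_lagged`, the column names `x_t`, `x_t-1`, ... and the toy series are assumptions made for illustration, not my actual data.

```python
import pandas as pd

def make_lagged(series: pd.Series, k: int) -> pd.DataFrame:
    """Return a frame with the target 'x_t' and predictors 'x_t-1' ... 'x_t-k'."""
    cols = {"x_t": series}
    for lag in range(1, k + 1):
        # shift(lag) moves values down, so each row sees the value from `lag` steps earlier
        cols[f"x_t-{lag}"] = series.shift(lag)
    # The first k rows have incomplete history, so they are dropped.
    return pd.DataFrame(cols).dropna()

# Toy daily series (hypothetical data)
s = pd.Series(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    index=pd.date_range("2020-01-01", periods=6, freq="D"),
)
lagged = make_lagged(s, k=2)
```

The model is then fitted on the `x_t-1` ... `x_t-k` columns with `x_t` as the target.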

Now I wish to make forecasts for dates beyond the end of my time series, starting from the last date of the series until a given date.

To achieve this, I thought of building a new data frame where the first row is the last row of my lagged dataframe, and then I need to add new rows following this logic:

In row 1: x(t) is the new predicted value, and x(t-1), ..., x(t-k) are the same as in the last row of the lagged dataframe.

In row 2: x(t) is the new predicted value using row 1, x(t-1) is x(t) from row 1, x(t-2) is x(t-1) from row 1, etc.

In row 3: x(t) is the new predicted value using row 2, x(t-1) is x(t) from row 2, x(t-2) is x(t-1) from row 2, etc.

And repeat this process for n_days to forecast, so at the end in the new dataframe I would have n rows.
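The process described here is usually called recursive (iterated) multi-step forecasting. Below is a rough sketch of one way it might look, under some assumptions: the function name `recursive_forecast`, the `predict` callable and the toy rule inside it are hypothetical stand-ins for the trained model, and the first window is taken from the last k observed values. Rows are collected in a plain list and the dataframe is built once at the end, which avoids repeating the same values in every row.

```python
import numpy as np
import pandas as pd

def recursive_forecast(predict, history, k, n_days, freq="D"):
    """Forecast n_days ahead, feeding each prediction back in as the newest lag.

    predict: callable taking a 1-D array [x_t-1, ..., x_t-k] and returning a number
    history: pd.Series with a DatetimeIndex holding the observed values
    """
    # Most recent value first, matching the column order x_t-1 ... x_t-k.
    window = [float(v) for v in history.iloc[-k:][::-1]]
    rows = []
    for _ in range(n_days):
        y_hat = float(predict(np.asarray(window)))
        rows.append([y_hat] + window)
        # Shift the window: the prediction becomes x_t-1, the oldest lag drops out.
        window = [y_hat] + window[:-1]
    # Future dates start one step after the last observed date.
    index = pd.date_range(history.index[-1], periods=n_days + 1, freq=freq)[1:]
    columns = ["x_t"] + [f"x_t-{i}" for i in range(1, k + 1)]
    return pd.DataFrame(rows, index=index, columns=columns)

# Toy example: a hand-written rule stands in for the fitted regression.
history = pd.Series(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    index=pd.date_range("2020-01-01", periods=6, freq="D"),
)
toy_predict = lambda lags: 2 * lags[0] - lags[1]  # hypothetical model
forecast = recursive_forecast(toy_predict, history, k=2, n_days=3)
```

With a fitted scikit-learn style estimator, `predict` could be something like `lambda lags: model.predict(lags.reshape(1, -1))[0]`, assuming the model was trained on the lag columns in the same order.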

I couldn't come up with pandas code to do this: I keep repeating the same values in each row, so I would really appreciate some help.

wwnde
Souames
  • Show us "some" code even if does not work please. Along with small representative subset of your input data and desired output (by hand). – k1m190r Apr 19 '20 at 19:14
  • It sounds like you are trying to do something like ARIMA. You might want to read more about that. Perhaps this question helps? https://stackoverflow.com/questions/22770352/auto-arima-equivalent-for-python – mcskinner Apr 19 '20 at 19:52

0 Answers