How to derive the equation for a non-linear time series regression model built in R?

Question

I've built a non-linear time series regression model in R that I would like to write down as an equation, so that I can back-test against my data in an Excel spreadsheet. I've created a .ts object and created a model using the tslm function, as shown below:

model16 <- tslm(production ~ date + I(date^2) + I(date^3) +
                   I(temp_neg_32^3) +
                   I(humidity_avg^3) +
                   I(dew_avg^3) +
                  below_freezing_min, 
                data = production_temp_no_outlier.ts)

I find the coefficients for each variable in the model by using the following code:

summary(model16)

The output is below:

So, my understanding is that the equation of my model should be:

y = -7924000000 + 1268000*date - 67.62*(date^2) + 0.001202*(date^3) +
    0.04395*(temp_neg_32^3) + 0.008658*(humidity_avg^3) - 0.03762*(dew_avg^3) - 11930*below_freezing_min

However, whenever I plug the data into this equation, the output is just completely off - it has nothing in common with the fitted curve visualization that I build in R based on this model. So I am clearly doing something wrong. I will be very grateful if someone could help point out my errors!

It would help if you can [make this question reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) by including some or all of the data `production_temp_no_outlier.ts` in plain text format. Also include all the relevant code: _e.g._ the library (forecast?) which contains the `tslm` function. — neilfws, Feb 07 '22 at 22:45

score 1 · Accepted Answer · edited Feb 08 '22 at 15:19

Regression of this kind doesn't give you an exact fit; it gives you the line of best fit. What is your model's coefficient of determination (also known as explained variance, or R^2)?
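As a rough sketch of how to check that number, the snippet below reads R^2 from `summary()` and recomputes it by hand. It uses a hypothetical stand-in model (`fit_demo`, fitted with `lm` on the built-in `mtcars` data), not the asker's data; the assumption is that `summary()` on a `tslm` fit exposes `r.squared` in the same way.

```r
# Hypothetical stand-in model on the built-in mtcars data (not the asker's data)
fit_demo <- lm(mpg ~ wt + I(wt^2), data = mtcars)

# R^2 as reported by summary()
r2_summary <- summary(fit_demo)$r.squared

# The same quantity by hand: 1 - (residual sum of squares / total sum of squares)
rss <- sum(residuals(fit_demo)^2)
tss <- sum((mtcars$mpg - mean(mtcars$mpg))^2)
r2_manual <- 1 - rss / tss

print(c(summary = r2_summary, manual = r2_manual))
```

An R^2 well below 1 means the equation will miss individual observations even when it is written down correctly.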

Take a look at this set of data (somewhat modeled after your example).

library(forecast)
library(tidyverse)

data("us_change", package = "fpp3")

fit <-  tslm(Production~Savings + I(Savings^2) + I(Savings^3) + I(Income^3) + Unemployment,
             data = ts(us_change))
summary(fit)

Here I extracted the coefficients, so I can show you a bit more of what I mean. Then I created a function that calculates the outcome of the regression equation.

cFit <- coefficients(fit)
#   (Intercept)       Savings  I(Savings^2)  I(Savings^3)   I(Income^3)  Unemployment 
#  5.221684e-01  6.321979e-03 -2.472784e-04 -6.376422e-06  7.029079e-03 -3.144743e+00  


regFun <- function(cFit, data){
  # Evaluate the regression equation: every term, including Unemployment,
  # is multiplied by its own coefficient. with() avoids attach()/detach().
  with(data,
       cFit[[1]] + cFit[[2]] * Savings + cFit[[3]] * Savings^2 +
         cFit[[4]] * Savings^3 + cFit[[5]] * Income^3 + cFit[[6]] * Unemployment)
}

Here are some examples of the predicted outcome versus the actual outcome.

fitOne <- regFun(cFit, us_change[1,])
fitOne   # predicted value for row 1

us_change[1,]$Production
# [1] -2.452486 

fitTwo <- regFun(cFit, us_change[2,])
fitTwo   # predicted value for row 2

us_change[2,]$Production
# [1] -0.5514595 

fitThree <- regFun(cFit, us_change[3,])
fitThree # predicted value for row 3

us_change[3,]$Production
# [1] -0.3586518

You can tell from the gap between the predicted and actual values that the production volume is not explained perfectly by the inputs I provided; that gap is the residual.
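One way to tell a mistake in the hand-written equation apart from ordinary residual error is to compare the equation against `fitted()` rather than against the raw data. The sketch below does this for a hypothetical stand-in model (`fit_demo` on the built-in `mtcars` data, chosen only so the snippet runs on its own): the hand-written equation should reproduce the fitted values to within floating-point tolerance, and whatever is left over against the actual response is the residual.

```r
# Hypothetical stand-in model with polynomial terms, loosely like the one in the question
fit_demo <- lm(mpg ~ wt + I(wt^2) + I(hp^3), data = mtcars)
b <- coef(fit_demo)  # full-precision coefficients, not the rounded summary() table

# The regression equation written out term by term
by_hand <- b[["(Intercept)"]] +
  b[["wt"]]      * mtcars$wt +
  b[["I(wt^2)"]] * mtcars$wt^2 +
  b[["I(hp^3)"]] * mtcars$hp^3

# The equation should match R's fitted values, not the observed data
matches_fitted <- isTRUE(all.equal(by_hand, unname(fitted(fit_demo))))
print(matches_fitted)

# The difference from the observed response is the residual
resid_by_hand <- mtcars$mpg - by_hand
print(head(resid_by_hand, 3))
```

If `matches_fitted` is `FALSE` for your own model, the equation itself has a missing term or a wrong coefficient; if it is `TRUE`, the remaining gap is just the model's residual error.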

Now look at what happens when I graph this:

plt <- ggplot(data = us_change %>% 
                mutate(Regression = regFun(cFit, us_change)),
       aes(x = Production)) +
  geom_point(aes(y = Savings, color = "Savings")) +
  geom_point(aes(y = Savings^2, color = "Savings^2")) + 
  geom_point(aes(y = Savings^3, color = "Savings^3")) +
  geom_point(aes(y = Income^3, color = "Income^3")) +
  geom_point(aes(y = Unemployment, color = "Unemployment")) +
  geom_line(aes(y = Regression, color = "Regression")) +  # regression line
  scale_color_viridis_d(end = .8) + theme_bw()

plotly::ggplotly(plt)

The regression equation output is the black line. It's the best fit, but there are values that are not represented all that well.

If you look closer, it's not a straight line either.
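A separate thing worth checking when copying coefficients into a spreadsheet: `summary()` prints coefficients to about 4 significant digits by default, and a cubic term in a large predictor magnifies that rounding. The sketch below is purely illustrative. The full-precision coefficient `b3_full` and the predictor value `date_value` are hypothetical numbers, chosen only to resemble the scale of the `date^3` term in the question.

```r
# Hypothetical full-precision coefficient and its 4-significant-digit rounding
b3_full    <- 0.0012024567
b3_rounded <- signif(b3_full, 4)

# Hypothetical predictor value on the scale of a numeric date
date_value <- 19000

# Contribution of the cubic term with each version of the coefficient
term_full    <- b3_full    * date_value^3
term_rounded <- b3_rounded * date_value^3

# Error introduced purely by rounding the coefficient
rounding_error <- term_full - term_rounded
print(rounding_error)

# Printing with more digits gives values that are safer to paste elsewhere
print(format(b3_full, digits = 15))
```

Under these assumed numbers the rounding alone shifts the prediction by millions, so taking the values from `coef(model16)` (printed with `digits = 15` or similar) instead of the summary table is a cheap thing to try.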
