I was experimenting with linear regression models in R, specifically log-transforming my variables and then making predictions from the fitted model. I ran into a fairly minor issue, but I'm curious about what is happening. For simplicity, say I have one predictor and a response. I take the log of both, but I set up the model in two different ways:
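# Model 1: logs taken inside the formula
m1 <- lm(log(response) ~ log(variable))
# Model 2: logs taken beforehand and stored under new names
log_response <- log(response)
log_variable <- log(variable)
m2 <- lm(log_response ~ log_variable)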
Both model summaries report the same estimates (apart from the variable names), so I assumed the two models were equivalent. However, when I try to make a prediction, m2 fails.
newdata <- data.frame(variable = 2)
predict(m1, newdata, interval = "prediction")
predict(m2, newdata, interval = "prediction")
With that, m1 returns a prediction as expected, but m2 returns the following error:
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
  variable lengths differ (found for 'log_variable')
In addition: Warning message:
'newdata' had 1 row but variables found have 805 rows
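To make this reproducible, here is a minimal sketch with simulated data. The seed, the coefficients, and the noise level are made up for illustration and are not my real data; only the sample size of 805 is taken from the warning above. It fits both versions, checks that the estimates agree, and records whatever condition the m2 prediction signals. With a single simulated predictor, the second call may surface only the warning rather than the full error.

```r
# Simulated data: values are made up for illustration only.
set.seed(1)
n <- 805
variable <- runif(n, min = 1, max = 10)
response <- exp(0.5 + 1.2 * log(variable) + rnorm(n, sd = 0.1))

# Model 1: logs taken inside the formula.
m1 <- lm(log(response) ~ log(variable))

# Model 2: logs taken beforehand and stored under new names.
log_response <- log(response)
log_variable <- log(variable)
m2 <- lm(log_response ~ log_variable)

# The estimates agree once the differing coefficient names are dropped.
same_fit <- isTRUE(all.equal(unname(coef(m1)), unname(coef(m2))))

newdata <- data.frame(variable = 2)

# m1 finds `variable` in newdata and returns a single row.
p1 <- predict(m1, newdata, interval = "prediction")

# m2 is wrapped so that whatever warning or error it signals is kept as text.
p2_condition <- tryCatch(
  {
    predict(m2, newdata, interval = "prediction")
    NA_character_
  },
  warning = function(w) conditionMessage(w),
  error = function(e) conditionMessage(e)
)

print(same_fit)
print(p1)
print(p2_condition)
```

Here `same_fit`, `p1`, and `p2_condition` are just names I chose for this sketch.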
Am I making some mistake in creating the log variables?