
This is my first assignment on multiple linear regression in R. I have two datasets: a training dataset and a testing dataset. I have fitted a model (logmodel, with Multiple R-squared: 0.7904, which unfortunately doesn't satisfy the normality and homoscedasticity assumptions), and the aim is to predict the total bike rentals (the 18th column in both datasets). Here is my model:

model <- lm(formula = cnt ~ yr + hr + weathersit + temp + hum, data = databikecleaned)
w <- 1 / (lm(abs(model$residuals) ~ model$fitted.values)$fitted.values^2)
logmodel <- lm(formula = log(cnt) ~ yr + hr + weathersit + temp + hum + I(hum^2) + I(temp^2), weights = w, data = databikecleaned)

How can I make predictions on the test dataset, and how do I test whether the predictions my model makes are good? I am a beginner, so my questions may be silly. I tried this function, but an error occurs:

predict <- predict(logmodel,test_data)

"Error in hum^2 : non-numeric argument to binary operator"

rmse(predict, test_data[,18])
Josef
ArgyGr
  • To get great answers quickly, make your question [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Which function causes the error? `predict()` or `rmse`? – Kat Jan 03 '22 at 20:58
  • Could you provide your input data `databikecleaned` with `dput`? Possibly it's a missing/NA value in the `hum` column. Also, it makes your question reproducible. – JKupzig Jan 04 '22 at 08:09
  • How can I make my question reproducible? The error occurs in predict(); of course rmse doesn't run either, because of the error in predict. Is there another way to make predictions from my logmodel for the values of the 18th column in the second dataset (the test dataset)? – ArgyGr Jan 04 '22 at 09:17
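
One possible cause of the "non-numeric argument to binary operator" error, offered here only as a guess, is that `hum` is stored as text in `test_data` while it is numeric in the training data. The sketch below uses synthetic stand-in data (the `make_data` helper, the `train` data frame, and the reduced formula are all hypothetical, not the bike-sharing files from the question) to show how one might compare column types, convert the column, predict, and compute an RMSE by hand. Because the model is fitted on `log(cnt)`, the predictions are back-transformed with `exp()` before being compared with the observed counts.

```r
# Synthetic stand-ins for the training and test sets (hypothetical data,
# not the files from the question).
set.seed(1)
make_data <- function(n) {
  temp <- runif(n)
  hum  <- runif(n)
  cnt  <- round(exp(1 + 2 * temp - hum + rnorm(n, sd = 0.3))) + 1
  data.frame(temp = temp, hum = hum, cnt = cnt)
}
train     <- make_data(200)
test_data <- make_data(50)

# Mimic the suspected problem: hum was read in as text in the test set.
test_data$hum <- as.character(test_data$hum)

# 1. Compare column types; a mismatch would explain the failure in hum^2.
sapply(train[c("temp", "hum")], class)
sapply(test_data[c("temp", "hum")], class)

# 2. Convert the column back to numeric.
#    (For a factor column, use as.numeric(as.character(x)) instead.)
test_data$hum <- as.numeric(test_data$hum)

# 3. Fit on log(cnt), predict on the test set, back-transform with exp().
fit      <- lm(log(cnt) ~ temp + hum + I(temp^2) + I(hum^2), data = train)
pred_cnt <- exp(predict(fit, newdata = test_data))

# 4. Root mean squared error on the original scale of cnt.
rmse_value <- sqrt(mean((test_data$cnt - pred_cnt)^2))
rmse_value
```

Storing the predictions under a name other than `predict` also avoids any confusion with the `predict()` function on later calls. A lower RMSE on the test set indicates better predictions, and plotting `pred_cnt` against the observed `cnt` is another quick check.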
