I have read a few answers on this here, but I am afraid I have not been able to figure out a solution.
My R code is:
colors <- bmw[bmw$Channel=="Colors" & bmw$Hour==20,]
colors_test <- tail(colors, 89)
colors_train <- head(colors, 810)
colors_train_agg <- aggregate(colors_train$Impressions, list(colors_train$`Position of Ad in Break`), FUN=mean, na.rm=TRUE)
colnames(colors_train_agg) <- c("ad_position", "avg_impressions")
lm_colors <- lm(colors_train_agg$avg_impressions ~ poly(colors_train_agg$ad_position, 12))
summary(lm_colors)
colors_test_agg <- aggregate(colors_test$Impressions, list(colors_test$`Position of Ad in Break`), FUN=mean, na.rm=TRUE)
colnames(colors_test_agg) <- c("ad_position", "avg_impressions")
new.df <- data.frame(colors_test_agg$ad_position)
colnames(new.df) <- c("ad_position")
colors_test_test <- predict(lm_colors, newdata=new.df)
So I have exactly the same column names for both training and test data. I still get the warning:
Warning message:
'newdata' had 15 rows but variables found have 22 rows
Can someone suggest what is wrong? I would also like to know whether I am even doing this the right way.
Finally, some pointers on how to calculate the accuracy of the model would be greatly appreciated. Thanks!
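A note on the likely cause, with a minimal sketch: because the formula refers to the columns as colors_train_agg$avg_impressions and colors_train_agg$ad_position, the model's predictor is literally named "colors_train_agg$ad_position". predict() cannot find a column with that name in newdata, so it falls back to the 22-row training column, which would explain the warning. The usual pattern is to write bare column names and pass data=. The sketch below uses made-up data frames (train_agg and test_agg are hypothetical stand-ins for the aggregated tables) and a degree of 2 instead of 12, which is an assumption on my part since a degree-12 polynomial on about 22 points will probably overfit. RMSE and MAE are shown as two common ways to measure out-of-sample error.

```r
# Hypothetical stand-ins for colors_train_agg / colors_test_agg (synthetic data)
set.seed(1)
train_agg <- data.frame(ad_position = 1:22,
                        avg_impressions = 100 - 2 * (1:22) + rnorm(22, sd = 3))
test_agg  <- data.frame(ad_position = 1:15,
                        avg_impressions = 100 - 2 * (1:15) + rnorm(15, sd = 3))

# Bare column names plus data= let predict() look up ad_position in newdata
# (degree 2 is an illustrative choice, not the original degree 12)
fit <- lm(avg_impressions ~ poly(ad_position, 2), data = train_agg)

# One prediction per row of the test table, with no row-count warning
pred <- predict(fit, newdata = test_agg)

# Out-of-sample error: root mean squared error and mean absolute error
rmse <- sqrt(mean((test_agg$avg_impressions - pred)^2))
mae  <- mean(abs(test_agg$avg_impressions - pred))
print(c(rmse = rmse, mae = mae))
```

With the real data, the same idea would be lm(avg_impressions ~ poly(ad_position, k), data = colors_train_agg) followed by predict(..., newdata = colors_test_agg), where k is a degree you would need to choose yourself.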