I want to build a separate regression model for each level of a categorical variable (Item_ID).
I tried the following two approaches, but I am getting the error shown below:
a <- train[,predict(lm(Number_Of_Sales ~ Year + Month + Day + Weekday + day_of_year + Category_3 + Category_2 + Category_1 + Weeknum), test[.BY]), by = Item_ID]
b <- test[,predict(lm(Number_Of_Sales ~ Year + Month + Day + Weekday + day_of_year + Category_3 + Category_2 + Category_1 + Weeknum, data = train[.BY]), newdata=.SD),by = Item_ID]
Error in `[.data.frame`(test, , predict(lm(Number_Of_Sales ~ Year + Month + : unused argument (by = Item_ID)
Item_ID is present in both the test and train datasets. I also tried using train$Item_ID, but that did not work either. Could you please help with this?
Update: a minimal example that reproduces the error:
train <- data.frame(state = rep(c('MA', 'NY'), c(10, 10)),
                    year = rep(1:10, 2),
                    response = c(rnorm(10), rnorm(10)))
test <- data.frame(state = rep(c('MA', 'NY'), c(5, 5)),
                   year = rep(1:5, 2),
                   response = c(rnorm(5), rnorm(5)))
a <- train[,predict(lm(response ~ Year), test[.BY]), by = state]
Error received:
Error in `[.data.frame`(train, , predict(lm(response ~ Year), test[.BY]), : unused argument (by = state)
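For context, the error is raised by `[.data.frame`, which suggests that train and test are plain data frames; `by =` is an argument of the data.table `[` method, not of the data.frame one. Below is a minimal sketch of one possible direction, assuming the data.table package is installed. The names `train_dt`, `test_dt`, `grp`, `fit`, and `preds` are hypothetical and only for illustration, and the formula uses the lowercase column name `year` from the example data. It fits one model per state on the training rows and predicts on the matching test rows; it is not necessarily the only or best way to do this.

```r
library(data.table)

set.seed(1)  # only so the sketch is reproducible

train <- data.frame(state = rep(c("MA", "NY"), c(10, 10)),
                    year = rep(1:10, 2),
                    response = c(rnorm(10), rnorm(10)))
test <- data.frame(state = rep(c("MA", "NY"), c(5, 5)),
                   year = rep(1:5, 2),
                   response = c(rnorm(5), rnorm(5)))

# Convert to data.table so that `by =` is understood by `[`
train_dt <- as.data.table(train)
test_dt <- as.data.table(test)

# For each state in the test set: fit on that state's training rows,
# then predict for that state's test rows (.SD holds the group's rows)
preds <- test_dt[, {
  grp <- .BY$state
  fit <- lm(response ~ year, data = train_dt[state == grp])
  .(year = year, predicted = predict(fit, newdata = .SD))
}, by = state]

print(preds)
```

The result has one row per test observation, with the state, the year, and the prediction from that state's own model.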