I want to fit a separate regression model for each level of a categorical variable (Item_ID).

I tried the two calls below, but I get the following error:

a <- train[,predict(lm(Number_Of_Sales ~ Year + Month + Day + Weekday + day_of_year  + Category_3  + Category_2  + Category_1 + Weeknum), test[.BY]), by = Item_ID]

b <- test[,predict(lm(Number_Of_Sales ~ Year + Month + Day + Weekday + day_of_year  + Category_3  + Category_2  + Category_1 + Weeknum, data = train[.BY]), newdata=.SD),by = Item_ID]
Error in `[.data.frame`(test, , predict(lm(Number_Of_Sales ~ Year + Month +  : 
  unused argument (by = Item_ID)

Item_ID is present in both the test and train datasets. I also tried using train$Item_ID, but that did not work either. Could you please help with this?

Update: a minimal example that reproduces the error

train <- data.frame(state=rep(c('MA', 'NY'), c(10, 10)),
                year=rep(1:10, 2),
                response=c(rnorm(10), rnorm(10)))


test <- data.frame(state=rep(c('MA', 'NY'), c(5, 5)),
                    year=rep(1:5, 2),
                    response=c(rnorm(5), rnorm(5)))


a <- train[,predict(lm(response ~ Year), test[.BY]), by = state]

Error received:

Error in `[.data.frame`(train, , predict(lm(response ~ Year), test[.BY]),  : 
  unused argument (by = state)
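
A note on the likely cause, as a sketch rather than a confirmed fix: the error is raised by `[.data.frame`, and `by =` inside `[` is data.table syntax, so `train` and `test` as built with `data.frame()` above cannot take it. The version below assumes the data.table package is installed and converts both tables. It also uses the lowercase column name `year`, since the formula above refers to `Year`, which does not exist in the example data. The names `fit`, `new_rows`, and `preds` are illustrative only.

```r
library(data.table)

set.seed(42)  # only so the example is repeatable

# Same shape as the example above, but built as data.tables so `by =` is accepted
train <- data.table(state = rep(c("MA", "NY"), c(10, 10)),
                    year = rep(1:10, 2),
                    response = rnorm(20))
test <- data.table(state = rep(c("MA", "NY"), c(5, 5)),
                   year = rep(1:5, 2))

# For each state: fit lm on that state's training rows (.SD),
# then predict for the rows of test that belong to the same state (.BY)
preds <- train[, {
  fit <- lm(response ~ year, data = .SD)
  new_rows <- test[.BY, on = "state"]
  list(year = new_rows$year,
       predicted = predict(fit, newdata = new_rows))
}, by = state]

print(preds)
```

With existing data frames, `setDT(train)` and `setDT(test)` should convert them in place instead of rebuilding them.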