I am using a for loop because I need to predict multiple columns and store the predictions for each one as I go.
cols is a vector containing all the columns I need to predict, and mat is a data.frame (basically my text features).
df is the main data frame, which holds the text and the columns to predict.
    for (colm in cols){
      label <- as.factor(df[[colm]])
      dfm <- mat
      dfm[[colm]] <- label
      #Boruta(as.factor(colm)~., data=dfm, pValue = 0.01, mcAdj = TRUE, maxRuns = 20,
      #       doTrace = 2, holdHistory = TRUE, getImp = getImpRfZ) -> Bor.rf
      #dfm <- as.data.frame(as.matrix(dfm[,getSelectedAttributes(Bor.rf)]))
      #dfm[[colm]] <- label
      #train the RF model
      modelRF.bor <- train(colm~., data=dfm, method="rf", trControl=control)
      pred.RF.bor = predict(modelRF.bor, newdata = dfm[ ,!(colnames(dfm) == st(colm))])
      print("Predictions for Column")
      print(colm)
      print(pred.RF.bor)
      table(pred.RF.bor,dfm$colm)
      acc.RF.bor = mean(pred.RF.bor==dfm$colm)
      print("Accuracy ")
      print(acc.RF.bor)
      print("Confusion Matrix")
      print(confusionMatrix(table(pred.RF.bor,dfm$colm)))
      output[,i] <- pred.RF.bor
      i = i+1
    }
I am getting the error below. I have checked everything in my code and also looked at similar questions here.
    Error in model.frame.default(form = colm ~ ., data = dfm, na.action = na.fail) :
      variable lengths differ (found for 'excel')
I don't know what to do next; please suggest a fix.
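
One possible cause, sketched here in base R only: inside the loop, colm is a length-1 character string, so the formula colm ~ . refers to that string itself rather than to the column it names, and its length (1) does not match the number of rows in dfm. The same applies to dfm$colm, which looks for a column literally called "colm". The toy data below (the columns excel and word, the response name "label", and the values) is made up for illustration, and model.frame() stands in for the call that caret's train() makes internally; this is an assumption about what is going wrong, not a confirmed diagnosis.

```r
# Hypothetical stand-ins for the objects in the loop (toy data, base R only).
set.seed(1)
dfm  <- data.frame(excel = rnorm(6), word = rnorm(6))
colm <- "label"                          # one element of cols, assumed to be a column name
dfm[[colm]] <- c(1, 0, 1, 1, 0, 1)

# colm ~ . uses the variable colm (a single string), not the column "label",
# so model.frame() compares a length-1 response against 6-row predictors.
err <- tryCatch(model.frame(colm ~ ., data = dfm),
                error = function(e) conditionMessage(e))
print(err)

# Building the formula from the string makes the named column the response.
f  <- as.formula(paste(colm, "~ ."))
mf <- model.frame(f, data = dfm)
print(names(mf))

# dfm$colm looks up a column literally named "colm"; dfm[[colm]] uses the string's value.
print(is.null(dfm$colm))
print(dfm[[colm]])
```

If this is indeed the cause, passing a formula built this way to train(), and using dfm[[colm]] wherever the loop currently has dfm$colm, would be the corresponding changes to try.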