When I try to run my random forest (for classification), I get this warning:
Warning message:
In randomForest.default(m, y, ...) :
The response has five or fewer unique values. Are you sure you want to do regression?
I already cleaned my (huge) dataset with the janitor package and tried to convert the variables to factors. Does anyone understand why I still get this warning?
library(randomForest)
data2 <- experimental_data
x <- janitor::clean_names(data2)
#--------------------------------------
#Partition data
set.seed(93)
ind <- sample(2, nrow(x), replace = TRUE, prob = c(0.7, 0.3))
train <- x[ind == 1, ]
test <- x[ind == 2, ]
str(train)
# Convert every character column to a factor
chr_cols <- sapply(train, is.character)
train[chr_cols] <- lapply(train[chr_cols], as.factor)
str(train)
#Train Random forest on UCI heart dataset
rf <- randomForest(y_full ~ ., data = train, importance = TRUE, predict.all = TRUE, proximity = TRUE)
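
One thing that may be worth checking: the conversion above only touches character columns, so if y_full is stored as a number (for example 0/1 codes), it stays numeric and randomForest falls back to regression. The sketch below reproduces that on a small made-up data frame. Only y_full comes from the code above; the toy object and its other columns and values are hypothetical, and whether the real y_full is numeric is an assumption that class(train$y_full) would confirm.

```r
# Toy stand-in for the real data (hypothetical columns and values, except the name y_full)
toy <- data.frame(
  age = c(63, 37, 41, 56),
  sex = c("m", "f", "f", "m"),
  y_full = c(0L, 1L, 1L, 0L),
  stringsAsFactors = FALSE
)

# Same conversion as in the question: only character columns are affected
is_chr <- sapply(toy, is.character)
toy[is_chr] <- lapply(toy[is_chr], as.factor)

class(toy$sex)     # a factor after the conversion
class(toy$y_full)  # still integer, so it would be treated as a regression target

# Converting the response explicitly turns it into a classification target
toy$y_full <- as.factor(toy$y_full)
class(toy$y_full)
```

If that turns out to be the cause, the same conversion would also be needed on test, or it could be done once on x before the split so both partitions share the same factor levels.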