
Recently I wrote a script that trains a random forest model to classify land use/cover types using the randomForest package in R. I get a different overall accuracy and kappa statistic on each of 10 runs of the script. Now I want to retrain my model using K-fold cross-validation, but I don't know how to do this or how to find an optimal model. And if I retrain my model using K-fold cross-validation, how can I get the average overall accuracy and kappa statistic?

Does anyone have experience with this, or a worked example? That would be much appreciated. Thank you very much.

My code is as follows:

# Packages used below: randomForest (model, margin, importance),
# caret (confusionMatrix, varImp) and e1071 (classAgreement)
library(randomForest)
library(caret)
library(e1071)

cat("Calculating random forest object\n")
randfor <- randomForest(as.factor(response) ~ ., data = trainvals, importance = TRUE, na.action = na.omit, proximity = TRUE)

# Print the model summary (OOB error estimate and confusion matrix)
print(randfor)

# Plot the margin of the training observations; a positive margin
# means correct classification
rf.margin <- margin(randfor)
plot(rf.margin)

# Display the error rates of the random forest as trees are added
plot(randfor)

# Predict the land cover type of the test dataset
pred <- predict(randfor, newdata = trainvalsTest)

# Generate a classification table for the test dataset
rf.table <- table(pred, responseTest)
rf.table

# Plot the variable importance
varImpPlot(randfor)

# Agreement measures (including kappa) from the classification table
classAgreement(rf.table)

# Print the overall accuracy and kappa statistic; confusionMatrix needs
# both arguments as factors with the same levels
confusion <- confusionMatrix(pred, factor(responseTest, levels = levels(pred)))
confusion

# Print the importance of all the input variables
randomForest.importance <- importance(randfor)
randomForest.importance

# Use the caret package to calculate the variable importance
caret.importance <- varImp(randfor, scale = FALSE)

# Print the overall importance values of the input variables
print(caret.importance)

# Display the variable importance plot
plot(caret.importance)
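
For reference, the kind of K-fold setup being asked about could look roughly like the sketch below, which uses caret's `train` with `trainControl(method = "cv")`. It is only a minimal illustration under assumptions, not a tested solution for the land cover data: the built-in `iris` data stands in for `trainvals`, and the names `trainvals_demo`, `ctrl` and `rf_cv`, the seed, the 10 folds and `tuneLength = 3` are all made up for the example.

```r
library(caret)
library(randomForest)

# Fix the seed so the fold assignment and the forests are reproducible
set.seed(42)

# Assumption: iris stands in for the real training table; the class
# column is renamed to "response" to mirror the script above
trainvals_demo <- iris
names(trainvals_demo)[names(trainvals_demo) == "Species"] <- "response"

# 10-fold cross-validation (the number of folds is an arbitrary choice)
ctrl <- trainControl(method = "cv", number = 10)

# Fit a random forest for each candidate mtry and evaluate it on the folds
rf_cv <- train(response ~ ., data = trainvals_demo,
               method = "rf",
               trControl = ctrl,
               tuneLength = 3,
               importance = TRUE)

# Accuracy and Kappa averaged over the folds, one row per candidate mtry
print(rf_cv$results)

# Per-fold Accuracy and Kappa for the selected mtry, and their means
print(rf_cv$resample)
cv_means <- colMeans(rf_cv$resample[, c("Accuracy", "Kappa")])
print(cv_means)

# The selected mtry and the randomForest object refit on all rows with it
print(rf_cv$bestTune)
final_rf <- rf_cv$finalModel
```

In this sketch `cv_means` holds the average overall accuracy and kappa across the folds, and `final_rf` can be passed to `predict`, `importance` or `varImpPlot` in the same way as `randfor` above.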
JimmyGao
