I'm trying to build a glm model with the caret package. I would like to use ROC to choose the best classification model's parameters. I added summaryFunction = twoClassSummary and classProbs = TRUE to the trainControl function, and metric = "ROC" to the train function.

Here is my code:

library('caret')

dat <- read.table(text = " target birds    wolfs     snakes
       0        3        9         7
       1        3        8         4
       1        1        2         8
       0        1        2         3
       0        1        8         3
       1        6        1         2
       0        6        7         1
       1        6        1         5
       0        5        9         7
       1        3        8         7
       1        4        2         7
       0        1        2         3
       0        7        6         3
       1        6        1         1
       0        6        3         9
       1        6        1         1   ", header = TRUE)

The control function:

fitControl <- trainControl(method = "repeatedcv", number = 10, repeats = 10, summaryFunction = twoClassSummary, classProbs = TRUE)

The model:

glm <- train(target ~ ., data = dat, method = "glm", trControl = fitControl, tuneLength = 4, metric = "ROC")

I got this error:

 Error in evalSummaryFunction(y, wts = weights, ctrl = trControl, lev = classLevels,  : 
  train()'s use of ROC codes requires class probabilities. See the classProbs option of trainControl()
In addition: Warning message:
In train.default(x, y, weights = w, ...) :
  cannnot compute class probabilities for regression

What am I doing wrong?

mql4beginner

1 Answer

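caret treats a numeric target as a regression problem, which is why the warning says it cannot compute class probabilities. Try the code after converting the target column to a factor whose labels are valid R names:

dat$target <- factor(dat$target, labels = c("X0", "X1"))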
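Putting the pieces together, a minimal end-to-end sketch might look like the following. It reuses the question's data and uses factor(), since labels is an argument of factor() rather than as.factor(). Two things here are assumptions that are not in the original post: number = 4 (chosen so that each held-out fold of this 16-row data set can contain both classes, which ROC needs) and the object name fit (chosen to avoid masking the base glm function). With so little data, glm may still emit fitting warnings.

```r
library(caret)

# Data from the question (console "+" prompts removed)
dat <- read.table(text = "target birds wolfs snakes
0 3 9 7
1 3 8 4
1 1 2 8
0 1 2 3
0 1 8 3
1 6 1 2
0 6 7 1
1 6 1 5
0 5 9 7
1 3 8 7
1 4 2 7
0 1 2 3
0 7 6 3
1 6 1 1
0 6 3 9
1 6 1 1", header = TRUE)

# A numeric target makes train() fit a regression model, so convert it
# to a factor whose level names are valid R variable names.
dat$target <- factor(dat$target, levels = c(0, 1), labels = c("X0", "X1"))

# Assumption: 4 folds instead of the question's 10, so that every
# held-out fold can contain observations from both classes.
fitControl <- trainControl(method = "repeatedcv", number = 4, repeats = 10,
                           summaryFunction = twoClassSummary,
                           classProbs = TRUE)

set.seed(1)  # make the resampling reproducible
fit <- train(target ~ ., data = dat, method = "glm",
             trControl = fitControl, metric = "ROC")

# Resampled ROC, sensitivity and specificity
print(fit$results)
```

Because glm has no tuning parameters, tuneLength is left out here; metric = "ROC" only controls which resampled statistic is reported as the selection criterion.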

NicE
  • Thanks NicE, I did that, but now I get new warnings: Warning messages: 1: In train.default(x, y, weights = w, ...) : At least one of the class levels are not valid R variables names; This may cause errors if class probabilities are generated because the variables names will be converted to: X0, X1 2: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, : There were missing values in resampled performance measures. – mql4beginner Feb 04 '15 at 13:51
  • Check out this post for the answer to your question: http://stackoverflow.com/questions/18402016/error-when-i-try-to-predict-class-probabilities-in-r-caret, and this one for the second warning: http://stackoverflow.com/questions/26828901/warning-message-missing-values-in-resampled-performance-measures-in-caret-tra (I edited my post to change the labels of the factor, which solves the first issue) – NicE Feb 04 '15 at 13:55
  • I changed the target variable's values to valid text labels, but now I get another error regarding the ROC (even though I added classProbs = TRUE to the trainControl function): Error in evalSummaryFunction(y, wts = weights, ctrl = trControl, lev = classLevels, : train()'s use of ROC codes requires class probabilities. See the classProbs option of trainControl() – mql4beginner Feb 04 '15 at 14:36
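The first warning quoted in the comments says the class levels "0" and "1" would be converted to X0 and X1. That result matches what base R's make.names() returns for those strings, so a quick way to check whether a set of factor levels is already valid is to compare it with its make.names() output. This is a small illustrative sketch; the variable names lev and ok are hypothetical.

```r
# Levels as they appear when a 0/1 target is turned into a factor without labels
lev <- c("0", "1")

# make.names() prefixes names that do not start with a letter (or dot) with "X"
print(make.names(lev))

# Levels are safe to use as column names when make.names() leaves them unchanged
ok <- identical(make.names(c("X0", "X1")), c("X0", "X1"))
print(ok)
```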