I have a problem training svmLinear with caret. The same data works fine with svmRadial, though.
The data is available here (as of 29/05/2016): https://www.dropbox.com/s/ia2vc25uhxdgqn1/projetTest01.txt?dl=0
(8000 rows of 1021 variables, about 10% of rows in the target class)
Here's the code:
library(caret)
# Read the tab-separated file; column 3 is the target, columns 2 and 3 are excluded from the predictors
projetTest01 <- read.table("projetTest01.txt", sep = "\t")
Test01 <- list(data = projetTest01[, -c(2, 3)], label = projetTest01[, 3])
# Recode the 0/1 target as a factor with levels "No"/"Yes"
Test01N <- Test01
Test01N$label <- as.factor(Test01$label)
levels(Test01N$label)[levels(Test01N$label) == "0"] <- "No"
levels(Test01N$label)[levels(Test01N$label) == "1"] <- "Yes"
# Coerce all predictors to numeric ('num' type)
temp <- as.matrix(Test01$data)
storage.mode(temp) <- "numeric"
Test01N$data <- as.data.frame(temp)
# Cost grid and repeated cross-validation (3 repeats) with class probabilities
svmTuneGrid_L <- data.frame(.C = 2^(-2:7))
trControl_SVML <- trainControl(method = "repeatedcv", repeats = 3, classProbs = TRUE)
svmFit_Lin <- train(Test01N$label ~ ., data = Test01N$data, method = "svmLinear", preProc = c("center", "scale"), tuneGrid = svmTuneGrid_L, trControl = trControl_SVML)
And I get these messages:
line search fails [..]
Warning in method$predict(modelFit = modelFit, newdata = newdata, submodels = param) : kernlab class prediction calculations failed; returning NAs
Warning in data.frame(..., check.names = FALSE) : row names were found from a short variable and have been discarded
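Since the warning mentions kernlab, one way to narrow this down might be to fit the linear SVM with kernlab directly, outside caret. The sketch below is only an illustration on toy data (the matrix `x` and factor `y` are hypothetical stand-ins, not my real data), and it assumes that caret's "svmLinear" corresponds to `ksvm()` with the "vanilladot" kernel and `prob.model = TRUE` when `classProbs = TRUE`:

```r
library(kernlab)

# Toy stand-in data (hypothetical): 200 rows, 5 numeric predictors, unbalanced two-class label
set.seed(42)
x <- matrix(rnorm(200 * 5), ncol = 5)
y <- factor(ifelse(x[, 1] + rnorm(200) > 1.2, "Yes", "No"))

# Linear-kernel C-SVM with a probability model, fitted without caret
fit <- ksvm(x, y, type = "C-svc", kernel = "vanilladot", C = 1, prob.model = TRUE)

# Class probabilities; a failure at this step would point at kernlab rather than caret
probs <- predict(fit, x, type = "probabilities")
print(head(probs))
```

If the same messages show up with the real data in place of `x` and `y`, the problem would be in the kernlab fit or its probability model rather than in the caret wrapper.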
I searched this site and the web for answers, but none of the usual causes seem to apply:
- the factor levels aren't numeric (they are "No"/"Yes")
- classProbs is set to TRUE
- the labels can't be predicted perfectly from another variable (I know this from other algorithms)
- there is no empty class
- using preProc (center/scale) or not makes no difference
- the same data works fine with svmRadial
- I use caret 6.0-68
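For reference, the level and class checks in the list above can be sketched roughly as follows. This uses a small toy factor (hypothetical values standing in for `Test01N$label`, not the real data) and base R only:

```r
# Toy stand-in for Test01N$label (hypothetical values, not the real data)
label <- factor(rep(c("No", "Yes"), times = c(18, 2)))

# Levels should be valid R names when classProbs = TRUE ("0"/"1" would fail this check)
levels_ok <- all(levels(label) == make.names(levels(label)))

# Every factor level should actually occur in the data
class_counts <- table(label)
no_empty_class <- all(class_counts > 0)

# Share of each class, to confirm the imbalance
class_share <- prop.table(class_counts)

print(levels_ok)
print(no_empty_class)
print(class_share)
```

On the real data, `label` would simply be replaced by `Test01N$label`.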
I'm really at a loss. Does anyone have an idea?