
Following up from Invalid probability model for large support vector machines using ksvm in R:

I am training an SVM using ksvm from the kernlab package in R. I want to use the probability model, but during the sigmoid fitting I get the following error message:

line search fails -1.833726 0.5772808 5.844462e-05 5.839508e-05 -1.795008e-08 
-1.794263e-08 -2.096847e-12

When this happens, the resulting value of prob.model(m) is a vector of all probabilities, rather than the expected parameters of a sigmoid function fitted over these probabilities. What causes this error and how can I prevent it? Searching for the error message yielded no results.

Reproducible example:

load(url('http://roelandvanbeek.nl/files/df.rdata'))
ksvm(label~value,df[1:1000],C=10,prob.model=TRUE)->m 
prob.model(m) # works as it should, prints a list containing one named list

# the below, non-working problem, unfortunately takes an hour due to the large
# sample size
ksvm(label~value,df,C=10,prob.model=TRUE)->m # line search fails  
prob.model(m) # just a vector of values
roelandvanbeek
  • Did you manage to figure this out? – Vishal Belsare Jun 03 '13 at 17:56
  • No. I did find that it also occurs with smaller data sets, but I have not yet been able to find a consistent explanation. Often, reducing or increasing the number of observations fixes the problem, which makes it seem even more irregular... – roelandvanbeek Jul 03 '13 at 12:12
  • @roelandvanbeek, I see the problem when I try to plot the learning curve for my dataset, but when I run it only for certain splits (70/30, for example), the issue does not appear. Is this what you mean by reducing or increasing observations? – E B Sep 23 '17 at 03:14

3 Answers

1

Looking at the source code, this is the line that throws that error.

It is in the method .probPlatt, which uses Newton's method with a line search to fit the sigmoid of Platt's scaling. If you check line 3007, though, you'll see some parameters pertaining to the method.

One such parameter is minstep, the smallest step size the line search will try before giving up. This is exactly the condition checked at line 3090, where the error is raised: if (stepsize < minstep). So, basically, the line search has shrunk the step all the way down to the minimum without finding an acceptable one, and the optimization stops without converging.

You can try changing minstep to a lower value to circumvent it. Alexandros even left a comment in the source saying these parameters should probably be in the interface.
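The failure condition described above can be illustrated with a toy backtracking line search. This is only a sketch, not kernlab's code: the function `backtrack`, the quadratic objective, and the sufficient-decrease constant 1e-4 are all assumptions made for illustration. It shows how a search that keeps halving the step ends in a "line search fails" state once the step drops below `minstep`.

```r
# Toy backtracking line search (illustrative only, NOT kernlab's implementation).
# f: objective, x: current point, dir: search direction, grad: gradient at x.
backtrack <- function(f, x, dir, grad, minstep = 1e-10) {
  stepsize <- 1
  fx <- f(x)
  gd <- sum(grad * dir)  # directional derivative along dir
  while (stepsize >= minstep) {
    x_new <- x + stepsize * dir
    # Accept the step if it gives a sufficient decrease of the objective
    if (f(x_new) < fx + 1e-4 * stepsize * gd) {
      return(list(x = x_new, stepsize = stepsize, ok = TRUE))
    }
    stepsize <- stepsize / 2  # otherwise halve the step and try again
  }
  # stepsize < minstep: this is the "line search fails" situation
  list(x = x, stepsize = stepsize, ok = FALSE)
}

f <- function(x) sum(x^2)

# Descent direction: the first full step is accepted.
good <- backtrack(f, x = c(1, 1), dir = c(-1, -1), grad = c(2, 2))

# Ascent direction: no step size decreases f, so the search
# shrinks the step below minstep and reports failure.
bad <- backtrack(f, x = c(1, 1), dir = c(1, 1), grad = c(2, 2))

print(good$ok)
print(bad$ok)
```

In this toy case, lowering `minstep` would only delay the failure, because no acceptable step exists along that direction. Whether a lower `minstep` helps for a particular ksvm fit depends on why the line search stalls, which this thread does not establish.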

catastrophic-failure
  • Are you saying that we should change the code and recompile it? – E B Sep 23 '17 at 03:14
  • @EB Yes about changing the code; recompilation isn't strictly required, though. – catastrophic-failure Sep 23 '17 at 15:09
  • @catastrophic-failure I do not understand the behavior of the optimizer. If the maximum number of iterations is reached, there is no problem, but if the step is lower than min_step it calls `.SigmoidPredict`, which does not return A and B. I do not think the solution is to decrease min_step, but rather not to call `.SigmoidPredict`. Thoughts? – Elad663 Apr 05 '19 at 22:26
0

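It seems to me that the problem occurs randomly, which is plausible because the probability model is fitted on the output of a 3-fold cross-validation of the training data. Thus, I circumvented the problem by refitting the ksvm model until it worked, giving up after 10 attempts.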

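PRED <- NULL
attempt <- 1
# Refit until predict() succeeds, giving up after 10 attempts
while (is.null(PRED) && attempt <= 10) {
    attempt <- attempt + 1
    MOD <- ksvm(...)
    # tryCatch returns NULL on error, so PRED stays NULL and the loop retries
    PRED <- tryCatch(predict(...), error = function(e) NULL)
}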
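The retry loop above relies on predict failing. As an alternative, the fit could be checked directly, based on the question's description that a working prob.model(m) is a list containing a named list while a failed one is just a vector of values. The helper below is a hypothetical sketch: the name `is_sigmoid_fit` is made up, and it assumes the fitted parameters are stored under the names A and B, as the other answer suggests.

```r
# Hypothetical helper: TRUE when `pm` looks like fitted sigmoid parameters,
# i.e. a list whose elements are lists that each contain A and B.
# The element names "A" and "B" are an assumption based on this thread.
is_sigmoid_fit <- function(pm) {
  is.list(pm) && length(pm) >= 1 &&
    all(vapply(
      pm,
      function(p) is.list(p) && all(c("A", "B") %in% names(p)),
      logical(1)
    ))
}

# Mock objects standing in for the two outcomes described in the question
ok_model  <- list(list(A = -1.2, B = 0.1))  # fitted parameters
bad_model <- list(c(0.10, 0.90, 0.50))      # plain vector of values

print(is_sigmoid_fit(ok_model))
print(is_sigmoid_fit(bad_model))
```

With a real model, the loop condition could then test `is_sigmoid_fit(prob.model(MOD))` instead of waiting for predict to raise an error.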
0

I do not understand the behavior of the optimizer. If the maximum number of iterations is reached, there is no problem, but if the step is lower than min_step it calls .SigmoidPredict, which does not return A and B. I do not think the solution is to decrease min_step, but rather not to call .SigmoidPredict, so I commented it out. By the way, I do not understand why they do not use glm to estimate A and B.

Here's a repository based on the latest source from CRAN with the call to .SigmoidPredict commented out.

devtools::install_github('elad663/kernlab')
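The glm idea mentioned above can be sketched as follows. This is a minimal illustration on simulated data, not what kernlab does: the decision values `f` and labels `y` are made up, and plain logistic regression uses 0/1 targets, whereas Platt's original method uses slightly smoothed targets, so the estimates need not match kernlab's exactly. With the sigmoid written as P(y = 1 | f) = 1 / (1 + exp(A f + B)), the parameters are the negated glm coefficients.

```r
set.seed(1)

# Simulated stand-ins for SVM decision values and binary labels (assumption)
n <- 500
f <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(2 * f - 0.5))

# Logistic regression of the label on the decision value
fit <- glm(y ~ f, family = binomial())

# glm models P = 1 / (1 + exp(-(b0 + b1 * f))), so in Platt's
# parameterisation A = -b1 and B = -b0
A <- -coef(fit)[["f"]]
B <- -coef(fit)[["(Intercept)"]]

# Probabilities from the sigmoid in Platt's form
p_platt <- 1 / (1 + exp(A * f + B))

print(c(A = A, B = B))
```

The probabilities in `p_platt` coincide with `fitted(fit)`, which confirms that only the sign convention differs between the two parameterisations.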

Elad663