
I have some code that tries to use a non-linear SVM (RBF kernel):

import numpy as np
from sklearn import svm

# Load the feature matrix and the corresponding labels from CSV
raw_data1 = open("/Users/prateek/Desktop/Programs/ML/Dataset.csv")
raw_data2 = open("/Users/prateek/Desktop/Programs/ML/Result.csv")

dataset1 = np.loadtxt(raw_data1, delimiter=",")
result1 = np.loadtxt(raw_data2, delimiter=",")

# Fit a nu-SVM classifier with an RBF kernel (nu defaults to 0.5)
clf = svm.NuSVC(kernel='rbf')
clf.fit(dataset1, result1)

However, when I try to fit, I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/prateek/Desktop/Programs/ML/lib/python2.7/site-packages/sklearn/svm/base.py", line 193, in fit
    fit(X, y, sample_weight, solver_type, kernel, random_seed=seed)
  File "/Users/prateek/Desktop/Programs/ML/lib/python2.7/site-packages/sklearn/svm/base.py", line 251, in _dense_fit
    max_iter=self.max_iter, random_seed=random_seed)
  File "sklearn/svm/libsvm.pyx", line 187, in sklearn.svm.libsvm.fit (sklearn/svm/libsvm.c:2098)
ValueError: specified nu is infeasible

Link for Results.csv

Link for dataset

What is the reason for such an error?

Prateek Narendra
  • http://stackoverflow.com/questions/26987248/nu-is-infeasible –  Feb 05 '16 at 10:15
  • In the code you are using a `sigmoid` kernel, but you say that you are using an `RBF` kernel. Which one do you actually want to use? –  Feb 05 '16 at 10:17
  • @BlackAdder I'm sorry, I was also trying the sigmoid kernel. I've corrected it. – Prateek Narendra Feb 05 '16 at 10:31
  • @Evert So I must use values in the open interval (0,1) and not the closed interval [0,1]? – Prateek Narendra Feb 05 '16 at 10:33
  • But in your code you are not specifying a `nu` value, so the system is taking the default one: `0.5`. Could you try different `nu` values in the range (0, 1)? –  Feb 05 '16 at 10:41

1 Answer


The nu parameter is, as pointed out in the documentation, "An upper bound on the fraction of training errors and a lower bound of the fraction of support vectors".

So, whenever you try to fit your data and this bound cannot be satisfied, the optimization problem becomes infeasible; hence your error. Roughly speaking, for each pair of classes nu cannot exceed twice the fraction of the smaller class within that pair, so imbalanced labels rule out large nu values.

As a matter of fact, I looped nu from 1.0 down to 0.1 (in steps of 0.1) and still got the error; then I just tried 0.01 and no complaints arose. But of course, you should check the results of fitting your model with that value, and verify that the prediction accuracy is acceptable.
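
For reference, here is a minimal sketch of that search, assuming `dataset1` and `result1` are loaded as in the question (the candidate values are just illustrative): it retries the fit with smaller nu values until the infeasibility error goes away.

from sklearn import svm

# Try progressively smaller nu values until the fit stops raising
# "ValueError: specified nu is infeasible" (candidate values are illustrative)
for nu in (0.5, 0.1, 0.05, 0.01):
    try:
        svm.NuSVC(kernel='rbf', nu=nu).fit(dataset1, result1)
        print("nu=%s is feasible" % nu)
        break
    except ValueError as e:
        print("nu=%s failed: %s" % (nu, e))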

Update: I was actually curious, so I split your dataset to validate; the output was 69% accuracy (I also think your training set might be rather small).

Just for reproducibility purposes, here is the quick test I performed:

from sklearn import svm
import numpy as np
# Note: in scikit-learn >= 0.18 train_test_split lives in sklearn.model_selection
from sklearn.cross_validation import train_test_split
from sklearn.metrics import accuracy_score

raw_data1 = open("Dataset.csv")
raw_data2 = open("Result.csv")
dataset1 = np.loadtxt(raw_data1, delimiter=",")
result1 = np.loadtxt(raw_data2, delimiter=",")

# Hold out 25% of the data to measure accuracy on unseen samples
X_train, X_test, y_train, y_test = train_test_split(
    dataset1, result1, test_size=0.25, random_state=42)

# nu=0.01 fit without the infeasibility error on this dataset
clf = svm.NuSVC(kernel='rbf', nu=0.01)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(accuracy_score(y_test, y_pred))
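
If the accuracy looks low, it is also worth checking how the labels are distributed; a quick check along these lines (assuming `result1` is loaded as above) shows whether one class dominates, which both shrinks the feasible range of nu and makes plain accuracy misleading.

import numpy as np

# Count how many samples fall into each class of result1
labels, counts = np.unique(result1, return_counts=True)
for label, count in zip(labels, counts):
    print("class %s: %d samples (%.1f%%)" % (label, count, 100.0 * count / len(result1)))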
Guiem Bosch
  • The reason for such low accuracy is a combination of factors. If you look at results.csv, most of the samples belong to the group labeled 1, and nothing belongs to group 3. Also, yes, the training dataset is only 300+ points. – Prateek Narendra Feb 06 '16 at 14:41