
First I create some toy data:

import numpy as np
import matplotlib.pyplot as plt

n_samples = 20
X = np.concatenate((np.random.normal(loc=2.0, scale=1.0, size=n_samples),
                    np.random.normal(loc=20.0, scale=1.0, size=n_samples),
                    [10])).reshape(-1, 1)
y = np.concatenate((np.repeat(0, n_samples), np.repeat(1, n_samples + 1)))
plt.scatter(X, y)

Below is the graph visualizing the data:

[scatter plot of X vs y: one cluster of class 0 around x=2, one cluster of class 1 around x=20, and a single class-1 point at x=10]

Then I train a model with LinearSVC:

from sklearn.svm import LinearSVC
svm_lin = LinearSVC(C=1)
svm_lin.fit(X, y)

My understanding of C is that:

  • If C is very big, then misclassifications will not be tolerated, because the penalty will be big.
  • If C is small, misclassifications will be tolerated to make the margin (soft margin) larger.
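One way to see the soft-margin effect directly is to count support vectors: a smaller C widens the margin, so more points fall inside it and become support vectors. A sketch on toy data like the above (the exact counts depend on the random draw, so this is illustrative, not guaranteed):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
n_samples = 20
X = np.concatenate((rng.normal(2.0, 1.0, n_samples),
                    rng.normal(20.0, 1.0, n_samples),
                    [10])).reshape(-1, 1)
y = np.concatenate((np.repeat(0, n_samples), np.repeat(1, n_samples + 1)))

# Large C: narrow margin, few points inside it, few support vectors.
n_sv_big = len(SVC(kernel='linear', C=100).fit(X, y).support_)
# Small C: wide margin, many points inside it, many support vectors.
n_sv_small = len(SVC(kernel='linear', C=0.001).fit(X, y).support_)
print(n_sv_big, n_sv_small)
```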

With C=1, I get the following graph (the orange line represents the predictions for the given x values), and we can see the decision boundary is around 7, so C=1 is large enough not to allow any misclassifications.

X_test_svml = np.linspace(-1, 30, 300).reshape(-1, 1)
plt.scatter(X, y)
plt.scatter(X_test_svml, svm_lin.predict(X_test_svml), marker="_")
plt.axhline(.5, color='.5')
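For a 1-D linear model the decision boundary is the x where coef * x + intercept = 0, so it can also be read off the fitted attributes rather than eyeballed from the plot. A self-contained sketch (the exact value of around 7 depends on the random draw):

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)
n_samples = 20
X = np.concatenate((rng.normal(2.0, 1.0, n_samples),
                    rng.normal(20.0, 1.0, n_samples),
                    [10])).reshape(-1, 1)
y = np.concatenate((np.repeat(0, n_samples), np.repeat(1, n_samples + 1)))

svm_lin = LinearSVC(C=1, max_iter=100000).fit(X, y)
# Decision boundary: solve coef * x + intercept = 0 for x.
boundary = -svm_lin.intercept_[0] / svm_lin.coef_[0][0]
print(boundary)
```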

[plot of LinearSVC predictions with C=1: decision boundary around 7]

With C=0.001, for example, I expected the decision boundary to move to the right-hand side, to around 11, but instead I got this:

[plot of LinearSVC predictions with C=0.001: the boundary did not move right as expected]

I then tried the SVC class from the same module:

from sklearn.svm import SVC
svc_lin = SVC(kernel='linear', random_state=0, C=0.01)
svc_lin.fit(X, y)

I successfully got the desired output:

[plot of SVC predictions with C=0.01: the boundary shifted to the right as expected]

And with my R code (using the svm function from the e1071 package), I got something more understandable:

[plot of the e1071 svm fit in R, consistent with the SVC result]


1 Answer


LinearSVC and SVC(kernel='linear') are not the same thing.

The differences are:

  • SVC and LinearSVC are supposed to optimize the same problem, but in fact all liblinear estimators penalize the intercept, whereas libsvm ones don't (IIRC).
  • This leads to a different mathematical optimization problem and thus different results.
  • There may also be other subtle differences such as scaling and default loss function (edit: make sure you set loss='hinge' in LinearSVC).
  • Next, in multiclass classification, liblinear does one-vs-rest by default whereas libsvm does one-vs-one.

See also: https://stackoverflow.com/a/33844092/5025009
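The difference can be made visible directly by fitting both estimators on data like the question's and comparing the boundaries -intercept/coef. A sketch (loss='hinge' is set explicitly so the two use the same loss, isolating the intercept-penalty difference; exact values depend on the random draw):

```python
import numpy as np
from sklearn.svm import LinearSVC, SVC

rng = np.random.RandomState(0)
n_samples = 20
X = np.concatenate((rng.normal(2.0, 1.0, n_samples),
                    rng.normal(20.0, 1.0, n_samples),
                    [10])).reshape(-1, 1)
y = np.concatenate((np.repeat(0, n_samples), np.repeat(1, n_samples + 1)))

boundaries = {}
for C in (1.0, 0.001):
    # liblinear: the intercept is part of the penalized weight vector.
    lin = LinearSVC(C=C, loss='hinge', max_iter=1000000).fit(X, y)
    # libsvm: the intercept is not penalized.
    svc = SVC(kernel='linear', C=C).fit(X, y)
    boundaries[C] = (-lin.intercept_[0] / lin.coef_[0][0],
                     -svc.intercept_[0] / svc.coef_[0][0])
    print(C, boundaries[C])
```

LinearSVC also exposes an intercept_scaling parameter, which can be increased to lessen the effect of the intercept being regularized.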

  • In the [official documentation of scikit-learn](https://scikit-learn.org/stable/modules/svm.html#mathematical-formulation), it seems that the math formula doesn't indicate the intercept is penalized. Or do I misunderstand? – John Smith Oct 08 '20 at 07:44