
I am implementing an SVM using the scikit-learn package in Python, and I am having difficulty interpreting the "alpha_i" values in plot_separating_hyperplane.py.

import numpy as np
import pylab as pl  # used for plotting in the full example script
from sklearn import svm

# we create 40 separable points
np.random.seed(0)
X = np.r_[np.random.randn(20, 2) - [2, 2], np.random.randn(20, 2) + [2, 2]]
Y = [0] * 20 + [1] * 20

# fit the model
clf = svm.SVC(kernel='linear')
clf.fit(X, Y)

# support_vectors_ holds the support vectors themselves
print(clf.support_vectors_)
# dual_coef_ holds the "alpha_i * y_i" value for each support vector
print(clf.dual_coef_)

Sample output:

dual_coef_ = [[ 0.04825885  0.56891844 -0.61717729]]
support_vectors_ =
[[-1.02126202  0.2408932 ]
 [-0.46722079 -0.53064123]
 [ 0.95144703  0.57998206]]

dual_coef_ gives us the "alpha_i * y_i" values. We can confirm that the sum of the "alpha_i * y_i" values is 0 (0.04825885 + 0.56891844 - 0.61717729 = 0).
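As a quick numerical check (a sketch that reuses clf from the snippet above), the dual's equality constraint shows up directly in dual_coef_:

# Reuses `clf` from the snippet above (an assumption of this sketch).
coef = clf.dual_coef_[0]  # the alpha_i * y_i values, one per support vector
print(coef.sum())         # ~0 up to floating point error (y^T alpha = 0)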

I wanted to find the "alpha_i" values. It should be easy, since we have the "alpha_i * y_i" values, but I'm getting all of the "alpha_i" values to be negative. For example, the point (0.95144703, 0.57998206) lies above the line (see link), so y = +1. If y = +1, alpha will be -0.61717729. Similarly, the point (-1.02126202, 0.2408932) lies below the line, so y = -1 and hence alpha = -0.04825885.
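Note that the dual constrains 0 <= alpha_i <= C, so every alpha_i is non-negative. One way to recover the alphas (a sketch reusing clf and Y from the snippet above, not a definitive recipe) is to take the magnitudes of dual_coef_ for the alphas and the signs for the internal labels:

import numpy as np

# Reuses `clf` and `Y` from the snippet above (an assumption of this sketch).
coef = clf.dual_coef_[0]
alphas = np.abs(coef)        # alpha_i = |alpha_i * y_i|, since alpha_i >= 0
internal_y = np.sign(coef)   # the y_i in {-1, +1} the solver actually used
labels = np.asarray(Y)[clf.support_]  # original 0/1 labels of the support vectors
print(alphas)
print(internal_y)
print(labels)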

Why am I getting negative alpha values? Is my interpretation wrong? Any help will be appreciated.


For your reference,

For the Support Vector Classifier (SVC),

Given training vectors $x_i \in \mathbb{R}^p$, $i = 1, \ldots, n$, in two classes, and a vector $y \in \{1, -1\}^n$, SVC solves the following primal problem:

$$\min_{w, b, \zeta} \; \frac{1}{2} w^T w + C \sum_{i=1}^{n} \zeta_i$$

$$\text{subject to } \; y_i (w^T \phi(x_i) + b) \ge 1 - \zeta_i, \quad \zeta_i \ge 0, \; i = 1, \ldots, n$$

Its dual is

$$\min_{\alpha} \; \frac{1}{2} \alpha^T Q \alpha - e^T \alpha$$

$$\text{subject to } \; y^T \alpha = 0, \quad 0 \le \alpha_i \le C, \; i = 1, \ldots, n$$

where $e$ is the vector of all ones, $C > 0$ is the upper bound, and $Q$ is an $n \times n$ positive semidefinite matrix with $Q_{ij} = y_i y_j K(x_i, x_j)$, where $K(x_i, x_j) = \phi(x_i)^T \phi(x_j)$ is the kernel. Here training vectors are implicitly mapped into a higher (maybe infinite) dimensional space by the function $\phi$.
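To make Q concrete, here is a minimal sketch for the linear kernel on hypothetical toy data (the arrays below are illustrative, not the question's X and Y):

import numpy as np

# Hypothetical toy data, with labels already in {-1, +1}.
rng = np.random.RandomState(0)
X_toy = rng.randn(5, 2)
y_toy = np.array([1.0, 1.0, -1.0, -1.0, 1.0])

K = X_toy.dot(X_toy.T)          # linear kernel: K(x_i, x_j) = <x_i, x_j>
Q = np.outer(y_toy, y_toy) * K  # Q_ij = y_i * y_j * K(x_i, x_j)

# Q is positive semidefinite: all eigenvalues >= 0 up to floating point error.
print(np.linalg.eigvalsh(Q))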

– lostboy_19

1 Answer


I think you are just interpreting y the wrong way. I would guess that above the line is y = -1 and below it is y = +1.

Why did you think it was the other way around?

For a two-class problem, I think the "first" class, i.e. the one with the smaller label, is +1 and the other one is -1. This is a LibSVM convention.
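One way to check the convention empirically (a sketch rerunning the question's own setup) is to compare each support vector's original label with the sign of its dual coefficient:

import numpy as np
from sklearn import svm

# Same data as in the question.
np.random.seed(0)
X = np.r_[np.random.randn(20, 2) - [2, 2], np.random.randn(20, 2) + [2, 2]]
Y = np.array([0] * 20 + [1] * 20)

clf = svm.SVC(kernel='linear')
clf.fit(X, Y)

# Since alpha_i >= 0, the sign of each dual coefficient is the internal y_i,
# so this shows which original class the solver mapped to +1.
for label, sign in zip(Y[clf.support_], np.sign(clf.dual_coef_[0])):
    print(label, sign)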

– Andreas Mueller
  • That might be a LibSVM convention, but it doesn't fit the scikit-learn convention. We always order classes by the Python/NumPy ordering of their labels. – Fred Foo Oct 06 '12 at 10:58
  • I am not entirely sure either ;) I did some sign swapping there at some point... lostboy_19: which version of sklearn are you using? – Andreas Mueller Oct 08 '12 at 14:07
  • I could reproduce this on recent `master`. – Fred Foo Oct 08 '12 at 14:11
  • @AndreasMueller Thanks for your response. I am using version 0.11; I downloaded it a few months ago. And I really doubt that y = -1 lies above the line. I think according to the mathematical derivation, w.x+b = +1 lies above the line and w.x+b = -1 lies below it. – lostboy_19 Oct 09 '12 at 18:02
  • I was using "above" in the sense of the image plane. The image doesn't say anything about which direction w is pointing. I was suggesting it points down. – Andreas Mueller Oct 10 '12 at 08:13