In SVC(), for multiclass classification, one-vs-one classifiers are trained, so there should be n_class * (n_class - 1) / 2 classifiers in total. But why does clf.dual_coef_ return only an array of shape (n_class - 1, n_SV)? What does each row represent?

1 Answer
The dual coefficients of a sklearn.svm.SVC in the multiclass setting are tricky to interpret. There is an explanation in the scikit-learn documentation. sklearn.svm.SVC uses libsvm for the calculations and adopts the same data structure for the dual coefficients; another explanation of how these coefficients are organized can be found in the libsvm FAQ. For the coefficients found in a fitted SVC classifier, the interpretation goes as follows:
Each support vector identified by the SVC belongs to a certain class, and in the dual coefficients the support vectors are ordered according to the class they belong to. Given a fitted SVC estimator, e.g.
from sklearn.svm import SVC
svc = SVC()
svc.fit(X, y)  # X is your data, y your labels
you will find
svc.classes_ # represents the unique classes
svc.n_support_ # represents the number of support vectors per class
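To make this concrete, here is a minimal runnable sketch; fitting on the iris dataset is my own illustration, not part of the original answer:
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # iris: 3 classes, for illustration only
svc = SVC().fit(X, y)

print(svc.classes_)    # [0 1 2] -- the unique class labels, in order
print(svc.n_support_)  # number of support vectors per class, same order
print(svc.dual_coef_.shape)  # (n_classes - 1, total number of support vectors)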
The support vectors are organized according to these two variables. Since each support vector is clearly identified with one class, it can be involved in at most n_classes - 1 one-vs-one problems, namely the comparisons with each of the other classes. But it is entirely possible that a given support vector will not be involved in all of its one-vs-one problems.
Taking a look at
import numpy as np
n_classes = len(svc.classes_)
support_indices = np.cumsum(svc.n_support_)
svc.dual_coef_[:, 0:support_indices[0]]
# ^^^
# weights on support vectors of class 0
# for problems 0v1, 0v2, ..., 0v(n-1):
# n-1 rows, one per problem, and one column
# for each of the svc.n_support_[0] support vectors
svc.dual_coef_[:, support_indices[0]:support_indices[1]]
# ^^^
# weights on support vectors of class 1
# for problems 0v1, 1v2, ..., 1v(n-1):
# n-1 rows for each of the
# svc.n_support_[1] support vectors
...
svc.dual_coef_[:, support_indices[n_classes - 2]:support_indices[n_classes - 1]]
# ^^^
# weights on support vectors of class n-1
# for problems 0v(n-1), 1v(n-1), ..., (n-2)v(n-1):
# n-1 rows for each of the
# svc.n_support_[-1] support vectors
gives you the weights of the support vectors for the classes 0, 1, ..., n-1 in their respective one-vs-one problems. Each support vector is compared to all classes except its own, resulting in n_classes - 1 rows, one per one-vs-one problem. The order in which this happens follows the order of the unique classes exposed above. Within each group there are as many columns as there are support vectors of that class.
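As a concrete check of this layout, here is a minimal sketch; the use of the iris dataset (3 classes) is purely illustrative and not part of the original answer:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
svc = SVC().fit(X, y)

support_indices = np.cumsum(svc.n_support_)
start = 0
for klass, stop in zip(svc.classes_, support_indices):
    block = svc.dual_coef_[:, start:stop]
    # one row per one-vs-one problem involving this class,
    # one column per support vector of this class
    print(klass, block.shape)  # (n_classes - 1, n_support_ of this class)
    start = stop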
Possibly what you are looking for are the primal weights, which live in feature space, so that you can inspect their "importance" for classification. This is only possible with a linear kernel. Try this:
from sklearn.svm import SVC
svc = SVC(kernel="linear")
svc.fit(X, y) # X is your data, y your labels
Then take a look at
svc.coef_
This is an array of shape (n_class * (n_class - 1) / 2, n_features) and represents the aforementioned weights.
According to the docs, the weights are ordered as:
class 0 vs class 1
class 0 vs class 2
...
class 0 vs class n-1
class 1 vs class 2
class 1 vs class 3
...
...
class n-2 vs class n-1
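A short sketch of that ordering; pairing the rows of coef_ with itertools.combinations is my own illustration (again on iris, for concreteness), not something the scikit-learn API exposes directly:
from itertools import combinations
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
svc = SVC(kernel="linear").fit(X, y)

# svc.coef_ has n_class * (n_class - 1) / 2 rows, one per class pair,
# in the order class 0 vs 1, 0 vs 2, ..., 1 vs 2, 1 vs 3, ...
for (a, b), w in zip(combinations(svc.classes_, 2), svc.coef_):
    print(f"class {a} vs class {b}: weight vector of shape {w.shape}")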

- If this is what you are looking for and you have further questions, please don't hesitate to comment – eickenberg Apr 03 '14 at 08:09
- eicken, thank you very much for your input (+1). The link is helpful. If my understanding is correct, the only distinct coefficients in the table shown on that page are a_0(0,1), a_1(0,1), and a_2(0,1)? I'm still confused how the support vectors can be the same between (class 0 vs class 1) and (class 0 vs class 2). – ChuNan Apr 03 '14 at 19:42
- Thanks for the feedback! I added some detail to the description of the dual coefficient organization in the specific case of the sklearn.svm.SVC classifier. HTH – eickenberg Apr 04 '14 at 09:51
- Thank you, eicken. I'm sorry, but I'm still confused whether svc.dual_coef_ contains the support vectors themselves or the weights on those support vectors. And is it correct that the support vectors (or their weights) are exactly the same (svc.dual_coef_[:, 0:support_indices[0]]) for label 0 vs label 1, 0 vs 2, 0 vs 3, ..., 0 vs n-1? – ChuNan Apr 07 '14 at 22:37
- Thanks again for the input! I added some comments in the code. To address your two questions directly: svc.dual_coef_ contains the weights on the support vectors, organized in the way described above. The support vectors themselves are indexed in svc.support_ and the full vectors are contained in svc.support_vectors_ (but I am not sure if the latter is always present). I am not sure if I understand the second question. Take label 0. You will find in svc.dual_coef_[:, 0:support_indices[0]] all the weights for all the problems that you indicate (0v1, 0v2, ...). Some of these weights may be exactly 0. – eickenberg Apr 10 '14 at 16:55
- In this case the support vector (given by the column) is not used in the problem (indicated by the row). Not all support vectors are used in all problems, but all support vectors that are ever used are organized according to their labels. – eickenberg Apr 10 '14 at 16:56
- How can we get the coefficients for an rbf kernel? And for a custom kernel? – William Scott Oct 16 '18 at 22:57
- That is not possible in general. The primal vectors can be infinite-dimensional depending on the kernel. The primal coefficients are only easy to extract from the dual coefficients in the linear case. – eickenberg Oct 17 '18 at 18:07
- Thanks a lot for responding so soon; I didn't realise that you had responded in a comment. I have an assignment where I should predict the labels using sklearn svm, and I can use only fit and nothing else, with linear and poly kernels. Any idea how I can do this? I could do it with linear by getting the coef_, but I have made no progress with the poly kernel. Thanks a lot in advance. – William Scott Oct 19 '18 at 19:14
- This is quite off-topic, but every sklearn estimator has a `predict` method with which you can compute predictions given an input. – eickenberg Oct 19 '18 at 22:54
- @eickenberg Hi, could you briefly explain why the dual_coef_ output array might sometimes contain zero entries? As I understand it, the support vectors should all have strictly positive alpha_i, and the number of columns of dual_coef_ reflects the number of support vectors. Just to fill in the context: I have fitted an rbf kernel with ovo decision_function_shape. – siegfried May 11 '21 at 12:09