
I trained a model with RBF-kernel support vector machine regression. I want to know which features are the major contributors for the RBF-kernel SVM. I know there is a method to find the most contributing features for linear support vector regression, based on the weight vector (the magnitude of its coefficients). However, for the RBF-kernel SVM, since the features are transformed into a new space, I have no clue how to extract the most contributing features. I am using scikit-learn in Python. Is there a way to extract the most contributing features in RBF-kernel support vector regression, or non-linear support vector regression in general?

from sklearn import svm
clf = svm.SVC(gamma=0.001, C=100., kernel='linear')
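Continuing that snippet, after fitting, the linear weights can be read via `coef_` (the toy data here is purely illustrative):

clf.fit([[0, 0], [1, 1]], [0, 1])
print(clf.coef_)  # one weight per input feature; magnitudes rank the features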

In this case, the approach from [Determining the most contributing features for SVM classifier in sklearn](https://stackoverflow.com/questions/41592661/determining-the-most-contributing-features-for-svm-classifier-in-sklearn) does work very well. However, if the kernel is changed to

from sklearn import svm
clf = svm.SVC(gamma=0.001, C=100., kernel='rbf')

then the approach from that answer no longer works.

  • Does this answer your question? [Determining the most contributing features for SVM classifier in sklearn](https://stackoverflow.com/questions/41592661/determining-the-most-contributing-features-for-svm-classifier-in-sklearn) – drp Nov 19 '19 at 07:25
  • Thanks for your suggestion. This doesn't answer my question. With `from sklearn import svm` and `svm.SVC(gamma=0.001, C=100., kernel='rbf')`, the feature importance doesn't work. – Bitewulign Nov 19 '19 at 09:08
  • Possible duplicate of [How to obtain features' weights](https://stackoverflow.com/questions/21260691/how-to-obtain-features-weights) – PV8 Nov 19 '19 at 09:20
  • This gives you a reason why it will not work: https://stackoverflow.com/questions/21260691/how-to-obtain-features-weights – PV8 Nov 19 '19 at 09:20
  • You can remove some of your features and measure the impact on your accuracy; this might give you a hint at your feature importance. – Benjamin Breton Nov 19 '19 at 10:17
  • To find the most useful features for your SVM you do not necessarily need to use an SVM. If you calculate feature importance with a random forest, the results will transfer reasonably well to your SVM. – Anton Nov 19 '19 at 16:24

2 Answers

Let me summarize the comments as an answer:

As you can read here (quoting the scikit-learn documentation for the `coef_` attribute):

Weights assigned to the features (coefficients in the primal problem). This is only available in the case of a linear kernel.

But it also wouldn't make sense otherwise. In a linear SVM the resulting separating hyperplane is in the same space as your input features, so its coefficients can be viewed as weights on the input dimensions.

With other kernels, the separating hyperplane exists in a different space, the result of the kernel transformation of the original space, and its coefficients are not directly related to the input space. In fact, for the RBF kernel the transformed space is infinite-dimensional.
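For illustration, attempting to read `coef_` with the RBF kernel raises an error (a minimal sketch with toy data; the exact message may vary by scikit-learn version):

from sklearn import svm

clf = svm.SVC(gamma=0.001, C=100., kernel='rbf')
clf.fit([[0, 0], [1, 1]], [0, 1])
try:
    print(clf.coef_)
except AttributeError as err:
    print(err)  # explains that coef_ is only available with a linear kernel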

As mentioned in the comments, here are things you can do:

Play with the features (leave some out) and see how the accuracy changes; this will give you an idea of which features are important (see the first sketch below this list).

If you use another classifier, such as a random forest, you will get feature importances for that other algorithm, not for your SVM, so this does not necessarily answer your question (see the second sketch below).
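A minimal sketch of the first idea, dropping one feature at a time, refitting, and measuring the change in score (the dataset and hyperparameters here are illustrative, not from the question):

import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

X, y = make_regression(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline R^2 with all features included
baseline = SVR(kernel='rbf', C=100., gamma=0.001).fit(X_train, y_train).score(X_test, y_test)

for i in range(X.shape[1]):
    # Refit without feature i and record how much the score drops
    Xtr = np.delete(X_train, i, axis=1)
    Xte = np.delete(X_test, i, axis=1)
    score = SVR(kernel='rbf', C=100., gamma=0.001).fit(Xtr, y_train).score(Xte, y_test)
    print(f"feature {i}: drop in R^2 = {baseline - score:.4f}")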
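And a sketch of the second idea, using a random forest's impurity-based importances as a rough proxy (again with illustrative data):

from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=200, n_features=5, random_state=0)
rf = RandomForestRegressor(random_state=0).fit(X, y)

# These importances come from the forest, not the SVM, so treat them as a hint only
for i, imp in enumerate(rf.feature_importances_):
    print(f"feature {i}: importance = {imp:.4f}")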

– PV8

Regarding the inspection of non-linear SVM models (e.g. using an RBF kernel), I share here an answer posted in another thread which might be useful for this purpose.

The method is based on `sklearn.inspection.permutation_importance`.
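A minimal sketch of how this could look for an RBF-kernel SVR (the data and hyperparameters are illustrative):

from sklearn.datasets import make_regression
from sklearn.inspection import permutation_importance
from sklearn.svm import SVR

X, y = make_regression(n_samples=200, n_features=5, random_state=0)
model = SVR(kernel='rbf', C=100., gamma=0.001).fit(X, y)

# Shuffle each feature column in turn and measure the resulting drop in score
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i in range(X.shape[1]):
    print(f"feature {i}: {result.importances_mean[i]:.4f} +/- {result.importances_std[i]:.4f}")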

And here is a comprehensive discussion of the significance of `permutation_importance` applied to SVM models.

– George