
I am currently preparing some papers regarding support vector machines. As is well known, the kernel trick enables us to transform data implicitly from the input space to some (potentially infinite-dimensional) feature space.
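To make the "implicit" part concrete, here is a small sketch (my own illustration, not from the referenced book) for the homogeneous degree-2 polynomial kernel on R^2, where the feature map can still be written down explicitly and compared against the kernel evaluation:

```python
import numpy as np

def phi(x):
    # Explicit feature map for the homogeneous degree-2 polynomial
    # kernel on R^2: phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2)
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

def k(x, z):
    # The same inner product computed implicitly: K(x, z) = (x . z)^2
    return (x @ z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# <phi(x), phi(z)> in feature space equals K(x, z) in input space
print(np.isclose(phi(x) @ phi(z), k(x, z)))  # True
```

For the Gaussian kernel no such finite-dimensional phi exists, which is exactly why the question about the feature-space dimensionality arises.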

As a short reference you can use: Cristianini, Nello; Shawe-Taylor, John: An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge: Cambridge University Press, 2000.

Since we then do not know the corresponding feature map, I wonder whether there are any estimates of the dimensionality of the feature space when we use kernels. In particular, I would be interested in any results stating when the data is linearly separable in the resulting feature space. Maybe somebody knows some (recent) papers on this topic. I would be really interested!

YXD

2 Answers


There is a paper that you might be interested in: Chen et al., On linear separability of data sets in feature space.

The authors derived formulas to judge the linear separability of two data sets in feature space using only information from the original input space. They concluded that any two finite sets of data with empty overlap in the original input space become linearly separable in an infinite-dimensional feature space. For two infinite data sets, several necessary and sufficient conditions for their linear separability in feature space were also obtained.
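One way to see the finite-set result numerically (my own sketch, not code from the paper): for the Gaussian kernel, distinct input points produce a strictly positive definite Gram matrix, so their images in feature space are linearly independent, and any labeling of them is linearly separable there. The check below just verifies positive definiteness on random distinct points:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 2))   # 10 distinct points in input space
gamma = 1.0

# Gaussian (RBF) Gram matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-gamma * sq_dists)

# Strict positive definiteness => feature vectors are linearly
# independent => every two-class labeling is separable in feature space.
min_eig = np.linalg.eigvalsh(K).min()
is_pd = min_eig > 0
print(is_pd)
```

This is only the easy direction for finite sets; the conditions for infinite data sets in the paper are more delicate.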

lennon310

There are two kinds of SVMs: hard-margin and soft-margin. You can read a detailed description of both in this question, but in short, only the hard-margin SVM requires the data to be completely separable. Soft-margin SVMs, on the other hand, permit some percentage of mislabeled data, but still perform very well (often even better). Given this, you don't need to check the linear separability of your data. Instead, you can just play around with the classifier parameters and run cross validation to measure accuracy.
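As a rough illustration of that workflow (a minimal sketch of my own, using plain sub-gradient descent on the hinge loss rather than a production SVM solver), one can sweep the soft-margin parameter C and score each value by cross validation:

```python
import numpy as np

def train_linear_svm(X, y, C=1.0, epochs=300, lr=0.1):
    """Soft-margin linear SVM: minimize 0.5*||w||^2 + (C/n) * sum hinge.
    Labels y must be in {-1, +1}. Plain full-batch sub-gradient descent."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                      # margin-violating points
        grad_w = w - (C / n) * (y[viol, None] * X[viol]).sum(axis=0)
        grad_b = -(C / n) * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def cv_accuracy(X, y, C, folds=5):
    """Mean held-out accuracy over simple interleaved folds."""
    idx = np.arange(len(y))
    accs = []
    for f in range(folds):
        test = (idx % folds) == f
        w, b = train_linear_svm(X[~test], y[~test], C)
        accs.append(np.mean(np.sign(X[test] @ w + b) == y[test]))
    return float(np.mean(accs))

# Two overlapping Gaussian blobs: not perfectly separable, which is
# exactly the case where the soft margin matters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (50, 2)),
               rng.normal(+1.0, 1.0, (50, 2))])
y = np.r_[-np.ones(50), np.ones(50)]

for C in (0.01, 0.1, 1.0):
    print(C, cv_accuracy(X, y, C))
```

In practice you would use a tuned library solver, but the point stands: pick C by cross-validated accuracy instead of first proving separability.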

ffriend
  • Well, I know. I should have added this fact to my initial post. I want a criterion stating when the data points (belonging to two different classes) can be linearly separated by a hard-margin SVM in feature space. Of course this is interesting from a theoretical point of view only, since it would probably overfit the model. The answer by lennon310 fits pretty well. Thanks to all of you! –  Feb 28 '14 at 16:13