Questions tagged [svm]

From Wikipedia:

Support vector machines (SVMs) are a set of related supervised learning methods that analyze data and recognize patterns, used for classification and regression analysis. The standard SVM takes a set of input data and predicts, for each given input, which of two possible classes the input is a member of, which makes the SVM a non-probabilistic binary linear classifier. Since an SVM is a classifier, then given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples into one category or the other. Intuitively, an SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.
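The description above can be made concrete with a minimal sketch of a soft-margin linear SVM trained by batch subgradient descent on the regularized hinge loss. This is only an illustration under stated assumptions: the names `train_linear_svm`, `predict`, `lam`, and `lr`, and the four data points, are made up for this example, and libraries such as LibSVM or scikit-learn use different, more sophisticated solvers.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=300):
    """Batch subgradient descent on lam/2 * |w|^2 + mean hinge loss."""
    n = len(points)
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(points, labels):
            # Only examples inside the margin (or misclassified) contribute.
            if y * (dot(w, x) + b) < 1:
                grad_w = [g - y * xj / n for g, xj in zip(grad_w, x)]
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if dot(w, x) + b >= 0 else -1


# Toy, linearly separable data (made up for illustration).
points = [(2.0, 1.0), (1.0, 3.0), (-2.0, -1.0), (-1.0, -3.0)]
labels = [1, 1, -1, -1]

w, b = train_linear_svm(points, labels)
predictions = [predict(w, b, x) for x in points]
```

Here `w` and `b` define the separating hyperplane, and `predict` assigns a new example to a category according to the side of that hyperplane it falls on. Examples whose margin `y * (w·x + b)` is at least 1 stop contributing to the update, which is what pushes the solution toward a wide gap between the classes.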

4561 questions
395
votes
6 answers

What are advantages of Artificial Neural Networks over Support Vector Machines?

ANN (Artificial Neural Networks) and SVM (Support Vector Machines) are two popular strategies for supervised machine learning and classification. It's often not clear which method is better for a particular project, and I'm certain the answer is…
Channel72
  • 24,139
  • 32
  • 108
  • 180
83
votes
5 answers

What is the relation between the number of support vectors, the training data, and classifier performance?

I am using LibSVM to classify some documents. The documents seem to be a bit difficult to classify, as the final results show. However, I have noticed something while training my models: if my training set is, for example, 1000, around 800…
Hossein
  • 40,161
  • 57
  • 141
  • 175
81
votes
6 answers

Scikit Learn SVC decision_function and predict

I'm trying to understand the relationship between decision_function and predict, which are instance methods of SVC (http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html). So far I've gathered that decision function returns pairwise…
Peter Tseng
  • 1,294
  • 1
  • 12
  • 15
70
votes
2 answers

SVM - hard or soft margins?

Given a linearly separable dataset, is it necessarily better to use a hard-margin SVM over a soft-margin SVM?
D.G
  • 753
  • 2
  • 7
  • 6
64
votes
1 answer

Using OpenCV and SVM with images

I am having difficulty with reading an image, extracting features for training, and testing on new images in OpenCV using SVMs. Can someone please point me to a great link? I have looked at the OpenCV Introduction to Support Vector Machines. But it…
Carnez Davis
  • 863
  • 2
  • 8
  • 12
58
votes
5 answers

Making SVM run faster in Python

Using the code below for SVM in Python: from sklearn import datasets from sklearn.multiclass import OneVsRestClassifier from sklearn.svm import SVC iris = datasets.load_iris() X, y = iris.data, iris.target clf =…
Abhishek Bhatia
  • 9,404
  • 26
  • 87
  • 142
58
votes
6 answers

Does the SVM in sklearn support incremental (online) learning?

I am currently in the process of designing a recommender system for text articles (a binary case of 'interesting' or 'not interesting'). One of my specifications is that it should continuously update to changing trends. From what I can tell, the…
Michael Aquilina
  • 5,352
  • 4
  • 33
  • 38
55
votes
8 answers

How to do multi-class classification using Support Vector Machines (SVM)

Every book and example shows only binary classification (two classes), where a new vector can belong to either class. My problem is that I have 4 classes (c1, c2, c3, c4), and I have training data for all 4 classes. For a new vector the output should…
mlguy
  • 741
  • 2
  • 6
  • 6
53
votes
5 answers

Feature Selection and Reduction for Text Classification

I am currently working on a project, a simple sentiment analyzer, with 2 and 3 classes in separate cases. I am using a corpus that is pretty rich in terms of unique words (around 200,000). I used the bag-of-words method for feature…
clancularius
  • 877
  • 1
  • 9
  • 12
52
votes
5 answers

Converting LinearSVC's decision function to probabilities (scikit-learn, Python)

I use the linear SVM from scikit-learn (LinearSVC) for a binary classification problem. I understand that LinearSVC can give me the predicted labels and the decision scores, but I wanted probability estimates (confidence in the label). I want to continue…
chet
  • 607
  • 1
  • 6
  • 11
51
votes
2 answers

Why is scikit-learn SVM.SVC() extremely slow?

I tried to use an SVM classifier to train on a dataset with about 100k samples, but I found it to be extremely slow, and even after two hours there was no response. When the dataset has around 1k samples, I can get the result immediately. I also tried…
C. Gary
  • 525
  • 1
  • 4
  • 4
48
votes
6 answers

Pointers to a good SVM tutorial

I have been trying to grasp the basics of Support Vector Machines, and have downloaded and read many online articles, but I am still not able to grasp them. I would like to know if there are any nice tutorials or sample code which can be used for…
Alphaneo
  • 12,079
  • 22
  • 71
  • 89
47
votes
4 answers

Determining the most contributing features for SVM classifier in sklearn

I have a dataset and I want to train my model on that data. After training, I need to know which features are the major contributors to the classification for an SVM classifier. There is something called feature importance for forest algorithms, is…
Jibin Mathew
  • 4,816
  • 4
  • 40
  • 68
46
votes
6 answers

How to split data into a balanced training set and test set in sklearn

I am using sklearn for a multi-class classification task. I need to split alldata into train_set and test_set. I want to randomly take the same number of samples from each class. Currently, I am using this function X_train, X_test, y_train, y_test =…
Jeanne
  • 1,241
  • 3
  • 19
  • 28
46
votes
2 answers

How does sklearn.svm.SVC's function predict_proba() work internally?

I am using sklearn.svm.SVC from scikit-learn to do binary classification. I am using its predict_proba() function to get probability estimates. Can anyone tell me how predict_proba() internally calculates the probability?
user2115183
  • 851
  • 2
  • 9
  • 13