11

I know that LIBSVM only allows one-vs-one classification when it comes to multi-class SVM. However, I would like to tweak it a bit to perform one-against-all classification. I have tried to perform one-against-all below. Is this the correct approach?

The code:

TrainLabel;TrainVec;TestVec;TestLabel;
u=unique(TrainLabel);
N=length(u);
if(N>2)
    itr=1;
    classes=0;
    while((classes~=1)&&(itr<=length(u)))
        c1=(TrainLabel==u(itr));
        newClass=c1;
        model = svmtrain(TrainLabel, TrainVec, '-c 1 -g 0.00154'); 
        [predict_label, accuracy, dec_values] = svmpredict(TestLabel, TestVec, model);
        itr=itr+1;
    end
itr=itr-1;
end

I might have done some mistakes. I would like to hear some feedback. Thanks.

Second Part: As grapeot said, I need to do sum-pooling (or voting, as a simplified solution) to come up with the final answer. I am not sure how to do it; I looked at the Python file but am still not very sure. I need some help with it.
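For what it's worth, here is a rough, untested sketch of what the one-against-all loop plus a pooling step might look like, assuming the variables from the code above and keeping each binary model in a cell array (the `-c`/`-g` values are just the ones from the question, not tuned):

```matlab
u = unique(TrainLabel);
N = length(u);

%# train one binary (one-against-all) model per class
models = cell(N,1);
for itr = 1:N
    newClass = double(TrainLabel == u(itr));   %# 1 = current class, 0 = rest
    models{itr} = svmtrain(newClass, TrainVec, '-c 1 -g 0.00154');
end

%# collect decision values from every binary model
dec = zeros(size(TestVec,1), N);
for itr = 1:N
    [~,~,d] = svmpredict(double(TestLabel == u(itr)), TestVec, models{itr});
    %# LIBSVM treats the first label it saw during training as the positive
    %# class; flip the sign if that label happened to be 0
    dec(:,itr) = d * (2*models{itr}.Label(1) - 1);
end

%# pooling: pick the class whose binary model is most confident
[~,idx] = max(dec, [], 2);
predicted = u(idx);
acc = mean(predicted == TestLabel)
</imports>
```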

Amro
lakshmen
  • What's the question exactly? Are you asking how to perform one-vs-all classification with LibSVM? Does the program output the result you expected? BTW, the LibSVM parameters should be `'-c 1 -g 0.00154'` (you lacked the ending single quote). – grapeot Jan 21 '12 at 13:30
  • 1
    @lakesh: I posted an answer to a similar question, you might find useful: http://stackoverflow.com/a/9049808/97160 – Amro Jan 31 '12 at 20:06

3 Answers

10
%# Fisher Iris dataset
load fisheriris
[~,~,labels] = unique(species);   %# labels: 1/2/3
data = zscore(meas);              %# scale features
numInst = size(data,1);
numLabels = max(labels);

%# split training/testing
idx = randperm(numInst);
numTrain = 100; numTest = numInst - numTrain;
trainData = data(idx(1:numTrain),:);  testData = data(idx(numTrain+1:end),:);
trainLabel = labels(idx(1:numTrain)); testLabel = labels(idx(numTrain+1:end));
%# train one-against-all models
model = cell(numLabels,1);
for k=1:numLabels
    model{k} = svmtrain(double(trainLabel==k), trainData, '-c 1 -g 0.2 -b 1');
end

%# get probability estimates of test instances using each model
prob = zeros(numTest,numLabels);
for k=1:numLabels
    [~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
    prob(:,k) = p(:,model{k}.Label==1);    %# probability of class==k
end

%# predict the class with the highest probability
[~,pred] = max(prob,[],2);
acc = sum(pred == testLabel) ./ numel(testLabel)    %# accuracy
C = confusionmat(testLabel, pred)                   %# confusion matrix
lakshmen
4

From the code I can see you are trying to first turn the labels into "some class" vs "not this class", and then invoke LibSVM to do training and testing. Some questions and suggestions:

  1. Why are you using the original TrainLabel for training? In my opinion, it should be model = svmtrain(newClass, TrainVec, '-c 1 -g 0.00154');
  2. With the modified training mechanism, you also need to tweak the prediction part, for instance by using sum-pooling to determine the final label. Using the -b switch in LibSVM to enable probability output will also improve the accuracy.
grapeot
  • thanks a lot... btw, do you know how to do one-vs-one using LIBSVM? I am not sure how to do it... – lakshmen Jan 21 '12 at 14:47
  • 1
    Simply putting labels other than 0<=>1 or -1<=>1 as input is fine. LibSVM will recognize it and try to do multi-class classification. – grapeot Jan 21 '12 at 15:20
  • btw, it is giving me this error when I change it to newClass: "Error: label vector and instance matrix must be double; model file should be a struct array" – lakshmen Jan 21 '12 at 15:42
  • when I change newClass=c1; to newClass=double(c1);, it gives me 0% classification accuracy – lakshmen Jan 21 '12 at 15:45
  • Maybe you can trace in to check the value of c1? Does it contain both 1s and 0s? – grapeot Jan 21 '12 at 16:24
  • 1
    An official Python implementation of one-against-all based on LibSVM can be found on the website: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/multilabel/ – grapeot Jan 21 '12 at 16:27
  • If you wish to calculate the classification accuracy directly from LibSVM, make sure the ground truth fed to SVMPredict is correct, i.e. they should be like `(TestLabel == itr)` rather than `TestLabel` themselves. Or you can write your own implementation to calculate the precision/recall. – grapeot Jan 21 '12 at 17:13
  • Once again, thanks... sorry for troubling you. The predicted labels are 0s and 1s. Shouldn't they be the numbers I am using, i.e. 1 to 6? – lakshmen Jan 21 '12 at 17:29
  • Yes, that's by design. Note you are solving the problem by a series of binary classifiers. Therefore the output of the SVMs is binary, but you need to do a sum-pooling (or voting as a simplified solution) to come up with the final answer. You may consult the python file mentioned before. :) – grapeot Jan 21 '12 at 17:52
  • @grapeot please can you help me here? thank you a lot https://stackoverflow.com/questions/65449934/multi-class-svm-one-vs-one-always-giving-the-same-label – Christina Dec 25 '20 at 17:04
1

Instead of probability estimates, you can also use the decision values as follows

[~,~,d] = svmpredict(double(testLabel==k), testData, model{k});
prob(:,k) = d * (2 * model{k}.Label(1) - 1);   %# flip sign if class k was labelled 0

to achieve the same purpose.
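For context, an untested sketch of how those two lines would slot into the prediction loop from the accepted answer (same variables as that answer, but the models are trained without the `-b 1` switch):

```matlab
%# same pooling loop, but with decision values instead of probabilities
prob = zeros(numTest, numLabels);
for k = 1:numLabels
    [~,~,d] = svmpredict(double(testLabel==k), testData, model{k});
    %# model{k}.Label(1) is the class LIBSVM treated as positive;
    %# multiply by +1/-1 so larger values always mean "more likely class k"
    prob(:,k) = d * (2 * model{k}.Label(1) - 1);
end
[~,pred] = max(prob, [], 2);   %# predict the class with the largest decision value
```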

Mihai Iorga
Venkata