I am trying to implement Bag of Words in opencv and has come with the implementation below. I am using Caltech 101 database. However, since its my first time and not being familiar, I have planned to used two image sets from the database, the chair image set and the soccer ball image set. I have coded for the svm using this.
Everything went allright, except when I call classifier.predict(descriptor)
, I do not get the label vale as intended. I always get a0
instead of '1', irrespective of my test image. The number of images in the chair dataset is 10
and in the soccer ball dataset is 10
. I labelled chair as 0
and soccer ball as 1
. The links represent the samples of each categories, the top 10 is of chairs, the bottom 10 is of soccer balls
function hello
clear all; close all; clc;
detector = cv.FeatureDetector('SURF');
extractor = cv.DescriptorExtractor('SURF');
links = {
'http://i.imgur.com/48nMezh.jpg'
'http://i.imgur.com/RrZ1i52.jpg'
'http://i.imgur.com/ZI0N3vr.jpg'
'http://i.imgur.com/b6lY0bJ.jpg'
'http://i.imgur.com/Vs4TYPm.jpg'
'http://i.imgur.com/GtcwRWY.jpg'
'http://i.imgur.com/BGW1rqS.jpg'
'http://i.imgur.com/jI9UFn8.jpg'
'http://i.imgur.com/W1afQ2O.jpg'
'http://i.imgur.com/PyX3adM.jpg'
'http://i.imgur.com/U2g4kW5.jpg'
'http://i.imgur.com/M8ZMBJ4.jpg'
'http://i.imgur.com/CinqIWI.jpg'
'http://i.imgur.com/QtgsblB.jpg'
'http://i.imgur.com/SZX13Im.jpg'
'http://i.imgur.com/7zVErXU.jpg'
'http://i.imgur.com/uUMGw9i.jpg'
'http://i.imgur.com/qYSkqEg.jpg'
'http://i.imgur.com/sAj3pib.jpg'
'http://i.imgur.com/DMPsKfo.jpg'
};
N = numel(links);
trainer = cv.BOWKMeansTrainer(100);
train = struct('val',repmat({' '},N,1),'img',cell(N,1), 'pts',cell(N,1), 'feat',cell(N,1));
for i=1:N
train(i).val = links{i};
train(i).img = imread(links{i});
if ndims(train(i).img > 2)
train(i).img = rgb2gray(train(i).img);
end;
train(i).pts = detector.detect(train(i).img);
train(i).feat = extractor.compute(train(i).img,train(i).pts);
end;
for i=1:N
trainer.add(train(i).feat);
end;
dictionary = trainer.cluster();
extractor = cv.BOWImgDescriptorExtractor('SURF','BruteForce');
extractor.setVocabulary(dictionary);
for i=1:N
desc(i,:) = extractor.compute(train(i).img,train(i).pts);
end;
a = zeros(1,10)';
b = ones(1,10)';
labels = [a;b];
classifier = cv.SVM;
classifier.train(desc,labels);
test_im =rgb2gray(imread('D:\ball1.jpg'));
test_pts = detector.detect(test_im);
test_feat = extractor.compute(test_im,test_pts);
val = classifier.predict(test_feat);
disp('Value is: ')
disp(val)
end
These are my test samples:
Soccer Ball
(source: timeslive.co.za)
Chair
Searching through this site I think that my algorithm is okay, even though I am not quite confident about it. If anybody can help me in finding the bug, it will be appreciable.
Following Amro's code , this was my result:
Distribution of classes:
Value Count Percent
1 62 49.21%
2 64 50.79%
Number of training instances = 61
Number of testing instances = 65
Number of keypoints detected = 38845
Codebook size = 100
SVM model parameters:
svm_type: 'C_SVC'
kernel_type: 'RBF'
degree: 0
gamma: 0.5063
coef0: 0
C: 62.5000
nu: 0
p: 0
class_weights: 0
term_crit: [1x1 struct]
Confusion matrix:
ans =
29 1
1 34
Accuracy = 96.92 %