
I am having difficulty reading an image, extracting features for training, and testing on new images in OpenCV using SVMs. Can someone please point me to a good resource? I have looked at the OpenCV Introduction to Support Vector Machines, but it doesn't cover reading in images, and I am not sure how to incorporate that step.


My goal is to classify the pixels in an image as belonging to a curve or not. I understand how to form the training matrix. For instance, image A has the pixels 1,1 1,2 1,3 1,4 1,5 2,1 2,2 2,3 2,4 2,5 3,1 3,2 3,3 3,4 3,5.

I would form my training matrix as a [3][2]={ {1,1} {1,2} {1,3} {1,4} {1,5} {2,1} ..{} }

However, I am a little confused about the labels. From my understanding, I have to specify whether each row (image) in the training matrix corresponds to a curve or a non-curve. But how can I label a training matrix row (image) if some of its pixels belong to the curve and some do not? For example, my training matrix is [3][2]={ {1,1} {1,2} {1,3} {1,4} {1,5} {2,1} ..{} }, and pixels {1,1} and {1,4} belong to the curve but the rest do not.

ROMANIA_engineer
Carnez Davis
  • In OpenCV 3.x, the SVM access procedures are a bit different. For people looking for them, http://stackoverflow.com/questions/27114065/opencv-3-svm-training provides the syntax needed to follow @Walfie's answer. – Saksham Sharma Apr 14 '17 at 06:59

1 Answer


I've had to deal with this recently, and here's what I ended up doing to get SVM to work for images.

To train your SVM on a set of images, first you have to construct the training matrix for the SVM. This matrix is specified as follows: each row of the matrix corresponds to one image, and each element in that row corresponds to one feature of the class -- in this case, the color of the pixel at a certain point. Since your images are 2D, you will need to convert them to a 1D matrix. The length of each row will be the area of the images (note that the images must be the same size).

Let's say you wanted to train the SVM on 5 different images, and each image was 4x3 pixels. First you would have to initialize the training matrix. The number of rows in the matrix would be 5, and the number of columns would be the area of the image, 4*3 = 12.

int num_files = 5;
int img_area = 4*3;
Mat training_mat(num_files,img_area,CV_32FC1);

Ideally, num_files and img_area wouldn't be hardcoded, but obtained from looping through a directory and counting the number of images and taking the actual area of an image.

The next step is to "fill in" the rows of training_mat with the data from each image. Below is an example of how this mapping would work for one row.

[Figure: Convert 2D image matrix to 1D matrix]

I've numbered each element of the image matrix with where it should go in the corresponding row in the training matrix. For example, if that were the third image, this would be the third row in the training matrix.

You would have to loop through each image and set the value in the output matrix accordingly. Here's an example for multiple images:

[Figure: Training matrix with multiple images]

As for how you would do this in code, you could use reshape(), but I've had issues with that due to matrices not being continuous. In my experience I've done something like this:

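Mat img_mat = imread(imgname, 0); // 0 = load as greyscale (8-bit, one channel)
// img_mat.empty() is true if the file could not be read; check it before copying
int ii = 0; // Current column in training_mat
for (int i = 0; i < img_mat.rows; i++) {
    for (int j = 0; j < img_mat.cols; j++) {
        // Copy pixel (i,j) into row file_num, converting uchar to float
        training_mat.at<float>(file_num, ii++) = img_mat.at<uchar>(i, j);
    }
}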

Do this for every training image (remembering to increment file_num). After this, you should have your training matrix set up properly to pass into the SVM functions. The rest of the steps should be very similar to examples online.
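To make the index arithmetic behind that loop explicit, here is a minimal sketch in plain C++ with no OpenCV dependency. The `GrayImage` struct and `fillTrainingRow` function are hypothetical names made up for this illustration, and the training matrix is assumed to be a flat, row-major `std::vector<float>` rather than a `Mat`. Pixel (i, j) of an image ends up in column `i * cols + j` of that image's row.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a greyscale image: rows x cols bytes, row-major.
struct GrayImage {
    std::size_t rows;
    std::size_t cols;
    std::vector<unsigned char> pixels; // size == rows * cols
};

// Copy one image into row `fileNum` of a flat, row-major training matrix
// that has `imgArea` float columns per row.
void fillTrainingRow(std::vector<float>& trainingMat, std::size_t imgArea,
                     std::size_t fileNum, const GrayImage& img) {
    assert(img.rows * img.cols == imgArea);   // every image must have the same area
    assert(img.pixels.size() == imgArea);
    assert((fileNum + 1) * imgArea <= trainingMat.size());
    for (std::size_t i = 0; i < img.rows; ++i) {
        for (std::size_t j = 0; j < img.cols; ++j) {
            // Pixel (i, j) lands in column i * cols + j of row fileNum.
            trainingMat[fileNum * imgArea + i * img.cols + j] =
                static_cast<float>(img.pixels[i * img.cols + j]);
        }
    }
}
```

Calling `fillTrainingRow(mat, 12, 2, img)` on a 5 x 12 matrix fills only the third row and leaves the other rows untouched.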

Note that while doing this, you also have to set up labels for each training image. So for example if you were classifying eyes and non-eyes based on images, you would need to specify which row in the training matrix corresponds to an eye and a non-eye. This is specified as a 1D matrix, where each element in the 1D matrix corresponds to each row in the 2D matrix. Pick values for each class (e.g., -1 for non-eye and 1 for eye) and set them in the labels matrix.

Mat labels(num_files,1,CV_32FC1);

So if the 3rd element in this labels matrix were -1, it means the 3rd row in the training matrix is in the "non-eye" class. You can set these values in the loop where you evaluate each image. One thing you could do is to sort the training data into separate directories for each class, and loop through the images in each directory, and set the labels based on the directory.
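The directory-based labelling idea can be sketched as follows, again in plain C++ rather than with `Mat`. This assumes the training rows were filled class by class (all images from one directory, then all images from the next); `makeLabels` is a hypothetical helper name, and each pair holds a label value and the number of images found for that class.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Build one label per training row, assuming rows were filled class by class.
// Each pair is (label value, number of images in that class's directory).
std::vector<float> makeLabels(
        const std::vector<std::pair<float, std::size_t>>& classCounts) {
    std::size_t total = 0;
    for (const auto& entry : classCounts) {
        total += entry.second;
    }
    std::vector<float> labels;
    labels.reserve(total);
    for (const auto& entry : classCounts) {
        // Append `entry.second` copies of this class's label value.
        labels.insert(labels.end(), entry.second, entry.first);
    }
    assert(labels.size() == total); // one label per training row
    return labels;
}
```

For example, `makeLabels({{1.0f, 3}, {-1.0f, 2}})` yields three labels of 1 followed by two labels of -1, matching three "eye" rows followed by two "non-eye" rows.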

The next thing to do is set up your SVM parameters. These values will vary based on your project, but basically you would declare a CvSVMParams object and set the values:

CvSVMParams params;
params.svm_type = CvSVM::C_SVC;
params.kernel_type = CvSVM::POLY;
params.gamma = 3;
// ...etc

There are several examples online on how to set these parameters, like in the link you posted in the question.

Next, you create a CvSVM object and train it based on your data!

CvSVM svm;
svm.train(training_mat, labels, Mat(), Mat(), params);

Depending on how much data you have, this could take a long time. After it's done training, however, you can save the trained SVM so you don't have to retrain it every time.

svm.save("svm_filename"); // saving
svm.load("svm_filename"); // loading

To test your images using the trained SVM, simply read an image, convert it to a 1D matrix, and pass that in to svm.predict():

svm.predict(img_mat_1d);

It will return a value based on what you set as your labels (e.g., -1 or 1, based on my eye/non-eye example above). Alternatively, if you want to test more than one image at a time, you can create a matrix that has the same format as the training matrix defined earlier and pass that in as the argument. The return value will be different, though.
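Once you have predictions for a set of test images whose true classes you know, one simple way to check the classifier is to count how many predictions match. The sketch below is plain C++ and does not call OpenCV; `accuracy` is a hypothetical helper, and it assumes you have already collected the predicted labels into a vector in the same order as the known labels.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fraction of predictions that equal the known labels, between 0.0 and 1.0.
double accuracy(const std::vector<float>& predicted,
                const std::vector<float>& expected) {
    assert(predicted.size() == expected.size());
    assert(!predicted.empty());
    std::size_t correct = 0;
    for (std::size_t k = 0; k < predicted.size(); ++k) {
        // Labels are exact values such as -1 and 1, so == is safe here.
        if (predicted[k] == expected[k]) {
            ++correct;
        }
    }
    return static_cast<double>(correct) / static_cast<double>(predicted.size());
}
```

Keep the test images separate from the training images; measuring accuracy on the training set itself tends to overstate how well the SVM generalizes.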

Good luck!

Walfie
  • @Walfie Your example uses `int img_area = 4*3;`, but what should we do if every picture has a different area? –  Jul 26 '13 at 16:42
  • @Wish_2_fly Each row in the matrix has to be the same size, so in cases where your images are of different sizes, it's up to your application to decide how to handle it. Personally I think I'd choose some fixed size and upsample/downsample an image if it doesn't fit that size. This can be done either on the input images themselves or when you're reading them into your program to populate the matrix. – Walfie Jul 26 '13 at 18:33
  • Thanks @Walfie. I am trying to train my SVM on vehicles, so can I use the same approach you describe above? –  Jul 27 '13 at 08:43
  • @Walfie +1. With the help of your answer I wrote my code and linked it with this question, but I get a runtime error. When I step through it line by line, the value of `int ii=0` stays at zero. Why? –  Aug 02 '13 at 13:29
  • @Flying My example wasn't really meant to be followed exactly, but my first thought would be that the image might not be loaded correctly (causing the row/column loops to not run). It's hard to say without knowing the specifics of your program, though. – Walfie Aug 02 '13 at 14:10
  • @Walfie Can we use the same one-line image resize approach with SURF? – Rocket Aug 21 '13 at 11:06
  • @Angel I haven't used SURF before, but looking at the documentation briefly, it seems like it uses a different input format. – Walfie Aug 21 '13 at 14:13
  • Perhaps not exactly your case, but you might use SVM on some of your image properties (contour roundness, eccentricity, etc.) and find the criteria that matter in your case. That way it would be better adapted to unexpected input images. Anyway, nice answer. Already upvoted. – Cynichniy Bandera Nov 24 '15 at 20:54
  • @Walfie Thanks for this answer. I am confused by this line: `training_mat.at<float>(file_num,ii++) = img_mat.at<uchar>(i,j);` It gave me an error. Do you have any hint for solving it? – Ahmet Dec 14 '15 at 22:42
  • @Walfie, Thanks for this answer. Your explanation is so good that I almost did my food recognition project using this example as reference. Though my project is working, I have one observation that I want to get clarified. I used reshape to get 1D matrix from image. I could see that the training matrix build up with colors of food that I'm using to train. Then I had to use convertTo function on this training matrix to change its type to CV_32FC1. When I do this conversion, all the colors in the training matrix disappear and it becomes blank(no color). Is this how it is supposed to work? Thanks – jai.maruthi May 10 '16 at 17:39
  • @Walfie Hey, I have 2 different types of images in two different directories. How am I supposed to label them according to the directory? Something like `Mat labels(num_files,1,CV_32FC1);` with num_files being the images in folder one and `Mat labels(num_test,-1,CV_32FC1);` being the images in folder 2? – Lyber Feb 09 '17 at 03:45
  • This is a great answer that help me a lot to get some results. I wish you have given some explanations and hints on setting the parameters. – Koray Feb 10 '17 at 08:38
  • While this is a very good explanation, I have to mention that usually you don't do classification on entire images but on features of those images. For example: extract HOG's from the images and use that to train the classifier. – LandonZeKepitelOfGreytBritn Aug 20 '17 at 14:33
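On the point raised in the comments about images of different sizes: the suggestion there is to pick a fixed size and upsample or downsample each image to it before filling the training matrix. OpenCV has `cv::resize` for this. The sketch below is only an illustration of the idea in plain C++, using nearest-neighbour sampling on a row-major greyscale buffer; `resizeNearest` is a hypothetical name, not an OpenCV function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Nearest-neighbour resize of a row-major greyscale buffer to dstRows x dstCols.
std::vector<unsigned char> resizeNearest(const std::vector<unsigned char>& src,
                                         std::size_t srcRows, std::size_t srcCols,
                                         std::size_t dstRows, std::size_t dstCols) {
    assert(src.size() == srcRows * srcCols);
    assert(srcRows > 0 && srcCols > 0 && dstRows > 0 && dstCols > 0);
    std::vector<unsigned char> dst(dstRows * dstCols);
    for (std::size_t i = 0; i < dstRows; ++i) {
        const std::size_t si = i * srcRows / dstRows;      // source row for output row i
        for (std::size_t j = 0; j < dstCols; ++j) {
            const std::size_t sj = j * srcCols / dstCols;  // source column for output column j
            dst[i * dstCols + j] = src[si * srcCols + sj];
        }
    }
    return dst;
}
```

After resizing, every image has the same area, so each one fits into a row of the training matrix. Nearest-neighbour sampling is the crudest option; an interpolating resize will usually give smoother inputs.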