
I want to use an SVM classifier for facial expression detection. I know OpenCV has an SVM API, but I have no clue what the input to train the classifier should be. I have read many papers so far; all of them say to train the classifier after facial feature detection.

So far, this is what I have done:

  1. Face detection,
  2. Calculation of 16 facial points in every frame (an image of the facial feature detection output was attached here),
  3. A vector which holds the feature points' pixel coordinates.

Note: I know how I can train the SVM with only positive and negative images (I saw this code here), but I don't know how to combine the facial feature information with it.

Can anybody please help me get started with the SVM classification?

a. What should the sample input to train the classifier be?

b. How do I train the classifier with these facial feature points?

Regards,

MMH

2 Answers


The machine learning algos in OpenCV all come with a similar interface. To train one, you pass an NxM Mat of features (N rows, each feature one row of length M) and an Nx1 Mat with the class labels, like this:

//traindata      //trainlabels

f e a t u r e    1 
f e a t u r e    -1
f e a t u r e    1
f e a t u r e    1
f e a t u r e    -1

For the prediction, you fill a Mat with 1 row in the same way, and it will return the predicted label.

So, let's say your 16 facial points are stored in a vector; you would do something like this:

Mat trainData; // start empty
Mat labels;

// one facial_point_vec (the 16 landmarks of one training image) per iteration
for all facial_point_vecs:
{
    for( size_t i=0; i<16; i++ )
    {
        trainData.push_back(point[i]); // appends one (x,y) Point as a 2-channel row
    }
    labels.push_back(label); // 1 or -1
}
// now here comes the magic:
// reshape it, so it has N rows (one per training image), each being a flat
// float x,y,x,y,x,y... array with 16*2 = 32 elements
trainData = trainData.reshape(1, labels.rows);

// we have to convert to float:
trainData.convertTo(trainData, CV_32F);

SVM svm; // params omitted for simplicity (but that's where the *real* work starts..)
svm.train( trainData, labels );


// later, predict:
vector<Point> points; // the 16 landmarks of the test face
Mat testData = Mat(points).reshape(1, 1); // flattened to 1 row of 32 elements
testData.convertTo(testData, CV_32F);
float p = svm.predict( testData );
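
If you want to reuse the trained model later, CvSVM inherits save() and load() from CvStatModel, so something along these lines should work (a small sketch against the same OpenCV 2.4 API; the filename is just an example):

// after training:
svm.save("expression_svm.xml");

// in a later run, load it back before predicting:
SVM svm2;
svm2.load("expression_svm.xml");
float p2 = svm2.predict( testData );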
berak
  • Hi berak, thanks for your answer, but I have a question: how do I provide the image and the feature points together? I mean, suppose I have 50 positive images and 20 negative images, and every image has 16 feature points; how do I insert the information about which features belong to which image? What should I push_back into the trainData in that case? And why do I multiply 16 by 2 in the 'reshape' line? – MMH Sep 29 '14 at 06:35
  • hmm, when i started typing here, it looked like you wanted to do emotion detection, like happy/sad. now you edited it a couple of times, and it seems more that you want face recognition / people identification, which is a different pair of shoes. could you clarify? – berak Sep 29 '14 at 06:40
  • oh!! I want to do emotion detection only. for now only happy and sad. – MMH Sep 29 '14 at 06:42
  • ah, ok. note that there is no connection to the images (it does not know about images, it only knows your landmark points). all it says in the end is happy or not. – berak Sep 29 '14 at 06:49
  • Ok, I understand, but it's not very clear to me. Suppose image1 has features at (xi1, yi1) and image2 has features at (xi2, yi2); so to do SVM we only insert (xi1, yi1) and (xi2, yi2)? – MMH Sep 29 '14 at 06:53
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/62094/discussion-between-mmh-and-berak). – MMH Sep 29 '14 at 07:08
  • Hi berak, I am having a problem with the trainData, can you please help me? Suppose I have 50 positive images and every image has 16 2D feature points; how do I declare the trainData then? float trainData1[16][2]; or should I do float trainData[50][16*2]; – MMH Oct 01 '14 at 01:53
  • In this line trainData.push_back(point[i]); what is point? Is that the vector of keypoints? – dephinera Dec 29 '14 at 18:42
  • @Crash-ID, i think those points came from a facial-landmark detector, like stasm, flandmark, or asm-lib (not SIFT- or SURF-like keypoints, or at least not directly; landmarking involves a 'smoothing' pass to a pre-trained model). – berak Dec 29 '14 at 18:49

Face gesture recognition is a widely researched problem, and the appropriate features you need to use can be found by a very thorough study of the existing literature. Once you have the feature descriptor you believe to be good, you go on to train the SVM with those. Once you have trained the SVM with optimal parameters (found through cross-validation), you start testing the SVM model on unseen data, and you report the accuracy. That, in general, is the pipeline.

Now the part about SVMs:

SVM is a binary classifier: it can differentiate between two classes (though it can be extended to multiple classes as well). OpenCV has an inbuilt module for SVM in the ML library. The SVM class has two functions to begin with: train(..) and predict(..). To train the classifier, you give as input a very large number of sample feature descriptors, along with their class labels (usually -1 and +1). Remember the format OpenCV supports: every training sample has to be a row vector, and each row has one corresponding class label in the labels vector. So if you have a descriptor of length n, and you have m such sample descriptors, your training matrix would be m x n (m rows, each of length n), and the labels vector would be of length m. There is also an SVMParams object that contains properties like the SVM type and values for parameters like C that you'll have to specify.
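
For example, a minimal parameter setup might look like this (a rough sketch against the OpenCV 2.4 C++ API; the C and gamma values are placeholders to be tuned, not recommendations):

CvSVMParams params;
params.svm_type    = CvSVM::C_SVC;   // classification with a penalty term C
params.kernel_type = CvSVM::RBF;     // radial basis function kernel
params.C           = 1.0;            // placeholder; tune via cross-validation
params.gamma       = 0.5;            // placeholder kernel width
params.term_crit   = cvTermCriteria(CV_TERMCRIT_ITER + CV_TERMCRIT_EPS, 1000, 1e-6);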

Once trained, you extract features from an image, convert them into a single-row Mat, give that to predict(), and it'll tell you which class it belongs to (+1 or -1).
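
In code, that step could look roughly like this (a sketch; extractFeatures() is a hypothetical helper returning a vector<float> descriptor for one test image, and mySvmObject is the trained SVM used later in this answer):

vector<float> feature = extractFeatures( testImage );  // hypothetical helper
Mat sample = Mat( feature ).reshape( 1, 1 );           // 1 row, n columns, CV_32F
float response = mySvmObject.predict( sample );        // returns -1 or +1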

There's also a train_auto() that takes arguments in a similar format and finds the optimum values of the SVM parameters for you via cross-validation.
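
A minimal call could look like this (a sketch against the OpenCV 2.4 API, using the default parameter grids and 10-fold cross-validation; trainingMat, labelsMat, and mySvmParams are the same names used in the training sketch below):

// varIdx and sampleIdx are left empty; the last argument is k_fold
mySvmObject.train_auto( trainingMat, labelsMat, Mat(), Mat(), mySvmParams, 10 );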

Also check this detailed SO answer to see an example.

EDIT: Assuming you have a Feature Descriptor that returns a vector of features, the algorithm would be something like:

Mat trainingMat, labelsMat;
for each image in training database:
  feature = extractFeatures( image );
  Mat feature_row = alignAsRow( feature );
  trainingMat.push_back( feature_row );
  labelsMat.push_back( -1 or +1 );  // depending upon class
mySvmObject.train( trainingMat, labelsMat, Mat(), Mat(), mySvmParams );

Note that extractFeatures() and alignAsRow() are not existing functions; you might need to write them yourself.
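
For instance, if the feature is just a vector of 2D landmark points, alignAsRow() could be sketched like this (the use of Point2f is an assumption; adapt it to whatever your feature extractor returns):

// flatten a vector of 2D landmarks into a single 1 x (2*N) float row
Mat alignAsRow( const vector<Point2f>& pts )
{
    Mat row = Mat( pts, true ).reshape( 1, 1 ); // copy, then view as one row: x,y,x,y,...
    return row;                                 // already CV_32F, ready for push_back()
}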

a-Jays
  • Thanks for your reply. As I mentioned in my question, I know theoretically what I need to do: after feature extraction I will have to train the SVM classifier, and after training I can use predict() to predict the facial expression. So my main question is: how do I use these feature points to train the SVM classifier? If you can give some code snippet, that will also help. – MMH Sep 29 '14 at 06:20
  • Thanks again, but do I only provide the features, not the related images? Then how will it relate which features belong to which image? – MMH Sep 29 '14 at 06:47
  • No, not images (unless the raw image itself is a feature, which is rarely so). You have to extract features from an image to train, and while testing, you again extract the image features. It's not the images that you train and test with, but the corresponding features. – a-Jays Sep 29 '14 at 06:52
  • How do I save the trained svm "mySvmObject"? I did SvmObject.save("abc.xml");, which is not working :'( – MMH Sep 29 '14 at 07:02
  • Did it give you some error message? It is `mySvmObject.save(..)`. – a-Jays Sep 29 '14 at 07:06
  • No, something's wrong in loading the images, let me check. – MMH Sep 29 '14 at 07:11