I am trying to use the OpenCV EM algorithm to do color extraction. I am using the following code, based on the example in the OpenCV documentation:

cv::Mat capturedFrame ( height, width, CV_8UC3 );
int i, j;
int nsamples = 1000;
cv::Mat samples ( nsamples, 2, CV_32FC1 );
cv::Mat labels;
cv::Mat img = cv::Mat::zeros ( height, height, CV_8UC3 );
img = capturedFrame;
cv::Mat sample ( 1, 2, CV_32FC1 );
CvEM em_model;
CvEMParams params;
samples = samples.reshape ( 2, 0 );

    for ( i = 0; i < N; i++ )
    {           
        //from the training samples
        cv::Mat samples_part = samples.rowRange ( i*nsamples/N, (i+1)*nsamples/N);

        //place the cluster centers on an N1 x N1 grid (column index i%N1, row index i/N1)
        cv::Scalar mean (((i%N1)+1)*img.rows/(N1+1),((i/N1)+1)*img.rows/(N1+1));
        cv::Scalar sigma (30,30);
        cv::randn(samples_part,mean,sigma);                     

    }       

    samples = samples.reshape ( 1, 0 );

    //initialize model parameters
    params.covs         = NULL;
    params.means        = NULL;
    params.weights      = NULL;
    params.probs        = NULL;
    params.nclusters    = N;
    params.cov_mat_type = CvEM::COV_MAT_SPHERICAL;
    params.start_step   = CvEM::START_AUTO_STEP;
    params.term_crit.max_iter = 300;
    params.term_crit.epsilon  = 0.1;
    params.term_crit.type   = CV_TERMCRIT_ITER|CV_TERMCRIT_EPS;     
    //cluster the data
    em_model.train ( samples, cv::Mat(), params, &labels );     

    cv::Mat probs;
    probs = em_model.getProbs();

    cv::Mat weights;
    weights = em_model.getWeights();

//one label per pixel, so a single-channel matrix matches the indexing below
cv::Mat modelIndex = cv::Mat::zeros ( img.rows, img.cols, CV_8UC1 );

for ( i = 0; i < img.rows; i++ )
{
    for ( j = 0; j < img.cols; j++ )
    {
        sample.at<float>(0) = (float)j;
        sample.at<float>(1) = (float)i;

        int response = cvRound ( em_model.predict ( sample ) );
        modelIndex.data [ modelIndex.cols*i + j ] = response;
    }
}

My questions are:

Firstly, I want to extract each model (five in total here) and store the corresponding pixel values in five different matrices, so that I end up with five different colors separately. So far I have only obtained their indexes; is there any way to get their corresponding colors? To keep it simple, I can start by finding the dominant color based on these five GMMs.

Secondly, my sample data has "100" points, and processing them takes nearly 3 seconds. I need to do all of this in no more than 30 milliseconds. OpenCV's background extraction, which also uses a GMM, runs in under 20 ms, so there must be a way to process all 600x800 = 480,000 pixels within 30 ms. I found that the predict function is the most time-consuming part.

E_learner
  • Is this question still active, or was it solved [there](http://stackoverflow.com/questions/12909343/opencv-how-to-categorize-gmm-calculated-probs/12909985#12909985)? Regards – remi Oct 19 '12 at 12:59
  • @remi: this question was an old one, but after I asked another question that you answered, I updated this one with color extraction and calculation speed. Could you help me? Thank you. – E_learner Oct 19 '12 at 17:32
  • I don't really understand this question. Extracting colors does not make sense to me. Are you trying to compute the dominant colors, or to quantize the colors? Your code doesn't help me much. As for the speed issue, using `params.cov_mat_type = COV_MAT_DIAGONAL` is enough for most cases and will speed up your process – remi Oct 19 '12 at 20:51
  • @remi I am trying to extract each color of a scene, starting from the dominant one. Please help me on this topic. Thank you. – E_learner Oct 20 '12 at 05:21
  • @remi I tried "params.cov_mat_type = COV_MAT_DIAGONAL" but it didn't make any big difference. – E_learner Oct 21 '12 at 08:14

1 Answer

First Question:

To do color extraction, you first need to train the EM with your input pixels. After that, you simply loop over all the input pixels again and use predict() to classify each of them. I've attached a small example that uses EM for foreground/background separation based on colors. It shows how to extract the dominant color (mean) of each Gaussian and how to access the original pixel color.

#include <opencv2/opencv.hpp>

int main(int argc, char** argv) {

    cv::Mat source = cv::imread("test.jpg");

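    //output images
    cv::Mat meanImg(source.rows, source.cols, CV_32FC3);
    //zero-initialize so that unassigned pixels are black instead of uninitialized memory
    cv::Mat fgImg = cv::Mat::zeros(source.rows, source.cols, CV_8UC3);
    cv::Mat bgImg = cv::Mat::zeros(source.rows, source.cols, CV_8UC3);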

    //convert the input image to float
    cv::Mat floatSource;
    source.convertTo(floatSource, CV_32F);

    //now convert the float image to column vector
    cv::Mat samples(source.rows * source.cols, 3, CV_32FC1);
    int idx = 0;
    for (int y = 0; y < source.rows; y++) {
        cv::Vec3f* row = floatSource.ptr<cv::Vec3f > (y);
        for (int x = 0; x < source.cols; x++) {
            samples.at<cv::Vec3f > (idx++, 0) = row[x];
        }
    }

    //we need just 2 clusters
    cv::EMParams params(2);
    cv::ExpectationMaximization em(samples, cv::Mat(), params);

    //the two dominating colors
    cv::Mat means = em.getMeans();
    //the weights of the two dominant colors
    cv::Mat weights = em.getWeights();

    //we define the foreground as the dominant color with the largest weight
    //(the weights are read as double, the same element type used for the means below)
    const int fgId = weights.at<double>(0) > weights.at<double>(1) ? 0 : 1;

    //now classify each of the source pixels
    idx = 0;
    for (int y = 0; y < source.rows; y++) {
        for (int x = 0; x < source.cols; x++) {

            //classify
            const int result = cvRound(em.predict(samples.row(idx++), NULL));
            //get the according mean (dominant color)
            const double* ps = means.ptr<double>(result, 0);

            //set the according mean value to the mean image
            float* pd = meanImg.ptr<float>(y, x);
            //float images need to be in [0..1] range
            pd[0] = ps[0] / 255.0;
            pd[1] = ps[1] / 255.0;
            pd[2] = ps[2] / 255.0;

            //copy the original pixel to either the foreground or the background image
            if (result == fgId) {
                fgImg.at<cv::Vec3b>(y, x) = source.at<cv::Vec3b>(y, x);
            } else {
                bgImg.at<cv::Vec3b>(y, x) = source.at<cv::Vec3b>(y, x);
            }
        }
    }

    cv::imshow("Means", meanImg);
    cv::imshow("Foreground", fgImg);
    cv::imshow("Background", bgImg);
    cv::waitKey(0);

    return 0;
}

I've tested the code with the following image and it performs quite well.

[image: test input]
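For more than two clusters (five in the question), the same idea generalizes: keep the label that predict() returns for every pixel, then group the original pixels by label. Below is a minimal sketch in plain C++ without OpenCV types. `Pixel`, `splitByCluster` and `dominantCluster` are hypothetical names introduced only for illustration, and the labels and weights are assumed to have been copied out of the trained model beforehand.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// One BGR pixel (hypothetical helper type, not an OpenCV type).
using Pixel = std::array<std::uint8_t, 3>;

// Group pixels by cluster label: result[k] holds every pixel assigned to cluster k.
// labels[i] is assumed to be the rounded return value of predict() for pixels[i].
std::vector<std::vector<Pixel>> splitByCluster(const std::vector<Pixel>& pixels,
                                               const std::vector<int>& labels,
                                               int nclusters) {
    std::vector<std::vector<Pixel>> buckets(nclusters > 0 ? nclusters : 0);
    for (std::size_t i = 0; i < pixels.size() && i < labels.size(); ++i) {
        const int k = labels[i];
        // Ignore labels outside the valid range instead of writing out of bounds.
        if (k >= 0 && k < nclusters) {
            buckets[k].push_back(pixels[i]);
        }
    }
    return buckets;
}

// Index of the cluster with the largest mixture weight, i.e. the "dominant" color.
// Returns 0 for an empty weight list.
int dominantCluster(const std::vector<double>& weights) {
    return static_cast<int>(
        std::max_element(weights.begin(), weights.end()) - weights.begin());
}
```

Each bucket can then be written into its own output image, and the mean of the dominant cluster gives the dominant color.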

Second Question:

I've noticed that the maximum number of clusters has a huge impact on performance, so it is better to set it to a very conservative value than to leave it empty or set it to the number of samples, as in your example. Furthermore, the documentation mentions an iterative procedure that repeatedly optimizes the model with less-constrained parameters, which may give you some speed-up. To read more, please have a look at the comments inside the sample code provided for train() in the OpenCV documentation.
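If predict() itself remains the bottleneck, one option is to copy the trained means, variances and weights out of the model once and do the per-pixel classification by hand. The sketch below is not OpenCV's implementation. It assumes diagonal covariances and three channels, and `DiagGaussian`, `logConstants` and `classify` are hypothetical names. It picks the component with the highest weighted log-density and precomputes every term that does not depend on the pixel.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// One mixture component with a diagonal covariance matrix (hypothetical struct).
struct DiagGaussian {
    std::array<double, 3> mean;  // per-channel mean
    std::array<double, 3> var;   // per-channel variance, must be > 0
    double weight;               // mixture weight, must be > 0
};

// Per-component constant log(w) - 0.5 * sum(log(var)). The shared
// -1.5 * log(2*pi) term is dropped because it does not change the argmax.
std::vector<double> logConstants(const std::vector<DiagGaussian>& gmm) {
    std::vector<double> c;
    c.reserve(gmm.size());
    for (const DiagGaussian& g : gmm) {
        double logDet = 0.0;
        for (double v : g.var) logDet += std::log(v);
        c.push_back(std::log(g.weight) - 0.5 * logDet);
    }
    return c;
}

// Index of the most likely component for sample x, or -1 if the model is empty.
int classify(const std::array<double, 3>& x,
             const std::vector<DiagGaussian>& gmm,
             const std::vector<double>& logConst) {
    int best = -1;
    double bestScore = -std::numeric_limits<double>::infinity();
    for (std::size_t k = 0; k < gmm.size() && k < logConst.size(); ++k) {
        double dist = 0.0;  // squared Mahalanobis distance for a diagonal covariance
        for (std::size_t ch = 0; ch < 3; ++ch) {
            const double diff = x[ch] - gmm[k].mean[ch];
            dist += diff * diff / gmm[k].var[ch];
        }
        const double score = logConst[k] - 0.5 * dist;
        if (score > bestScore) {
            bestScore = score;
            best = static_cast<int>(k);
        }
    }
    return best;
}
```

Call `logConstants` once after training and reuse its result for every pixel. Whether this is fast enough for a 600x800 frame depends on the number of clusters and needs to be measured.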

AD-530
  • I tried your code, and it works quite fine, except its calculation speed. Anyway, I will try to handle that. Thank you so much for your answer. – E_learner Oct 23 '12 at 06:59
  • Well, am I right in thinking that you want to apply the algorithm in real time to some image stream? If yes, maybe you do not need to train the EM on every frame: train it on the first image and then just predict on the consecutive frames. If you do need to train on each image, start with the values from the previous training run and COV_MAT_DIAGONAL (please refer to the code doc inside the example given in the OpenCV documentation for the train method) – AD-530 Oct 23 '12 at 11:42
  • It is not the training part that is time-consuming, it is the "predict" part. I am handling video frames, and prediction for one 600x800 frame takes about 3 seconds! Do you have any other idea for speeding it up? – E_learner Oct 23 '12 at 13:02
  • Well, if you are trying to track an object, you can make use of spatial coherency and apply predict() just to a small patch around the last known position of the object. For example, the patch could be a circle that is positioned on the last known centroid of the object and has a radius a bit larger than the last known radius. Of course you can use more sophisticated approaches like Kalman filtering, but for a quick start this should be enough. – AD-530 Oct 23 '12 at 14:23
  • Thank you so much for your suggestion. Meanwhile, I am wondering how OpenCV's MoG is able to subtract backgrounds using a GMM so fast. I looked into the code and it seems to use the k-means algorithm, but I am not sure and still couldn't fully understand it. For the same video frame, it only takes about 30 ms. – E_learner Oct 23 '12 at 14:26
  • The approach is described in this paper: http://personal.ee.surrey.ac.uk/Personal/R.Bowden/publications/avbs01/avbs01.pdf – AD-530 Oct 23 '12 at 14:35
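
The circular-patch idea from the comments above can be sketched as follows. This is a minimal illustration in plain C++, not tied to OpenCV. `classifyInCircle` is a hypothetical helper, and `classifyPixel` stands in for whatever per-pixel classifier is used, for example a wrapper around predict(). Pixels outside the circle keep the label -1 and are never classified.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Classify only the pixels inside a circle centered at column cx, row cy.
// Returns a row-major label buffer of size rows * cols; untouched pixels are -1.
std::vector<int> classifyInCircle(int rows, int cols, int cx, int cy, int radius,
                                  const std::function<int(int, int)>& classifyPixel) {
    std::vector<int> labels(static_cast<std::size_t>(rows) * cols, -1);

    // Clamp the bounding box of the circle to the image borders.
    const int y0 = std::max(0, cy - radius);
    const int y1 = std::min(rows - 1, cy + radius);
    const int x0 = std::max(0, cx - radius);
    const int x1 = std::min(cols - 1, cx + radius);
    const long long r2 = static_cast<long long>(radius) * radius;

    for (int y = y0; y <= y1; ++y) {
        for (int x = x0; x <= x1; ++x) {
            const long long dx = x - cx;
            const long long dy = y - cy;
            if (dx * dx + dy * dy <= r2) {
                // Only these pixels pay the cost of the classifier call.
                labels[static_cast<std::size_t>(y) * cols + x] = classifyPixel(y, x);
            }
        }
    }
    return labels;
}
```

The work per frame then scales with the area of the patch rather than with the full 600x800 image.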