
I'm trying to optimise some of our computer vision algorithms and decided to benchmark cv::connectedComponents against cv::findContours (and cv::drawContours) to achieve similar results.

Essentially, all we need to do is find blobs in a binary image, then pick the biggest one - a fairly standard operation.
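To be concrete about the operation, here is a minimal sketch in plain C++ (no OpenCV) of "label the blobs, then keep the biggest". The helper name `largestBlobArea` is made up for illustration, and it uses 4-connectivity and a nested `std::vector` grid purely for brevity; it is not meant to mirror how either OpenCV function is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not an OpenCV function): returns the pixel count of the
// largest 4-connected blob of non-zero cells in a rectangular binary grid.
std::size_t largestBlobArea(const std::vector<std::vector<int>>& grid) {
    if (grid.empty()) return 0;
    const std::size_t rows = grid.size();
    const std::size_t cols = grid[0].size();
    std::vector<std::vector<bool>> seen(rows, std::vector<bool>(cols, false));
    std::size_t best = 0;

    for (std::size_t r = 0; r < rows; ++r) {
        assert(grid[r].size() == cols);  // the grid must be rectangular
        for (std::size_t c = 0; c < cols; ++c) {
            if (grid[r][c] == 0 || seen[r][c]) continue;

            // Flood-fill one blob with an explicit stack and count its pixels.
            std::size_t area = 0;
            std::vector<std::pair<std::size_t, std::size_t>> stack;
            stack.push_back({r, c});
            seen[r][c] = true;

            // Queue a neighbour if it is foreground and not yet visited.
            auto visit = [&](std::size_t ny, std::size_t nx) {
                if (grid[ny][nx] != 0 && !seen[ny][nx]) {
                    seen[ny][nx] = true;
                    stack.push_back({ny, nx});
                }
            };

            while (!stack.empty()) {
                const auto cur = stack.back();
                stack.pop_back();
                ++area;
                const std::size_t y = cur.first;
                const std::size_t x = cur.second;
                if (y > 0) visit(y - 1, x);
                if (y + 1 < rows) visit(y + 1, x);
                if (x > 0) visit(y, x - 1);
                if (x + 1 < cols) visit(y, x + 1);
            }
            if (area > best) best = area;
        }
    }
    return best;
}
```

Calling `largestBlobArea` on a small 0/1 grid returns the number of pixels in its biggest blob, which is the quantity both OpenCV approaches are being used to find here.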

I'm a bit out of touch with the efficiencies in OpenCV, having only used it for algorithm prototyping in Python in the last couple of years, so I decided to run a benchmark of the two methods above.

I'm a bit confused by my results, as this comment seems to suggest that findContours should be much slower, which is the opposite of what I observe (results lower down in the post). My results show that running findContours on a binary image and then drawing each contour with a different index is marginally faster than running a full connectedComponents analysis.

They also showed that calculating just the areas of those contours, rather than the full set of stats from connectedComponentsWithStats was considerably faster.
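One caveat when comparing the two area figures: as far as I understand from the OpenCV documentation, cv::contourArea applies Green's formula to the contour vertices, whereas the area reported by connectedComponentsWithStats is a pixel count, so the two need not agree and need not cost the same. The sketch below is a plain C++ illustration of the polygon formula only; `Pt` and `polygonArea` are hypothetical stand-ins, not OpenCV types or functions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { int x; int y; };  // hypothetical stand-in for cv::Point

// Shoelace (Green's formula) area of a closed polygon given by its vertices.
// The cost grows with the number of vertices, not with the pixels enclosed.
double polygonArea(const std::vector<Pt>& poly) {
    double twiceArea = 0.0;
    for (std::size_t i = 0; i < poly.size(); ++i) {
        const Pt& a = poly[i];
        const Pt& b = poly[(i + 1) % poly.size()];  // wrap around to close the polygon
        twiceArea += static_cast<double>(a.x) * b.y - static_cast<double>(b.x) * a.y;
    }
    return std::fabs(twiceArea) / 2.0;
}
```

For example, a filled block of 4 x 3 pixels whose corner pixels sit at (0,0), (3,0), (3,2) and (0,2) contains 12 pixels, but the polygon through those four corners has an area of 6.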

Am I misunderstanding what's going on here? I would expect the two approaches to perform similarly.


Timing Results:

Starting simple benchmark (100000 iterations) ...
2668ms to run 100000 iterations of findContours
3358ms to run 100000 iterations of connectedComponents
Starting area calculation benchmark (100000 iterations) ...
2691ms to run 100000 iterations of findContours
11285ms to run 100000 iterations of connectedComponentsWithStats
AVERAGE TIMES (ms): 
findContours:           0.0267
connectedComps:         0.0336
findContours (areas):   0.0269
connectedComps (areas): 0.113

Benchmarking code below:

#include "opencv2/imgcodecs.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <chrono>
#include <iomanip>
#include <iostream>
#include <vector>

typedef std::chrono::high_resolution_clock Clock;

cv::Mat src;
cv::Mat src_hsv;
cv::Mat hueChannel;

int threshLow = 230;
int threshHigh = 255;

long numRuns = 100000;

long benchmarkContours(long numRuns, cv::Mat &mask, bool calculateAreas = false) {

    auto start = Clock::now();

    std::vector<std::vector<cv::Point>> contours;
    std::vector<cv::Vec4i> hierarchy;
    std::vector<double> areas;

    for (long run = 0; run < numRuns; ++run) {
        cv::Mat markers = cv::Mat::zeros(mask.size(), CV_8UC1);
        cv::findContours(mask.clone(), contours, hierarchy, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
        if (calculateAreas) {
            // Start from an empty vector; push_back below adds one area per contour.
            areas.clear();
            areas.reserve(contours.size());
        }

        for (unsigned int i = 0; i < contours.size(); i++) {
            if (calculateAreas) {
                areas.push_back(cv::contourArea(contours[i]));
            }
            // Label contours from 1 so the first one differs from the zero background.
            cv::drawContours(markers, contours, i, cv::Scalar::all(i + 1), -1);
        }
    }

    auto end = Clock::now();

    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}

long benchmarkConnComp(long numRuns, cv::Mat &mask, bool calculateAreas = false) {

    auto start = Clock::now();

    cv::Mat labeledImage;
    cv::Mat stats;
    cv::Mat centroids;
    for (long run = 0; run < numRuns; ++run) {
        if (calculateAreas) {
            cv::connectedComponentsWithStats(mask, labeledImage, stats, centroids);
        } else {
            cv::connectedComponents(mask, labeledImage);
        }
    }

    auto end = Clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();

}

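int main(int argc, char **argv) {
    // Check the argument count before touching argv[1].
    if (argc < 2) {
        std::cerr << "No image supplied ..." << std::endl;
        return -1;
    }
    src = cv::imread(argv[1]);
    if (src.empty()) {
        std::cerr << "Could not read image ..." << std::endl;
        return -1;
    }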

    cv::cvtColor(src, src_hsv, cv::COLOR_BGR2HSV_FULL);

    std::vector<cv::Mat> hsvChannels = std::vector<cv::Mat>(3);
    // Split the converted HSV image (not the BGR source) so that channel 0 is hue.
    cv::split(src_hsv, hsvChannels);

    hueChannel = hsvChannels[0];

    cv::Mat mask;
    cv::inRange(hueChannel, cv::Scalar(threshLow), cv::Scalar(threshHigh), mask);

    std::cout << "Starting simple benchmark (" << numRuns << " iterations) ..." << std::endl;
    long findContoursTime = benchmarkContours(numRuns, mask);
    std::cout << findContoursTime << "ms to run " << numRuns << " iterations of findContours" << std::endl;

    long connCompTime = benchmarkConnComp(numRuns, mask);
    std::cout << connCompTime << "ms to run " << numRuns << " iterations of connectedComponents" << std::endl;
    std::cout << "Starting area calculation benchmark (" << numRuns << " iterations) ..." << std::endl;

    long findContoursTimeWithAreas = benchmarkContours(numRuns, mask, true);
    std::cout << findContoursTimeWithAreas << "ms to run " << numRuns << " iterations of findContours" << std::endl;

    long connCompTimeWithAreas = benchmarkConnComp(numRuns, mask, true);
    std::cout << connCompTimeWithAreas << "ms to run " << numRuns << " iterations of connectedComponentsWithStats" << std::endl;

    std::cout << "AVERAGE TIMES (ms): " << std::endl;
    std::cout << "findContours:           " << std::setprecision(3) << (1.0f * findContoursTime) / numRuns << std::endl;
    std::cout << "connectedComps:         " << std::setprecision(3) << (1.0f * connCompTime) / numRuns <<  std::endl;
    std::cout << "findContours (areas):   " << std::setprecision(3) << (1.0f * findContoursTimeWithAreas) / numRuns << std::endl;
    std::cout << "connectedComps (areas): " << std::setprecision(3) << (1.0f * connCompTimeWithAreas) / numRuns <<  std::endl;
}
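A note on the measurement itself: benchmarkContours above also times mask.clone(), the cv::Mat::zeros allocation and every drawContours call, and both functions truncate the total to whole milliseconds. Below is a minimal sketch of a generic timing helper that keeps fractional milliseconds and does a few untimed warm-up calls first. The name `meanMillisPerCall` and the default warm-up count are assumptions for illustration, not part of the code that produced the numbers above.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: runs `fn` a few times untimed, then returns the mean
// wall-clock time per call in milliseconds over `runs` timed calls.
template <typename Fn>
double meanMillisPerCall(Fn&& fn, std::size_t runs, std::size_t warmup = 10) {
    using Clock = std::chrono::steady_clock;  // monotonic, suited to measuring intervals

    for (std::size_t i = 0; i < warmup; ++i) fn();  // warm caches and allocators

    const auto start = Clock::now();
    for (std::size_t i = 0; i < runs; ++i) fn();
    const auto end = Clock::now();

    // Keep the duration as a double so sub-millisecond totals are not truncated.
    const std::chrono::duration<double, std::milli> elapsed = end - start;
    return runs == 0 ? 0.0 : elapsed.count() / static_cast<double>(runs);
}
```

It could be used by wrapping each call under test in a lambda, for instance one lambda that only calls cv::connectedComponents and another that only calls cv::findContours, so that the per-iteration setup can be timed separately from the functions being compared.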
n00dle

1 Answer


I haven't really looked into those two functions in OpenCV, but I would assume that connectedComponents() depends more on the image size, as it probably does some kind of multi-threaded rasterisation of the image (line-by-line processing). findContours(), on the other hand, probably walks each contour in some way, so its performance will depend on the size and complexity of the blobs themselves rather than on the image size.
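To put rough numbers on that intuition, here is a small plain C++ sketch that compares how many pixels one full raster pass touches with how many pixels actually lie on a blob boundary. `WorkCounts` and `countWork` are hypothetical names, and this only counts pixels; it is not how either OpenCV function is implemented, and as far as I know a real contour finder still has to scan the image to locate where each border starts.

```cpp
#include <cstddef>
#include <vector>

struct WorkCounts {
    std::size_t rasterVisits;    // pixels touched by one full scan of the image
    std::size_t boundaryPixels;  // foreground pixels with a background 4-neighbour
};

// Hypothetical helper: counts both quantities for a rectangular 0/1 image.
WorkCounts countWork(const std::vector<std::vector<int>>& img) {
    WorkCounts w{0, 0};
    const std::size_t rows = img.size();
    const std::size_t cols = rows ? img[0].size() : 0;

    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            ++w.rasterVisits;
            if (img[r][c] == 0) continue;

            // Edge checks come first so the neighbour lookups stay in bounds.
            const bool onBorder =
                r == 0 || c == 0 || r + 1 == rows || c + 1 == cols ||
                img[r - 1][c] == 0 || img[r + 1][c] == 0 ||
                img[r][c - 1] == 0 || img[r][c + 1] == 0;
            if (onBorder) ++w.boundaryPixels;
        }
    }
    return w;
}
```

For an 8 x 8 image containing a single 4 x 4 square, the raster pass touches 64 pixels while only 12 of them are boundary pixels, and that gap widens as the image grows around the same blob.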

Nicholas