
I'm working on an OpenCV program to calculate the center of a red object in view.

  • In the image matrix I am working on, I've already filtered the image so that any pixel that is even slightly red is set to 255.

  • I am finding the locations of all elements that are 255 by using np.where().

  • I am using np.mean() to average the row indices ([0]) and column indices ([1]) of the result returned by np.where() to calculate the "center" coordinate.

Here is the snippet of my code which calculates the "centroid" of all red objects laid out in view.

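red_only_array = np.array(red_only)
# np.where returns (row_indices, col_indices) of the matching pixels
locations = np.where(red_only_array == 255)
x_avg = np.mean(locations[1])  # mean column index
y_avg = np.mean(locations[0])  # mean row index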

I repeat this process on 10 different object matrices labeled by cv2.connectedComponents(). This time I am doing it to get the centroid of each individual red object.

The following is the code I'm using:

_,labels = cv2.connectedComponents(red_only_array, connectivity = 8)
b = np.matrix(labels)

Obj1= b==1
Obj1 = np.uint8(Obj1)
Obj1[Obj1>0] =255
c1_max = np.where(Obj1 ==255)
centroid1 = np.array([np.mean(c1_max[1]),np.mean(c1_max[0])])

Obj2= b==2
Obj2 = np.uint8(Obj2)
Obj2[Obj2>0] =255
c2_max = np.where(Obj2 ==255)
centroid2 = np.array([np.mean(c2_max[1]),np.mean(c2_max[0])])

The above code repeats for each label up to b == 10.
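
The repeated Obj1, Obj2, ... blocks could probably be folded into a single loop. The following is only a minimal sketch: it assumes `labels` is the label image returned by cv2.connectedComponents() and that label 0 is the background. `centroids_by_label` is a hypothetical helper name, and the small hand-made array stands in for a real label image.

```python
import numpy as np


def centroids_by_label(labels, n_labels):
    """Return an (n_labels - 1, 2) array of (x, y) centroids for labels 1..n_labels-1.

    Label 0 is assumed to be the background and is skipped.
    """
    centroids = np.full((n_labels - 1, 2), np.nan)
    for k in range(1, n_labels):
        # Compare against the label image directly; no intermediate 0/255 mask.
        rows, cols = np.where(labels == k)
        if rows.size:  # leave NaN for a label that has no pixels
            centroids[k - 1] = (cols.mean(), rows.mean())
    return centroids


# Hand-made stand-in for a label image: two objects (1 and 2) on background 0.
labels = np.array([
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 2],
])
centroids = centroids_by_label(labels, n_labels=3)
print(centroids)
```

With real data, `n_labels` would be the first value returned by cv2.connectedComponents(), which the snippet above discards as `_`.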

Right now I get about 160ms delay on my Raspberry Pi 4 with 8GB RAM. My co-worker thinks np.where() is the bottleneck in my code. Is there any way to further optimize this? My target loop time is 50ms.

Thanks

  • do you only have to recognize one red object, or the biggest red object, or all the red objects? Do you know the approximate size of the red object in terms of number of pixels? – Crawl Cycle Nov 04 '20 at 20:38
  • Crawl Cycle, all the red objects. I guess I should've included a few other parts of my code. Please check my post again in a few minutes. – Hansol Moon Nov 04 '20 at 20:43
  • the code that you posted works when there is only 1 red object. – Crawl Cycle Nov 04 '20 at 20:50
  • The first part treats all red pixels on the screen as a whole; the second snippet I posted labels connected red pixels and splits them up into different matrices – Hansol Moon Nov 04 '20 at 20:53
  • Did you profile the code and make sure that the line containing `np.where` is the slowest? – Crawl Cycle Nov 04 '20 at 21:20
  • Did your coworker even profile the code? You have unnecessary array allocations: you can just set c1_max = np.where(labels == 1), and creating the b matrix is also not required. – unlut Nov 04 '20 at 21:26
  • Also connectedcomponents is not a very fast function, measure how long it takes first. – unlut Nov 04 '20 at 21:32
  • unlut, what would be the alternative to connectedcomponents if that ended up being the bottleneck? As you suggested, I will profile the code first. – Hansol Moon Nov 04 '20 at 21:37
  • Please tell us what is acceptable performance and acceptable failure rate/mode. How big are the images? How many images do you have to process every second? Why does your existing code fail to meet the required performance? Is it too slow or too error-prone or use too much memory? Tell us also the details about the red objects: are they uniformly red? do you treat overlapping red objects as the same red object? Generally, how many red objects in each image? – Crawl Cycle Nov 04 '20 at 21:52
  • Crawl Cycle, Sorry for not going into too much detail from the initial post. The purpose of this code is to calculate the centroid and diameter of a circle formed by multiple red-dots in camera view. Red dots are mostly uniform red circles. The diameter information is used to control an actuator. On a PC, I get about 60ms with my current code which is acceptable. But since this will be used for an embedded system, I need to run it on something smaller. Raspberry Pi 4 has loop time of about 160ms with my code which would cause instability in the system. Thank you for your help and patience – Hansol Moon Nov 04 '20 at 22:00
  • If your bottleneck does turn out to be np.where, see https://stackoverflow.com/questions/33281957/faster-alternative-to-numpy-where. If it's not, please upload a sample image and code we can run. – unlut Nov 04 '20 at 22:04
  • Crawl Cycle, unlut, my target speed is 50ms. But I think I found where the bottleneck is while trying to post the code with the image. It's most likely the video stream from cap = cv2.VideoCapture(1) and cap.read(). I was modifying the code to post here, took a screencap of the image, and ran the code without the video stream. The loop time is now 5ms. I think the camera might either be sending too much data or have slow communication with the computer – Hansol Moon Nov 04 '20 at 22:40
  • Don't let cap.read() block the main thread/process if cap.read() is horribly slow. – Crawl Cycle Nov 04 '20 at 22:44
  • Crawl Cycle, one thing I am not sure about is whether it is cap.read() that is slow or whether the webcam has a poor data transfer rate. If the latter, are there any machine vision webcams that you would recommend? – Hansol Moon Nov 04 '20 at 22:48
  • connectedComponentsWithStats() returns the centroids. See https://docs.opencv.org/4.1.1/d3/dc0/group__imgproc__shape.html#ga107a78bf7cd25dec05fb4dfc5c9e765f – fmw42 Nov 04 '20 at 23:04
  • `cap.read()`: https://stackoverflow.com/questions/58293187/opencv-real-time-streaming-video-capture-is-slow-how-to-drop-frames-or-get-sync – Crawl Cycle Nov 04 '20 at 23:32
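
One way to keep a slow cap.read() from blocking the processing loop, as suggested in the comments, is to read frames on a background thread and hand only the newest one to the main loop. The following is a minimal sketch under stated assumptions, not a tested camera driver: `LatestFrameReader` is a hypothetical class name, and `capture` is assumed to be any object whose read() returns an (ok, frame) pair, as cv2.VideoCapture does.

```python
import threading


class LatestFrameReader:
    """Read frames on a background thread and keep only the newest one.

    `capture` is any object with a read() method returning (ok, frame).
    """

    def __init__(self, capture):
        self._capture = capture
        self._lock = threading.Lock()
        self._frame = None
        self._has_frame = threading.Event()
        self._stopped = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()
        return self

    def _run(self):
        # The blocking read happens here, off the main loop.
        while not self._stopped.is_set():
            ok, frame = self._capture.read()
            if not ok:
                break
            with self._lock:
                self._frame = frame  # older frames are simply overwritten
            self._has_frame.set()

    def latest(self, timeout=1.0):
        """Return the newest frame, or None if no frame arrived within timeout.

        Only the very first call can block; later calls return at once,
        possibly with the same frame as before.
        """
        if not self._has_frame.wait(timeout):
            return None
        with self._lock:
            return self._frame

    def stop(self):
        self._stopped.set()
        self._thread.join(timeout=1.0)
```

Usage would look like `reader = LatestFrameReader(cv2.VideoCapture(1)).start()`, then `frame = reader.latest()` inside the loop and `reader.stop()` at exit. Whether this helps depends on whether the delay comes from buffered frames or from the camera's transfer rate, which the thread above leaves open.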

1 Answer


Profiling 1:

When there are 100 connected components and the image size is 2000x2000, finding the centroid is the slowest step. The whole program takes 28 seconds to run on a laptop.


from skimage import measure
from skimage import filters
import numpy as np
import cProfile


def make_blobs(size=256, n_blobs=12):
    np.random.seed(1)
    im = np.zeros((size, size))
    points = size * np.random.random((2, n_blobs ** 2))
    # np.int was removed in NumPy 1.24; the built-in int is equivalent here
    im[(points[0]).astype(int), (points[1]).astype(int)] = 1
    im = filters.gaussian(im, sigma=size / (4. * n_blobs))
    blobs = im > 0.7 * im.mean()
    return blobs


def faster_centroid(img):
    s = 1 / np.mean(img)
    shape = img.shape
    x_coords = np.arange(shape[0])
    y_coords = np.arange(shape[1])
    x_mean = np.mean(img * x_coords[:, np.newaxis]) * s
    y_mean = np.mean(img * y_coords[np.newaxis, :]) * s
    return x_mean, y_mean


def label_blobs(blobs):
    all_labels = measure.label(blobs)
    blobs_labels = measure.label(blobs, background=0)
    return all_labels, blobs_labels


def find_all_centroids(all_labels):
    max_ix = np.max(all_labels)
    centroid_list = []
    for i in range(max_ix + 1):
        centroid = faster_centroid(all_labels == i)
        centroid_list.append(centroid)
    return centroid_list


def main():
    blobs = make_blobs(2000, n_blobs=100)
    # Label connected regions of an integer array.
    all_labels, blobs_labels = label_blobs(blobs)
    print(all_labels)
    all_centroids = find_all_centroids(all_labels)
    print(all_centroids)


cProfile.run("main()", "results.cprofile")
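
The dump written by cProfile can be inspected with the standard-library pstats module. A small sketch follows, assuming a stand-in workload: `busy_work` is a hypothetical placeholder for main(), and the file goes to a temporary directory rather than the working directory. It uses Profile.runcall() and dump_stats(), which write the same file format as the cProfile.run() call above.

```python
import cProfile
import os
import pstats
import tempfile


def busy_work(n=20000):
    # Hypothetical stand-in for main(); any callable can be profiled this way.
    return sum(i * i for i in range(n))


# Profile the call and dump the statistics to a file.
stats_path = os.path.join(tempfile.mkdtemp(), "results.cprofile")
profiler = cProfile.Profile()
profiler.runcall(busy_work)
profiler.dump_stats(stats_path)

# Load the dump and list the five most expensive calls by cumulative time.
stats = pstats.Stats(stats_path)
stats.strip_dirs().sort_stats("cumulative").print_stats(5)
```

For the answer's own file, `pstats.Stats("results.cprofile")` followed by the same sort_stats() and print_stats() calls should work.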

Profiling 2:

[['<function get_centroids1 at 0x7f0027ba6280>', 1.3774937389971456], 
['<function get_centroids2 at 0x7f0027ba6310>', 2.308947408993845], 
['<function get_centroids3 at 0x7f0027ba63a0>', 0.695534451995627]]

4.262 main  red3.py:61
├─ 2.245 get_centroids2  red3.py:36
│  ├─ 1.258 [self]  
│  └─ 0.954 mean  <__array_function__ internals>:2
│        [5 frames hidden]  <__array_function__ internals>, numpy...
│           0.954 ufunc.reduce  <built-in>:0
├─ 1.334 get_centroids1  red3.py:25
│  ├─ 1.031 where  <__array_function__ internals>:2
│  │     [3 frames hidden]  <__array_function__ internals>, <buil...
│  │        1.031 implement_array_function  <built-in>:0
│  ├─ 0.188 [self]  
│  └─ 0.080 mean  <__array_function__ internals>:2
│        [5 frames hidden]  <__array_function__ internals>, numpy...
└─ 0.683 get_centroids3  red3.py:51
   ├─ 0.333 <dictcomp>  red3.py:57
   ├─ 0.233 nonzero  <__array_function__ internals>:2
   │     [5 frames hidden]  <__array_function__ internals>, numpy...
   └─ 0.048 [self]  

Code used for the timings above:

from skimage import measure
from skimage import filters
import numpy as np
#import cProfile
from pyinstrument import Profiler
import timeit


def make_blobs(size=256, n_blobs=12):
    np.random.seed(1)
    im = np.zeros((size, size))
    points = size * np.random.random((2, n_blobs ** 2))
    # np.int was removed in NumPy 1.24; the built-in int is equivalent here
    im[(points[0]).astype(int), (points[1]).astype(int)] = 1
    im = filters.gaussian(im, sigma=size / (4. * n_blobs))
    blobs = im > 0.7 * im.mean()
    return blobs


def label_blobs(blobs):
    all_labels = measure.label(blobs)
    blobs_labels = measure.label(blobs, background=0)
    return all_labels, blobs_labels


def get_centroids1(all_labels):
    n_blobs = np.max(all_labels) + 1
    centroid_list = []
    for i in range(n_blobs):
        locations = np.where(all_labels == i)
        x_avg = np.mean(locations[1])
        y_avg = np.mean(locations[0])
        centroid_list.append([x_avg, y_avg])
    return centroid_list


def get_centroids2(all_labels):
    n_blobs = np.max(all_labels) + 1
    centroid_list = []
    for i in range(n_blobs):
        img = (all_labels == i)
        s = 1 / np.mean(img)
        shape = img.shape
        x_coords = np.arange(shape[0])
        y_coords = np.arange(shape[1])
        x_mean = np.mean(img * x_coords[:, np.newaxis]) * s
        y_mean = np.mean(img * y_coords[np.newaxis, :]) * s
        centroid_list.append([x_mean, y_mean])
    return centroid_list


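def get_centroids3(x):
    # https://stackoverflow.com/questions/32748950/
    # Groups the (row, col) coordinates of the nonzero pixels by label;
    # the per-label mean is not computed here.
    n_blobs = np.max(x) + 1
    nz = np.nonzero(x)
    coords = np.column_stack(nz)
    nzvals = x[nz[0], nz[1]]
    # Labels run from 1 to n_blobs - 1 (0 is the background).
    res = {k: coords[nzvals == k] for k in range(1, n_blobs)}
    return res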


def main():
    f_list = [get_centroids1, get_centroids2, get_centroids3]
    blobs = make_blobs(2000, n_blobs=5)
    # Label connected regions of an integer array.
    all_labels, blobs_labels = label_blobs(blobs)

    profiler = Profiler()
    profiler.start()
    timings = []
    for f in f_list:
        s = timeit.default_timer()
        for i in range(10):
            r = f(all_labels)
        e = timeit.default_timer()
        print(r)
        timings.append([str(f), e - s])

    print(timings)
    profiler.stop()
    print(profiler.output_text(unicode=True, color=True))


main()
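
A further variant, which was not part of the timings above and is offered only as a sketch, computes every centroid in one pass over the label image with np.bincount instead of looping over labels. It assumes the labels are non-negative integers, as produced by measure.label or cv2.connectedComponents; `centroids_bincount` is a hypothetical name.

```python
import numpy as np


def centroids_bincount(labels):
    """Return an (n_labels, 2) array of (x, y) centroids indexed by label value.

    Row 0 belongs to the background label; a label with no pixels gives NaN.
    """
    flat = labels.ravel()
    rows, cols = np.indices(labels.shape)
    counts = np.bincount(flat)                        # pixels per label
    sum_x = np.bincount(flat, weights=cols.ravel())   # sum of column indices
    sum_y = np.bincount(flat, weights=rows.ravel())   # sum of row indices
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.column_stack((sum_x / counts, sum_y / counts))


# Hand-made label image: two objects (1 and 2) on background 0.
labels = np.array([
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 2],
])
result = centroids_bincount(labels)
print(result[1:])  # skip the background row
```

np.indices allocates two index arrays the size of the image, so for a fixed frame size they could be computed once and reused across frames.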
Crawl Cycle
  • Thank you for your detailed answer. I was looking at the wrong portion of the code to solve my problem. But I believe this will prove very useful as well – Hansol Moon Nov 04 '20 at 22:43