9

I have an image feature extraction problem. The input images are binary (black and white) and may contain blobs of approximately known area and aspect ratio. These need to be fit with ellipses using some best fit algorithm.

Example input:

Desired output:

There may be multiple blobs (zero or more), the number is not known in advance. The approximate area and aspect ratio of all the blobs is known (and is the same). How many are in the image, their position, orientation and actual size are what I'm trying to find. The output should be a best fit ellipse for each blob based on the actual found size and aspect ratio.

What makes this hard is noise and possible overlaps.

Example with noise:

Example with overlap and noise:

The noisy image may have holes in the blobs and also small other blobs scattered around. The small other blobs are not counted because they are too small and do not cover any area densely enough to be considered a real match.

The image with overlap should be counted as two blobs because the area is too big for a single blob to cover it well.

A possible metric that evaluates a potential fit is:

sum over all ellipses of (K1 * percent deviation from expected size + K2 * percent deviation from expected aspect ratio + K3 * percent of ellipse which is not black + K4 * percent overlapped with any other ellipse) + K5 * percent of rest of image which is black

for some suitably chosen parameters K1..K5. A perfect match scores 0.

I can see how to solve this using brute force, for example trying enough different possible fits to sample the search space well. I can't think off-hand of a method much faster than brute force.

I would prefer examples in python and/or opencv. I will try to implement and post any suggested solutions in python. Thanks!

P.S. It cannot be assumed that a blob is connected. There may be enough noise to break it up into discontinuous parts.

P.P.S. The little bits of noise cannot be removed by binary erosion. In some of my images, there are enough interior holes that erosion makes the whole (real) blob disappear if the image is eroded enough to make the noise bits disappear as well.

P.P.P.S. I think that it would be very hard to solve this using any approach based on contours. The data I see in practice has too much edge noise, there can be (and often are) bits of noise that connect separate blobs, or that separate a single blob into several (apparent) connected components. I would like an approach based on areas, since area coverage seems to be much less nosy than the edge shapes.

P.P.P.P.S. As requested, here is an example with a through cut due to noise:

and a sample with lots and lots of noise but nevertheless a distinct blob:

EDIT None of the answers actually solves the problem, although Bharat has suggested a partial solution which does well for non-overlapping blobs. More please :) I will award additional bounty to any actual solutions.

Alex I
  • 19,689
  • 9
  • 86
  • 158
  • 1
    if you have the approximate size and position of each blob (from the previous tracking step or an initial detection), you might want to look at `active contours`. They start with a contour (given size and position) and try to fit to present contours (in the image) without deforming too much. These were used in literature in medical CT scans, where the size/position/orientation of an organ might vary slightly between different scans. OpenCV had a `snake` implementation, but not sure whether it's still int he last version, since it was not so very good. – Micka Jan 20 '14 at 11:21
  • @AlexI have you tried scaling up/down the one with noise example, I think scaling a binary image up will almost remove all of the noise -not sure though- – concept3d Jan 27 '14 at 12:04
  • 2
    Have you considering some heuristic such as maximizing the area contained within each ellipse? If you know before-hand the approximate size/shape of each blob, I imagine you can just shift the ellipse over the image and minimize based on ellipse_area-filled_in_pixels. – Chrismit Jan 27 '14 at 21:23
  • @Chrismit: Yes, I describe a measure of fit just like that in the question. It is not especially fast to optimize :) – Alex I Jan 29 '14 at 11:13

5 Answers5

3

I would try to fit a gaussian mixture model and then use the mean and the covariance matrix to fit ellipses over the data. Such a model would work even with overlap and small noisy blobs. The data which you have would be the the coordinates of the pixels which are black and you could fit a GMM over the data. One issue with this approach is that you need to know the number of blobs you need to track in advance, if you can come up with a heuristic for that, GMM should solve this problem pretty efficiently.

Bharat
  • 2,139
  • 2
  • 16
  • 35
  • Thank you, I tried this out using `sklearn.mixture.GMM`. It does okay on non-noisy inputs, a bit worse with noise, and gets confused by overlapping blobs, because it tries to find two non-overlapping gaussians that the image might be a sum of, usually one gaussian covers most of the two overlapping blobs, and a second much smaller one covers some of the remainder. What I want is not sum of gaussians, but maybe "max of gaussians" :) There is also no way to hint the expected blob size. – Alex I Jan 20 '14 at 07:40
  • 1
    if you are doing tracking over time, then there are some better ways to do this, for a single image you might even look at mean shift / kernel density estimation as you know something about the size of the object – Bharat Jan 20 '14 at 17:55
  • Bharat: I am doing tracking over time, but I think if each individual recognition is 100% accurate the over time part will be really easy :) Could you explain "mean shift/kernel density estimation"? Yes, size of the object is known to +/-20%, as long as it is not partially out of view or occluded. – Alex I Jan 20 '14 at 22:03
  • That said, could you suggest better ways to do this that rely on using a sequence of images over time? – Alex I Jan 20 '14 at 23:58
  • 1
    http://comaniciu.net/Papers/MsTracking.pdf this is the paper. Here is a tutorial on it http://www.cse.psu.edu/~rcollins/CSE598G/introMeanShift.pdf this might take some time to read though – Bharat Jan 21 '14 at 05:02
  • Bounty for suggesting GMM, that is a new take on this problem. – Alex I Jan 30 '14 at 00:38
  • actually, a variant of mean-shift clustering which takes uncertainty (a/b/theta parameters) into account would work better than GMM, because mean shift takes into account the prior about the size while GMM does not, you could first try fitting circles (it is easier to implement), and then look into how to fit ellipses later. – Bharat Jan 30 '14 at 00:58
2

You fill the holes [1], you detect contours [2], and use moments on each contour rectangle to find orientation, eccentricity, etc[3].

PS.: The disconnected contours (noise) can be filtered out by size.

Community
  • 1
  • 1
LovaBill
  • 5,107
  • 1
  • 24
  • 32
  • Interesing, but some holes may go all the way across, so filtering the resulting contours by size would erroneously discard two halves of a real blob. – Alex I Jan 22 '14 at 22:12
  • May you post an example ("all the way across" case)? Will [dilation](http://docs.opencv.org/doc/tutorials/imgproc/erosion_dilatation/erosion_dilatation.html) fix that? – LovaBill Jan 23 '14 at 08:33
  • William: I posted a couple of examples, please see above. I think in general, it is best to think of the image not as binary - it is too noisy for that - but rather as a discrete sample drawn from a continuous probability density function. Eg. just take a gaussian (radius 60-100 or so) and fit blobs to the darker areas. Thresholded gaussians look okay, but I think that loses some information. – Alex I Jan 27 '14 at 08:52
  • 1
    Your last example is very challenging and I don't know if there's a solid answer for that. Probably, I'd start with metaheuristics (like particle swarm optimization) to minimize the computatiotal burden of a greedy search. – LovaBill Jan 27 '14 at 12:41
1

You might start by filtering out contours by area.

About separating overlapping blobs, that might be a tricky one (I would risk to say, impossible for arbitrary overlaps) to do with this binary image, maybe you should do it with the original image or at least some steps back of pre-processing.

OpenCV's fittEllipse will also be helpful.

Rui Marques
  • 8,567
  • 3
  • 60
  • 91
  • To address the overlap issue, if there is an area in the image that is larger than the max allowed blob, I want this to detect it as two blobs (with reasonable orientations/sizes). So there is no ambiguity, anything that is about the size of one blob is one blob, anything larger is two (or more) blobs, and anything smaller is not-a-blob. Hence the last example, that is definitely two blobs. I think contours would have a hard time dealing with both overlapped blobs and holes in blobs that are large enough that the blob appears to be separated in two parts. – Alex I Jan 22 '14 at 22:08
  • If the objects get separated into two parts maybe your pre-processing has to be adjusted or you have to think a whole different procedure IMHO, because it will be hard to tell between the noise and a separated part and to which object that part belongs. – Rui Marques Jan 22 '14 at 22:28
  • Rui, yes, maybe that is true. I think it is more helpful to think in terms of area densities than binary connected components; the area densities are spot on, for example if the binary image is processed with some fairly aggressive lowpass filter. – Alex I Jan 22 '14 at 22:30
0

this is not some basic programming problem, this involves advanced image processing techniques. from what I know, "Image Processing, morphology" are the points you are targeting for. you can take some course in "Image Morphology" understand those basic constructs such as "Dilation, Erosion" etc... then you have the fundamentals solving this issue at hand.

zinking
  • 5,561
  • 5
  • 49
  • 81
0

Since you have the size and orientation, you can draw each ellipse, and use template matching.

see tutorial here.

Eran W
  • 1,696
  • 15
  • 20
  • Eran, thank you, unfortunately I do not have orientation. I do have approximate size only. – Alex I Jan 29 '14 at 10:33
  • so what is 'aspect ratio'? – Eran W Jan 29 '14 at 11:08
  • It is the ratio between major and minor axis of the ellipse, or if you prefer, how elongated it is. I have no idea *which way* it is elongated, or which direction is the long direction. – Alex I Jan 29 '14 at 12:55