I have an image feature extraction problem. The input images are binary (black and white) and may contain blobs of approximately known area and aspect ratio. These need to be fit with ellipses using some best fit algorithm.
Example input:
Desired output:
There may be multiple blobs (zero or more); the number is not known in advance. The approximate area and aspect ratio are known and are the same for all blobs. What I'm trying to find is how many blobs are in the image, and the position, orientation and actual size of each. The output should be a best-fit ellipse for each blob, based on the size and aspect ratio actually found.
What makes this hard is noise and possible overlaps.
Example with noise:
Example with overlap and noise:
The noisy image may have holes in the blobs, and also other small blobs scattered around. These small blobs are not counted because they are too small and do not cover any area densely enough to be considered a real match.
The image with overlap should be counted as two blobs because the area is too big for a single blob to cover it well.
A possible metric that evaluates a potential fit is:
sum over all ellipses of (K1 * percent deviation from expected size + K2 * percent deviation from expected aspect ratio + K3 * percent of ellipse which is not black + K4 * percent overlapped with any other ellipse) + K5 * percent of rest of image which is black
for some suitably chosen parameters K1..K5. A perfect match scores 0.
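To make the metric concrete, here is a rough sketch of how I would evaluate it with NumPy. The names `ellipse_mask` and `fit_score`, the `(cx, cy, a, b, theta)` parameterisation (semi-axes `a` and `b`, angle in radians) and the convention that `True` means a black pixel are my own assumptions for illustration. "Percent" is expressed as a fraction between 0 and 1.

```python
import numpy as np

def ellipse_mask(shape, cx, cy, a, b, theta):
    """Boolean mask of the pixels inside an ellipse with semi-axes a and b,
    centred at (cx, cy) and rotated by theta radians."""
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    dx, dy = xx - cx, yy - cy
    c, s = np.cos(theta), np.sin(theta)
    u = dx * c + dy * s       # coordinate along the first axis
    v = -dx * s + dy * c      # coordinate along the second axis
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0

def fit_score(img, ellipses, exp_area, exp_aspect, K=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Score a candidate fit; 0 is a perfect match.
    img: 2-D boolean array, True = black (blob) pixel (assumed convention).
    ellipses: list of (cx, cy, a, b, theta) tuples."""
    K1, K2, K3, K4, K5 = K
    masks = [ellipse_mask(img.shape, *e) for e in ellipses]
    covered = np.zeros(img.shape, dtype=bool)
    total = 0.0
    for i, (e, m) in enumerate(zip(ellipses, masks)):
        _, _, a, b, _ = e
        n = max(int(m.sum()), 1)  # pixels inside this ellipse
        size_dev = abs(np.pi * a * b - exp_area) / exp_area
        aspect_dev = abs(a / b - exp_aspect) / exp_aspect
        not_black = (m & ~img).sum() / n
        # Union of all the other ellipses, to measure overlap
        others = np.zeros(img.shape, dtype=bool)
        for j, m2 in enumerate(masks):
            if j != i:
                others |= m2
        overlap = (m & others).sum() / n
        total += K1 * size_dev + K2 * aspect_dev + K3 * not_black + K4 * overlap
        covered |= m
    # Fraction of the image outside every ellipse that is still black
    rest = ~covered
    rest_black = (img & rest).sum() / max(int(rest.sum()), 1)
    return total + K5 * rest_black

# A clean synthetic blob scored against the ellipse that generated it
img = ellipse_mask((64, 64), 32, 32, 20, 10, 0.3)
score = fit_score(img, [(32, 32, 20, 10, 0.3)], np.pi * 20 * 10, 2.0)
```

This only evaluates a given candidate fit. It says nothing about how to search for good candidates, which is the part I am stuck on.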
I can see how to solve this using brute force, for example trying enough different possible fits to sample the search space well. I can't think off-hand of a method much faster than brute force.
I would prefer examples in Python and/or OpenCV. I will try to implement and post any suggested solutions in Python. Thanks!
P.S. It cannot be assumed that a blob is connected. There may be enough noise to break it up into discontinuous parts.
P.P.S. The little bits of noise cannot be removed by binary erosion. In some of my images, there are enough interior holes that erosion makes the whole (real) blob disappear if the image is eroded enough to make the noise bits disappear as well.
P.P.P.S. I think that it would be very hard to solve this using any approach based on contours. The data I see in practice has too much edge noise: there can be (and often are) bits of noise that connect separate blobs, or that split a single blob into several (apparent) connected components. I would like an approach based on areas, since area coverage seems to be much less noisy than the edge shapes.
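To illustrate what I mean by area coverage, here is a minimal sketch that computes, for every pixel, the fraction of black pixels in a square window around it. The function name `local_coverage`, the window size and the 0.5 threshold are assumptions picked for the toy example, not tuned values. It is not a solution, just a demonstration that a blob with a hole stays dense while an isolated speck does not.

```python
import numpy as np

def local_coverage(img, win):
    """Fraction of black pixels in a square window centred on each pixel.
    img: 2-D boolean array, True = black. The effective window side is
    2 * (win // 2) + 1, and pixels beyond the border count as white."""
    r = win // 2
    k = 2 * r + 1
    h, w = img.shape
    padded = np.pad(img.astype(np.float64), r, mode="constant")
    # Integral image with an extra leading row and column of zeros
    ii = np.pad(padded.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)), mode="constant")
    # Window sums from four corner lookups
    sums = ii[k:k + h, k:k + w] - ii[:h, k:k + w] - ii[k:k + h, :w] + ii[:h, :w]
    return sums / (k * k)

# Toy image: a solid square with a one-pixel hole, plus an isolated noise pixel
img = np.zeros((40, 40), dtype=bool)
img[10:30, 10:30] = True
img[20, 20] = False   # interior hole
img[2, 2] = True      # noise speck

cov = local_coverage(img, 9)
dense = cov > 0.5     # hypothetical density threshold
```

In this toy case the pixel at the hole is still classified as dense and the speck is not, which erosion cannot achieve when holes and specks are of similar size.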
P.P.P.P.S. As requested, here is an example with a through cut due to noise:
and a sample with lots and lots of noise but nevertheless a distinct blob:
EDIT: None of the answers actually solves the problem, although Bharat has suggested a partial solution which does well for non-overlapping blobs. More please :) I will award an additional bounty to any actual solution.