How to detect similar objects in this picture?

Question

I want to find patterns in image. Saying "to find patterns" I mean "to detect similar objects", thus these patterns shouldn't be some high-frequency info like noise. For example, on this image I'd like to get pattern "window" with ROI/ellipse of each object:

I've read advices to use Autocorrelation, FFT, DCT for this problem. As far as I've understood, Autocorrelation and FFT are alternative, not complementary.

First, I don't know if it is possible to get such high-level info in frequency domain?

As I have FFT implemented, I tried to use it. This is spectrogram:

Could you suggest how to further analyze this spectogram to detect objects "window" with their spatial locations?
Is it needed to find the brightest points/lines on spectrogram?
Should the FFT be done for image chunks instead of whole image?
If that't not possible to find such objects with this approach, what would you advice?

Thanks in advance. P.S. Sorry for large image size.

The DFT (what the FFT algorithm computes) will tell you the frequency in which patterns repeat, but it is not easy to use it to find the location of these patters. The autocorrelation is simply multiplying the DFT with its own conjugate, then inverse transforming. Again, it will show you the pattern of repetitions, but will not outline any repeating objects. The DCT doesn't add anything to this mix. — Cris Luengo, Jun 27 '18 at 16:53
@CrisLuengo "will tell you the frequency in which patterns repeat" - can that found frequency be extracted and what result image (inverse transform of only that frequency) will look like? — olha, Jun 27 '18 at 18:39
The inverse transform of a single frequency will look like a sine wave. You can use this frequency to know in which direction the repetition happens, and what the size is of the box that repeats. So you could cut a box of that size and repeat it in that direction. But that box will not be exactly one window, it will likely show parts of two neighboring windows. — Cris Luengo, Jun 27 '18 at 19:43
@CrisLuengo Thank you, that sounds very useful! Could you clarify how to get size/dimensions of the box? — olha, Jun 27 '18 at 20:43
From the autocorrelation, find the peak near the origin (but not the one at the origin). Its distance and direction from the origin gives the direction and length of repetition. But you'll find that, because the pattern in the image changes (directions are different for the three columns of windows, and distances decrease towards the top, where they're further away from the camera), what you get here is an average. You'll also see that the peak in the autocorrelation function is quite broad, proving this. Sorry I can't be more helpful! — Cris Luengo, Jun 27 '18 at 22:04
@Spektre Thank you! I have few questions: 1) Lets say SIFT found 100 key points, and really there are 15 similar objects ("groups"), each contains 5 points (but RANSAC doesn't know which group contain which points). What is "cross match" ( and O(n) ) in this case? 2) Can the SIFT be replaced with Hough transform (it will be just key points, without orientation and scale)? — olha, Jun 28 '18 at 12:51
@Spektre Thanks a lot! I suppose I've understood your idea with "keypoint matches". If I done that, RANSAC may not be needed. — olha, Jun 28 '18 at 15:28
@Spektre Yes, answer would be more readable. :) But thank you anyway! — olha, Jun 28 '18 at 15:33
@Olia_Pavliuk I added the answer and removed obsolete comments ... — Spektre, Jun 28 '18 at 16:04

Spektre · Answer 1 · 2018-06-28T16:13:19.477

Beware this is not my cup of tea so read with extreme prejudice. IIRC for such task are usually used SIFT/SURF + RANSAC methods.

Identify key points of interest of image SIFT/SURF

This will get you list of 2D locations in your image with specific features (which you can handle as integer hash code). I think SIFT (Scale Invariant Feature transform) is ideal for this. They work similarly like our human vision works (identify specific change in some feature and "ignore" the rest of the image). So instead of matching all the pixels of the image we cross match only few of them.
sort by occurrence

each of the found SIFT points have some feature list. if we do a histogram of this features (count how many similar or identical feature points there are) then we can group points with the same occurrence. The idea is that if we got n object placings in the image each of its key points should be n times duplicated in the final image.

So if we have many points with some n times occurrence it hints we got n similar objects in the image. From this we select just these key points for the next step.
find object placings

each object can have different scale,position and orientation. Let assume they got the same aspect ratio. So the corresponding key points in each object should have the same relative properties between the objects (like relative angle between key points, normalized distance, etc).

So the task is to regroup our key points into each object so all the objects have the same key points and the same relative properties.

This can be done by brute force (testing all the combination and checking the properties) or by RANSAC or any other method.

Usually we select one first key point (no matter which) and find 2 others that form the same angle and relative distance ratio (in all of the objects)

so angle is the same and |p1-p0| / |p2-p0| is also the same or close. While grouping realize that key points within objects are more likely closer to each other ... so we can augment our search by distance from the first selected key point.... to decide to which object the key point probably belongs to (if try those first we got high probability we found our combination fast). All the other points pi we can add similarly one by one (using p0,p1,pi)

So I would starting by closest 2 key points ... (this sometimes can be fouled by overlapping or touching mirrored images) as the key point from the neighbor object can be sometimes closer that from the own one ...

After such regrouping just check if all the found objects have the same properties (aspect ratio) ... to visualize them you can find the OBB (Oriented Bounding Box) of the key points (which can be also used for the check)

How to detect similar objects in this picture?

1 Answers1