
I'm trying to look for shapes in an image using OpenCV. I know the shapes I want to match (there are some shapes I don't know about, but I don't need to find them) and their orientations. I don't know their sizes (scale) and locations.

My current approach:

  1. Detect contours
  2. For each contour, calculate its bounding box
  3. Match each bounding box to one of the known shapes separately. In my real project, I'm scaling the region to the template size and calculating differences in Sobel gradient, but for this demo, I'm just using the aspect ratio.

Where this approach comes undone is where shapes touch. The contour detection picks up the two adjacent shapes as a single contour (single bounding box). The matching step will then obviously fail.

Is there a way to modify my approach to handle adjacent shapes separately? Also, is there a better way to perform step 3?

For example: (Es colored green, Ys colored blue)

enter image description here

Failed case: (unknown shape in red)

enter image description here

Source code:

import cv
import sys

# Aspect ratios (width / height) of the two known template shapes.
E = cv.LoadImage('e.png')
E_ratio = float(E.width)/E.height
Y = cv.LoadImage('y.png')
Y_ratio = float(Y.width)/Y.height
EPSILON = 0.1  # tolerance when comparing aspect ratios

# Step 1: detect the external contours of the input image.
im = cv.LoadImage(sys.argv[1], cv.CV_LOAD_IMAGE_GRAYSCALE)
storage = cv.CreateMemStorage(0)
seq = cv.FindContours(im, storage, cv.CV_RETR_EXTERNAL,
        cv.CV_CHAIN_APPROX_SIMPLE)

# Step 2: compute the bounding box of each contour.
regions = []
while seq:
    pts = [ pt for pt in seq ]
    x, y = zip(*pts)
    min_x, min_y = min(x), min(y)
    width, height = max(x) - min_x + 1, max(y) - min_y + 1
    regions.append((min_x, min_y, width, height))
    seq = seq.h_next()

# Step 3: classify each box by aspect ratio and draw it (colors are BGR).
rgb = cv.LoadImage(sys.argv[1], cv.CV_LOAD_IMAGE_COLOR)
for x,y,width,height in regions:
    pt1 = x,y
    pt2 = x+width,y+height
    if abs(float(width)/height - E_ratio) < EPSILON:
        color = (0,255,0,0)    # green: matches E
    elif abs(float(width)/height - Y_ratio) < EPSILON:
        color = (255,0,0,0)    # blue: matches Y
    else:
        color = (0,0,255,0)    # red: unknown shape
    cv.Rectangle(rgb, pt1, pt2, color, 2)

cv.ShowImage('rgb', rgb)
cv.WaitKey(0)

e.png:

enter image description here

y.png:

enter image description here

good:

enter image description here

bad:

enter image description here

Before anybody asks, no, I'm not trying to break a captcha :) OCR per se isn't really relevant here: the actual shapes in my real project aren't characters -- I'm just lazy, and characters are the easiest thing to draw (and still get detected by trivial methods).

mpenkov
  • Have you considered defining a valid interval for the height/width ratio? If every case where shapes touch each other results in a bounding box that is too wide or too tall, that could be a clue. – Jav_Rock Jan 09 '12 at 09:21
  • Yes, I've already thought of that. Thanks for mentioning it, though. – mpenkov Jan 09 '12 at 09:50
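
The ratio-interval idea from the comment above could look roughly like the sketch below. It is only an illustration: `classify_box`, the label strings, and the ratios in `KNOWN` are hypothetical names and values, not taken from the question's templates. It also only flags boxes that are too wide, so it would catch shapes touching side by side but not shapes stacked vertically, and a wide unknown shape would be flagged as well.

```python
def classify_box(width, height, known_ratios, epsilon=0.1):
    """Label a bounding box by aspect ratio (width / height).

    Returns the matching label, "possibly merged" if the box is wider than
    every known shape allows, or "unknown" otherwise.
    """
    ratio = float(width) / height
    for label, known in known_ratios.items():
        if abs(ratio - known) < epsilon:
            return label
    # Wider than any known shape: a hint that several shapes were merged.
    if ratio > max(known_ratios.values()) + epsilon:
        return "possibly merged"
    return "unknown"


# Hypothetical ratios for illustration; not measured from e.png / y.png.
KNOWN = {"E": 0.6, "Y": 0.8}

print(classify_box(60, 100, KNOWN))   # a box shaped like the E template
print(classify_box(140, 100, KNOWN))  # a box about as wide as E and Y together
```

A box flagged this way still has to be split somehow before matching; the check only tells you which contours need that extra work.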

2 Answers


Since your shapes can vary in size and ratio, you should look at scale-invariant descriptors. A set of such descriptors should suit your application well.

Compute those descriptors on your templates and then use some kind of simple classification to match the shapes in your image against them. It should give pretty good results with simple shapes like the ones you show.

I have used Zernike and Hu moments in the past, the latter being the better known. You can find an example implementation here: http://www.lengrand.fr/2011/11/classification-hu-and-zernike-moments-matlab/.
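
To give a feel for what a scale-invariant moment looks like, here is a minimal pure-Python sketch that computes the first two Hu invariants of a small binary mask and compares them across scales. The helper names (`hu_first_two`, `upscale`) and the toy masks are made up for illustration; on real images OpenCV's `cv2.moments` and `cv2.HuMoments` compute the full set of seven.

```python
def hu_first_two(mask):
    """Return the first two Hu invariants of a binary mask (rows of 0/1)."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    m00 = float(len(pts))                  # area = number of foreground pixels
    cx = sum(x for x, _ in pts) / m00      # centroid
    cy = sum(y for _, y in pts) / m00

    def eta(p, q):
        # Central moment, normalized by area so that it ignores scale.
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pts)
        return mu / m00 ** (1 + (p + q) / 2.0)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2


def upscale(mask, k):
    """Enlarge a mask by an integer factor k (nearest neighbour)."""
    return [[v for v in row for _ in range(k)] for row in mask for _ in range(k)]


# Toy shapes for illustration only.
L_SHAPE = [[1, 0, 0],
           [1, 0, 0],
           [1, 0, 0],
           [1, 1, 1]]
BAR = [[1, 1, 1, 1, 1, 1]]

print(hu_first_two(upscale(L_SHAPE, 4)))
print(hu_first_two(upscale(L_SHAPE, 8)))  # close to the line above
print(hu_first_two(upscale(BAR, 4)))      # clearly different
```

The values for the same shape at two sizes agree only approximately because of pixel discretization, so any classifier built on them needs a tolerance.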

Another thing: given your problem, you should look at OCR (optical character recognition) technologies: http://en.wikipedia.org/wiki/Optical_character_recognition ;)

Hope this helps a bit.

Julien

jlengrand
  • Thank you for your comment. Would you calculate moments for each contour? If yes, then how would it solve the problem of merging shapes? The extracted moments wouldn't really match the moments of the trained shape, would they? Also, OCR *per se* isn't really relevant here: the actual shapes in my real project aren't characters -- I'm just lazy, and characters are the easiest thing to draw (and still get detected by trivial methods). Since you're probably not the first or last person to think about OCR when looking at the question, I've added to it a little bit. – mpenkov Jan 09 '12 at 10:30
  • Yes. The problem with the correlation methods is that I don't know the exact size of the images. I can try different template sizes, but it's really not good enough for me -- I need to know the exact size of the shapes. – mpenkov Jan 09 '12 at 14:40
  • Any news? Shape merging is one of the well known issues in Computer Vision :) – jlengrand Jan 11 '12 at 08:58
  • I ended up looking for the best place to split a CC based on my knowledge of the existing shapes and other image features. It's a very specific algorithm and I'm not at all happy with it, but it seems to work for now. – mpenkov Jan 11 '12 at 10:46
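
The splitting algorithm mentioned in the last comment isn't shown in the thread. As a generic illustration of splitting a connected component (an assumption, not the asker's method), one simple heuristic is to cut at the column with the least foreground, which tends to be the thin bridge where two shapes touch. `best_split_column`, its `margin` parameter, and the toy mask are hypothetical.

```python
def best_split_column(mask, margin=1):
    """Return the column with the fewest foreground pixels.

    mask is a list of rows of 0/1. The outermost `margin` columns on each
    side are ignored so the cut does not land on the component's border.
    """
    width = len(mask[0])
    col_sums = [sum(row[x] for row in mask) for x in range(width)]
    candidates = range(margin, width - margin)
    return min(candidates, key=lambda x: col_sums[x])


# Two 3x3 blobs joined by a one-pixel bridge in the middle row.
TOUCHING = [[1, 1, 1, 0, 1, 1, 1],
            [1, 1, 1, 1, 1, 1, 1],
            [1, 1, 1, 0, 1, 1, 1]]

cut = best_split_column(TOUCHING)
left = [row[:cut] for row in TOUCHING]       # columns before the cut
right = [row[cut + 1:] for row in TOUCHING]  # columns after the cut
print(cut, len(left[0]), len(right[0]))
```

This only handles shapes that touch side by side, and it would need extra knowledge (such as the expected widths of the known shapes) to avoid cutting through a shape that has a thin part of its own.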

Have you tried chamfer matching, or contour matching (correspondence) using CCH as the descriptor?

Chamfer matching uses the distance transform of the target image together with the template contour. It is not exactly scale invariant, but it is fast.
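
A minimal sketch of the chamfer idea, in pure Python on a tiny grid: build a distance transform of the target's edge pixels, slide the template's contour points over it, and keep the offset with the lowest mean distance. The function names and the toy data are made up for illustration, and the brute-force distance transform is only practical for tiny inputs; on real images `cv2.distanceTransform` would be the usual choice.

```python
import math


def distance_transform(edges):
    """Brute-force distance from every pixel to the nearest edge pixel."""
    h, w = len(edges), len(edges[0])
    edge_pts = [(x, y) for y in range(h) for x in range(w) if edges[y][x]]
    return [[min(math.hypot(x - ex, y - ey) for ex, ey in edge_pts)
             for x in range(w)] for y in range(h)]


def chamfer_match(edges, template_pts):
    """Return (score, (dx, dy)) for the template offset with the lowest
    mean distance between template points and the nearest image edges."""
    dist = distance_transform(edges)
    h, w = len(edges), len(edges[0])
    tw = max(x for x, _ in template_pts) + 1
    th = max(y for _, y in template_pts) + 1
    best = None
    for dy in range(h - th + 1):
        for dx in range(w - tw + 1):
            score = sum(dist[y + dy][x + dx]
                        for x, y in template_pts) / len(template_pts)
            if best is None or score < best[0]:
                best = (score, (dx, dy))
    return best


# Toy template: the outline of a 3x3 square, as (x, y) contour points.
SQUARE = [(0, 0), (1, 0), (2, 0), (0, 1), (2, 1), (0, 2), (1, 2), (2, 2)]

# Toy 6x6 edge image containing that outline at offset (2, 1).
edges = [[0] * 6 for _ in range(6)]
for x, y in SQUARE:
    edges[y + 1][x + 2] = 1

print(chamfer_match(edges, SQUARE))
```

Since the match is not scale invariant, one way to cope with unknown sizes would be to repeat the search with the template contour rescaled to several candidate sizes and keep the best score overall.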

Contour matching with CCH is rather slow, as the bipartite matching problem has at least quadratic complexity. On the other hand, it is invariant to scale, rotation, and probably local distortion (for approximate matching, which IMHO is good for the bad example above).

Peb