
PROBLEM

I have a picture that is taken from a swinging vehicle. For simplicity I have converted it into a black and white image. An example is shown below:

[Figure 1: example black and white image with the pattern circled in red]

The image shows the high-intensity returns. A pattern that is found in all of the valid images is circled in red. The image can be taken from multiple angles depending on the rotation of the vehicle. Another example is here:

[image: the same pattern seen from a different vehicle rotation]

The intention here is to identify the image cells in which this pattern exists.

CURRENT APPROACHES

I have tried a couple of methods so far. I am using Matlab for testing but will eventually implement this in C++. It is desirable for the algorithm to be time efficient; however, I am interested in any suggestions.

SURF (Speeded Up Robust Features) Feature Recognition

I tried the default Matlab implementation of SURF to find features. Matlab SURF is able to identify features in two examples (not the same as above); however, it is not able to identify common ones:

[image: SURF features detected in two examples, with no common points between them]

I know that the points are different but the pattern is still somewhat identifiable. I have tried on multiple sets of pictures and there are almost never common points. From reading about SURF it seems like it is not robust to skewed images anyway. Perhaps some recommendations on pre-processing here?
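For reference, the SURF pipeline I am using is essentially the toolbox default (a minimal sketch with the Computer Vision System Toolbox; the file names are placeholders and the inputs are assumed to be grayscale):

% Minimal SURF matching sketch (Computer Vision System Toolbox).
% 'scan1.png' and 'scan2.png' are placeholder file names.
I1 = imread('scan1.png');
I2 = imread('scan2.png');

pts1 = detectSURFFeatures(I1);
pts2 = detectSURFFeatures(I2);

[f1, vpts1] = extractFeatures(I1, pts1);
[f2, vpts2] = extractFeatures(I2, pts2);

idxPairs = matchFeatures(f1, f2);
matched1 = vpts1(idxPairs(:, 1));
matched2 = vpts2(idxPairs(:, 2));

% visual check of whether any common points were found
figure, showMatchedFeatures(I1, I2, matched1, matched2, 'montage')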

Template Matching

Template matching was tried but is definitely not ideal for this application because it is not robust to scale or skew changes. I am open to pre-processing ideas to fix the skew. This could be quite easy; some discussion of extra information in the picture is provided further down.

For now, let's investigate template matching. Say we have the following two images as the template and the current image:

[image: the template and the current image]

The template is chosen from one of the most forward-facing images. Using it on a very similar image, we can match the position:

[image: the template matched at the correct position in a similar image]

But then (and somewhat obviously) if we change the picture to a different angle it won't work. Of course we expect this, because the template no longer looks like the pattern in the image:

[image: template matching failing on an image taken from a different angle]

So we obviously need some pre-processing work here as well.
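For completeness, the matching itself is just normalized cross-correlation (a minimal sketch with the Image Processing Toolbox; the file names are placeholders and the inputs are assumed to be single-channel):

% Minimal template-matching sketch using normalized cross-correlation.
% 'template.png' and 'scan.png' are placeholder file names.
tmpl  = im2double(imread('template.png'));
scene = im2double(imread('scan.png'));

c = normxcorr2(tmpl, scene);
[~, idx] = max(c(:));
[ypeak, xpeak] = ind2sub(size(c), idx);

% top-left corner of the best match in the scene
yoff = ypeak - size(tmpl, 1) + 1;
xoff = xpeak - size(tmpl, 2) + 1;
figure, imshow(scene), hold on
rectangle('Position', [xoff, yoff, size(tmpl, 2), size(tmpl, 1)], 'EdgeColor', 'r')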

Hough Lines and RANSAC

Hough lines and RANSAC might be able to identify the lines for us, but then how do we get the pattern position?
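Extracting the lines themselves is straightforward (a minimal sketch, assuming bw is the binary image); the open question is how to go from the lines to the pattern position:

% Minimal Hough-line sketch; bw is assumed to be the binary image.
[H, theta, rho] = hough(bw);
peaks = houghpeaks(H, 10);                  % up to 10 strongest lines
lines = houghlines(bw, theta, rho, peaks);

figure, imshow(bw), hold on
for k = 1:numel(lines)
    xy = [lines(k).point1; lines(k).point2];
    plot(xy(:, 1), xy(:, 2), 'LineWidth', 2, 'Color', 'g')
end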

Others that I don't know about yet

I am pretty new to the image processing scene, so I would love to hear about any other techniques that would suit this simple yet difficult image recognition problem.

The sensor and how it will help pre-processing

The sensor is a 3D laser; its output has been turned into an image for this experiment but still retains the distance information. If we plot with distance scaled from 0 to 255 we get the following image:

[image: the same scene plotted with distance scaled from 0 to 255]

Lighter is further away. This could definitely help us to align the image; any thoughts on the best way? So far I have thought of things like calculating the normal of the cells that are not 0. We could also do some sort of gradient descent or least-squares fitting such that the difference in distance is minimised, which could align the image so that it always faces us straight on. The problem with that is that the solid white stripe is further away; maybe we could segment that out? We are building algorithms on top of our algorithms here, so we need to be careful that this doesn't become a monster.
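One concrete version of the least-squares idea would be to fit a plane to the non-zero depth cells and use its normal to estimate the rotation (a minimal sketch, assuming D is the depth image with 0 meaning no return; the yaw estimate is only approximate):

% Minimal plane-fit sketch on the depth image D (0 = no return).
[rows, cols] = find(D > 0);
z = double(D(D > 0));

% least-squares fit of z = a*x + b*y + c over the valid cells
A = [cols, rows, ones(numel(z), 1)];
coef = A \ z;                        % coef = [a; b; c]

% plane normal and a rough rotation about the vertical axis
n = [-coef(1); -coef(2); 1];
n = n / norm(n);
yawApprox = atan2d(n(1), n(3));

% the far-away white stripe could be excluded first, e.g. by dropping
% the furthest few percent of returns before fitting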

Any help or ideas would be great, I am happy to look into any serious answer!

  • How consistent is the pattern? If the size, number of pixels, number of independent pixels, etc... are roughly the same for each image you could try using a simple sliding window function with these criteria. – Ghaul May 06 '14 at 12:11
  • @Ghaul yep, that is the template matching. The problem is that template matching will not deal well with skewed images. – Fantastic Mr Fox May 07 '14 at 04:50
  • that depends on the matching criterion, which is why I suggested you try some similarity measure more robust to rotation, such as the number of non-connected regions in the pattern. From your images above it seems like the pattern has 10-13 non-connected regions clustered close to each other. How common are such clusters? If they are rare, a sliding window that simply counts the number of non-connected regions could be able to identify the pattern (a rough sketch of this idea follows these comments). – Ghaul May 07 '14 at 11:58
  • If you upload some test images I can try it out myself and write a proper answer. – Ghaul May 07 '14 at 12:00
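A rough sketch of Ghaul's sliding-window suggestion, counting connected regions in each window (bw is assumed to be the binary image; the window size and the 10-13 count are assumptions taken from the comment and would need tuning):

% Count connected components in each sliding window and flag windows
% whose count matches the expected number of regions.
win = [40 60];                              % [height width], a guess
score = zeros(size(bw) - win + 1);
for r = 1:size(score, 1)
    for c = 1:size(score, 2)
        patch = bw(r:r + win(1) - 1, c:c + win(2) - 1);
        cc = bwconncomp(patch);
        score(r, c) = cc.NumObjects;
    end
end
candidates = score >= 10 & score <= 13;     % candidate pattern locations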

6 Answers


Given the poor image quality (low resolution + binarization), I would prefer template matching because it is based on a simple global measure of similarity and does not attempt to do any feature extraction (there are no reliable features in your samples).

But you will need to apply template matching with rotation. One way is to precompute rotated instances of the template, perform the matching for every angle, and keep the best.
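A minimal sketch of that sweep, assuming the binary template tmpl and scene bw are already in the workspace and using normalized cross-correlation as the similarity measure:

% Sweep the template over a set of rotations and keep the best match.
bestScore = -Inf;
for ang = 0:5:355
    rt = imrotate(tmpl, ang, 'nearest', 'loose');
    c  = normxcorr2(double(rt), double(bw));
    [score, idx] = max(c(:));
    if score > bestScore
        bestScore = score;
        bestAngle = ang;
        [peakY, peakX] = ind2sub(size(c), idx);
    end
end
% subtract the rotated template size from (peakY, peakX) to get the
% top-left corner of the best match in the scene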

Trained template

Matched template

It is possible to integrate depth information in the comparison (if that helps).

  • Sounds good, how does this affect the time complexity? I know that the template matching algorithm is O(n*m) where n is the number of pixels in the image and m is the number in the template. Do you think this would take significantly longer? How many rotations do you think this would include? Also, how would you suggest dealing with the skew of the image, remembering that it is taken from different angles? – Fantastic Mr Fox May 06 '14 at 22:20
  • Complexity will increase to O(n*m*r) where r is the number of rotations. For this case, every 5° should be quite enough. I suggest ignoring the skew. Unless it can be large: in this case, due to poor image quality, this application is just not doable. –  May 07 '14 at 06:37

I came up with the following program to segment the regions and hopefully locate the pattern of interest using template matching. I've added some comments and figure titles to explain the flow and some resulting images. Hope it helps.

% read the image, convert to grayscale and binarize (Otsu threshold)
im = imread('sample.png');
gr = rgb2gray(im);
bw = im2bw(gr, graythresh(gr));

% downsample to half resolution
bwsm = imresize(bw, .5);

% distance transform, normalized to [0, 1]
dism = bwdist(bwsm);
dismnorm = dism/max(dism(:));
figure, imshow(dismnorm, []), title('distance transformed')

% equalize and close to merge nearby blobs into larger regions
eq = histeq(dismnorm);
eqcl = imclose(eq, ones(5));
figure, imshow(eqcl, []), title('histogram equalized and closed')

eqclbw = eqcl < .2; % .2 worked for samples given
eqclbwcl = imclose(eqclbw, ones(5));
figure, imshow(eqclbwcl, []), title('binarized and closed')

filled = imfill(eqclbwcl, 'holes');
figure, imshow(filled, []), title('holes filled')

% -------------------------------------------------
% template
tmpl = zeros(16);
tmpl(3:4, 2:6) = 1;tmpl(11:15, 13:14) = 1;
tmpl(3:10, 7:14) = 1;

st = regionprops(tmpl, 'orientation');
tmplAngle = st.Orientation;
% -------------------------------------------------     

lbl = bwlabel(filled);
stats = regionprops(lbl, 'BoundingBox', 'Area', 'Orientation');
figure, imshow(label2rgb(lbl), []), title('labeled')

% here I just take the largest contour for convenience. should consider aspect ratio and any
% other features that can be used to uniquely identify the shape
[mx, id] = max([stats.Area]);
mxbb = stats(id).BoundingBox;

% resize and rotate the template
tmplre = imresize(tmpl, [mxbb(4) mxbb(3)]);
tmplrerot = imrotate(tmplre, stats(id).Orientation-tmplAngle);

xcr = xcorr2(double(filled), double(tmplrerot));
figure, imshow(xcr, []), title('template matching')

Resized image:

[image: resized input]

Segmented:

[image: segmented regions]

Template matching:

[image: 2D cross-correlation surface]

dhanushka

This is quite similar to the problem of recognising hand-sketched characters that we tackle in our lab, in the sense that the target pattern is binary, low resolution, and liable to moderate deformation.

Based on our experience, I don't think SURF is the right way to go: as pointed out elsewhere, it assumes a continuous 2D intensity image, not a binary one, and will break in your case. Template matching is not good for this kind of binary image either: your pixels need to be only slightly misaligned to return a low match score, as there is no local spatial coherence in the pixel values to mitigate minor misalignments of the window.

Our approach in this scenario is to try to "convert" the binary image into a continuous or "greyscale" image. For example, see below:

Converting line-art to continuous field via edge orientation extrapolation

These conversions are made by running a 1st-derivative edge detector, e.g. convolving the 3x3 template [0 0 0 ; 1 0 -1 ; 0 0 0] and its transpose over image I to get dI/dx and dI/dy. At any pixel we can get the edge orientation atan2(dI/dy, dI/dx) from these two fields. We treat this information as known at the sketched pixels (the white pixels in your problem) and unknown at the black pixels. We then use a Laplacian smoothness assumption to extrapolate values for the black pixels from the white ones. Details are in this paper:

http://personal.ee.surrey.ac.uk/Personal/J.Collomosse/pubs/Hu-CVIU-2013.pdf
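A crude sketch of the idea (this smooths the two gradient components separately by Jacobi relaxation rather than the orientation itself, so it only approximates the method in the paper; bw is the binary input image):

% Compute 1st-derivative gradients, then extrapolate them into the black
% pixels with iterative Laplacian smoothing, keeping the white pixels fixed.
I  = double(bw);
kx = [0 0 0; 1 0 -1; 0 0 0];
dx = conv2(I, kx,  'same');
dy = conv2(I, kx', 'same');

known = bw > 0;                             % white pixels are "known"
avg = [0 1 0; 1 0 1; 0 1 0] / 4;            % 4-neighbour average
for it = 1:500
    dxs = conv2(dx, avg, 'same');
    dys = conv2(dy, avg, 'same');
    dx(~known) = dxs(~known);               % update only the unknown pixels
    dy(~known) = dys(~known);
end
orient = atan2(dy, dx);                     % continuous orientation field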

If this is a major hassle, you could try using a distance transform instead (convenient in Matlab via bwdist), but it won't give results that are as accurate.

Now we have the "continuous" image (as per right hand column of images above). The greyscale patterns encode the local structure in the image, and are much more amenable to gradient based descriptors like SURF and template matching.

My hunch would be to try template matching first, but since this is affine-sensitive I would go the whole way and use a HOG / Bag of Visual Words approach, again just as in our paper above, to match those patterns.
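A minimal HOG sketch on the converted image, assuming G is the continuous image produced above (extractHOGFeatures is in the Computer Vision System Toolbox; the cell size is a guess and would need tuning to the pattern scale):

% HOG descriptor of the continuous image, with a visualization overlay.
[features, vis] = extractHOGFeatures(G, 'CellSize', [8 8]);
figure, imshow(G), hold on, plot(vis)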

We have found this pipeline to give state-of-the-art results in sketch-based shape recognition, and my PhD student has successfully used it in subsequent work for matching hieroglyphs, so I think it could have a good shot at working on the kind of pattern you pose in your example images.

jcollomosse

I would:

  1. segment image

    by Z coordinate (distance from camera/laser): where the Z coordinate jumps by more than a threshold there is a border, either between object and background (if the neighbouring Z value is large or out of range), between two objects (if the neighbouring Z value is different), or within the object itself (if the neighbouring Z value is different but can still be connected to the rest of the object). This will give you a set of objects (a minimal sketch of this step appears after this list).

  2. align to viewer

    compute the boundary points of each object (the outermost edges), compute its direction via atan2, and rotate back so the object faces the camera perpendicularly.

    Your image looks like a flag marker, so in that case rotation around the Y axis should suffice. You can also scale the object to a predefined distance (if the target is always the same size).

    You will need to know the FOV of your camera system and have a calibrated Z axis for this.

  3. now try to identify object

    here use what you have by now; you can also add filters, like skipping objects whose size or aspect ratio does not match ... you can use DFT/DCT or compare histograms of the normalized/equalized image, etc.
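A minimal sketch of step 1, assuming Z is the range image with 0 for no return and zThresh is a jump threshold in the same units as Z:

% Split the range image wherever Z jumps by more than a threshold, then
% label the remaining connected pixels as candidate objects.
Zd = double(Z);                             % work in double to avoid integer wrap
zThresh = 0.3;                              % assumed jump threshold
valid = Zd > 0;

jumpH = abs(diff(Zd, 1, 2)) > zThresh;      % horizontal neighbour jumps
jumpV = abs(diff(Zd, 1, 1)) > zThresh;      % vertical neighbour jumps
cut = false(size(Zd));
cut(:, 1:end-1) = cut(:, 1:end-1) | jumpH;
cut(1:end-1, :) = cut(1:end-1, :) | jumpV;

lbl = bwlabel(valid & ~cut, 4);             % candidate objects
stats = regionprops(lbl, 'Area', 'BoundingBox');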

[PS]

for features it is not a good idea to use a 1-bit black-and-white image because you lose too much info. Use grayscale or color instead (grayscale is usually enough). I usually add a few simplified histograms of a small area (with a few different radii) around the point of interest, which is invariant to rotation.

Spektre
  • Sounds good. In step 2, how would you recommend finding the rotation angle such that I can rotate the image? – Fantastic Mr Fox May 01 '14 at 23:11
  • @Ben I do not know the shape of the target object, but if it is planar: find the bounding box of the image and take 3 points, each closest to a different edge of the bounding box. Compute the normal of the triangle they form; its coordinates give you all the angle info you need. Alternatively, you can find the 3 points most distant from each other (sort points by Z and take the closest, the farthest and the one in the middle). The normal is computed by a simple vector cross product. Do not forget to normalize the normal to unit length; then its coordinates are the cosines of the angles to each axis. – Spektre May 02 '14 at 06:27

I do not think SURF is the right approach to use here. SURF is designed to work on regular 2D intensity images, but what you have here is a 3D point cloud. There is an algorithm for point cloud registration called Iterative Closest Point (ICP). There are several implementations on MATLAB File Exchange, such as this one.

Edit: The Computer Vision System Toolbox now (as of the R2015b release) includes point cloud processing functionality. See this example for point cloud registration and stitching.
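A minimal sketch with the built-in point cloud functions (pcregistericp is the name in recent releases; movingXYZ and fixedXYZ are placeholder N-by-3 arrays of laser points):

% Rigid registration of two laser scans with ICP.
moving = pointCloud(movingXYZ);
fixed  = pointCloud(fixedXYZ);

tform = pcregistericp(moving, fixed);       % estimated rigid transform
aligned = pctransform(moving, tform);

pcshowpair(aligned, fixed)                  % visual check of the alignment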

Dima
  • Thanks for the advice, ICP is the end game and we already use that. It is for fine registration though, the point here is to get a rough solution so that it might seed ICP. – Fantastic Mr Fox May 01 '14 at 23:08

Have a look at log-polar template matching; it is rotation and scale invariant: http://etd.lsu.edu/docs/available/etd-07072005-113808/unrestricted/Thunuguntla_thesis.pdf
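The key idea is that rotation and scale become translations in (angle, log-radius) space. A rough resampling sketch, assuming img is a grayscale image and the pattern is roughly centred (grid resolution values are guesses):

% Resample the image onto a log-polar grid around the image centre.
[h, w] = size(img);
cx = w / 2; cy = h / 2;
rMax = min(cx, cy);

nTheta = 180; nRho = 100;                   % grid resolution
theta = linspace(0, 2*pi, nTheta);
rho = logspace(0, log10(rMax), nRho);       % radii from 1 pixel to rMax

[T, R] = meshgrid(theta, rho);
xs = cx + R .* cos(T);
ys = cy + R .* sin(T);
logPolar = interp2(double(img), xs, ys, 'linear', 0);

% correlating the log-polar images of the template and a candidate window
% turns relative rotation/scale into a simple shift of the correlation peak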

gosnold