
I have been working on a personal project in image processing and robotics where, instead of the robot detecting colours and picking out an object as usual, it tries to detect the holes (resembling different polygons) on a board. For a better understanding of the setup, here is an image: [image: the board setup]

As you can see, I have to detect these holes, find out their shapes, and then use the robot to fit an object into the holes. I am using a Kinect depth camera to get the depth image, shown below:

[image: depth image from the Kinect]

I was unsure how to detect the holes with the camera. Initially I used masking to remove the background and some of the foreground based on the depth measurement, but this did not work: at different orientations of the camera the holes would merge with the board, much like with inRange (everything becomes white). Then I came across the adaptiveThreshold function:

    // Gaussian-weighted adaptive threshold over a 7x7 neighbourhood,
    // with a constant of -1.0 subtracted from the local weighted mean
    adaptiveThreshold(depth1, depth3, 255, ADAPTIVE_THRESH_GAUSSIAN_C, THRESH_BINARY, 7, -1.0);

Combined with noise removal using erode, dilate, and Gaussian blur, this detected the holes much better, as shown in the picture below. I then used the cvCanny edge detector to get the edges, but so far the results have not been good, as also shown below. After this I tried various feature detectors (SIFT, SURF, ORB, GoodFeaturesToTrack) and found that ORB gave the best runtimes and detected features. I then tried to get the relative camera pose of a query image by finding its keypoints, matching them, and passing the good matches to the findHomography function. The results are shown in the diagram below:

[image: feature matching results]
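For reference, this is roughly the matching pipeline I am using (a minimal sketch with my own variable names; it assumes img1 and img2 are 8-bit grayscale Mats and the OpenCV 2.4 API):

    #include <opencv2/core/core.hpp>
    #include <opencv2/features2d/features2d.hpp>
    #include <opencv2/calib3d/calib3d.hpp>
    #include <vector>

    cv::Mat estimateHomography(const cv::Mat& img1, const cv::Mat& img2)
    {
        cv::ORB orb(500);                      // detect up to 500 ORB features
        std::vector<cv::KeyPoint> kp1, kp2;
        cv::Mat desc1, desc2;
        orb(img1, cv::Mat(), kp1, desc1);      // detect and compute in one call
        orb(img2, cv::Mat(), kp2, desc2);

        // ORB descriptors are binary, so Hamming distance is the right metric.
        cv::BFMatcher matcher(cv::NORM_HAMMING);
        std::vector<std::vector<cv::DMatch> > knn;
        matcher.knnMatch(desc1, desc2, knn, 2);

        // Lowe's ratio test keeps only the distinctive ("good") matches.
        std::vector<cv::Point2f> pts1, pts2;
        for (size_t i = 0; i < knn.size(); i++) {
            if (knn[i].size() == 2 && knn[i][0].distance < 0.75f * knn[i][1].distance) {
                pts1.push_back(kp1[knn[i][0].queryIdx].pt);
                pts2.push_back(kp2[knn[i][0].trainIdx].pt);
            }
        }
        if (pts1.size() < 4)                   // findHomography needs >= 4 pairs
            return cv::Mat();

        // RANSAC rejects the remaining outliers while fitting the homography.
        return cv::findHomography(pts1, pts2, CV_RANSAC, 3.0);
    }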

In the end I want to get the relative camera pose between the two images and move the robot to that position using the rotation and translation vectors obtained from the solvePnP function.
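This is the kind of thing I have in mind for the pose step (only a sketch: objectPoints would be known hole corners in the board frame, and K and distCoeffs come from camera calibration):

    #include <opencv2/calib3d/calib3d.hpp>
    #include <vector>

    void recoverPose(const std::vector<cv::Point3f>& objectPoints,
                     const std::vector<cv::Point2f>& imagePoints,
                     const cv::Mat& K, const cv::Mat& distCoeffs)
    {
        cv::Mat rvec, tvec;
        cv::solvePnP(objectPoints, imagePoints, K, distCoeffs, rvec, tvec);

        cv::Mat R;
        cv::Rodrigues(rvec, R);        // 3x3 rotation from the axis-angle vector

        // Camera position in the board frame: C = -R^T * t
        cv::Mat camPos = -R.t() * tvec;
    }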

So, is there any other method by which I could improve the quality of the detected holes for keypoint detection and matching?

I had also tried contour detection and approxPolyDP, but the approximated shapes are not really good:

[image: approxPolyDP output]
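For completeness, this is roughly what that attempt looked like (bin stands for the cleaned-up binary image from adaptiveThreshold; the area cutoff and epsilon factor are values I had to tune by hand):

    #include <opencv2/imgproc/imgproc.hpp>
    #include <vector>

    void classifyHoles(const cv::Mat& bin)
    {
        std::vector<std::vector<cv::Point> > contours;
        // findContours modifies its input, hence the clone
        cv::findContours(bin.clone(), contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);

        for (size_t i = 0; i < contours.size(); i++) {
            if (cv::contourArea(contours[i]) < 100.0)
                continue;                              // drop small noise blobs

            std::vector<cv::Point> poly;
            // Epsilon proportional to the perimeter keeps the tolerance scale-independent.
            double eps = 0.02 * cv::arcLength(contours[i], true);
            cv::approxPolyDP(contours[i], poly, eps, true);

            // poly.size() == 3 -> triangle, 4 -> quadrilateral, many -> circle-like
        }
    }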

I have tried tweaking the input parameters for the threshold and Canny functions, but this is the best I can get.

Also, is my approach to getting the camera pose correct?

UPDATE: No matter what I tried, I could not get good repeatable features to map. Then I read online that a depth image is cheap in resolution and is mainly used for masking and for getting distances. So it hit me that the features were poor because of the low-resolution image with its messy edges. I therefore thought of detecting features on the RGB image and using the depth image only to get the distances of those features. The quality of the features I got was literally off the charts; it even detected the screws on the board! Here are the keypoints detected using GoodFeaturesToTrack: [image: keypoints from GoodFeaturesToTrack].

I then met another hurdle: the distances of the points were not coming out properly. I searched for possible causes, and after quite a while it occurred to me that there is an offset between the RGB and depth images because of the offset between the two cameras (you can see this in the first two images). I searched the net for how to compensate for this offset but could not find a working solution.

If any one of you could help me compensate for this offset, it would be great!
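From what I gathered, the compensation would amount to re-registering the depth map into the RGB frame: back-project every depth pixel to 3D, transform it into the RGB camera's frame, and re-project it. Below is a sketch of that idea (assuming Kd and Kc are the depth and RGB intrinsics and R, t the extrinsics between the two sensors, e.g. from stereo-calibrating the IR and RGB cameras; I have not got it working reliably yet). If you grab frames through OpenCV's OpenNI interface, the CV_CAP_PROP_OPENNI_REGISTRATION capture property is supposed to do this for you.

    #include <opencv2/core/core.hpp>

    cv::Mat registerDepthToRGB(const cv::Mat& depth,  // CV_16U, millimetres
                               const cv::Matx33d& Kd, const cv::Matx33d& Kc,
                               const cv::Matx33d& R, const cv::Vec3d& t,
                               cv::Size rgbSize)
    {
        cv::Mat out(rgbSize, CV_16U, cv::Scalar(0));
        for (int v = 0; v < depth.rows; v++) {
            for (int u = 0; u < depth.cols; u++) {
                unsigned short d = depth.at<unsigned short>(v, u);
                if (d == 0) continue;                 // no depth reading here
                double z = d / 1000.0;                // mm -> metres
                // Back-project the depth pixel into the depth camera frame.
                cv::Vec3d p((u - Kd(0, 2)) * z / Kd(0, 0),
                            (v - Kd(1, 2)) * z / Kd(1, 1),
                            z);
                // Transform into the RGB camera frame and project with Kc.
                cv::Vec3d q = R * p + t;
                int uc = cvRound(Kc(0, 0) * q[0] / q[2] + Kc(0, 2));
                int vc = cvRound(Kc(1, 1) * q[1] / q[2] + Kc(1, 2));
                if (uc >= 0 && uc < rgbSize.width && vc >= 0 && vc < rgbSize.height)
                    out.at<unsigned short>(vc, uc) = d;
            }
        }
        return out;
    }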

UPDATE: I could not make good use of the goodFeaturesToTrack function. The function gives the corners as Point2f. To compute descriptors you need KeyPoints, and converting Point2f to KeyPoint with the code snippet below leads to the loss of scale and rotation invariance.

    // every corner becomes a KeyPoint with a fixed size of 1 and no
    // orientation, so the scale and rotation information is lost
    for (size_t i = 0; i < corners1.size(); i++)
    {
        keypoints_1.push_back(KeyPoint(corners1[i], 1.f));
    }

The hideous result from the feature matching is shown below: [image: matching result showing the loss of invariance].

I have to start on different feature matching methods now; I'll post further updates. It would be really helpful if anyone could help with the offset problem.
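One direction I am considering (again just a sketch): let a detector that assigns a real size and angle, such as ORB, produce the KeyPoints directly on the RGB image, and only read the depth at those locations:

    #include <opencv2/features2d/features2d.hpp>
    #include <vector>

    void detectWithOrientation(const cv::Mat& rgbGray, const cv::Mat& registeredDepth)
    {
        cv::ORB orb(500);
        std::vector<cv::KeyPoint> keypoints;
        cv::Mat descriptors;
        // Unlike the Point2f conversion above, these KeyPoints carry size and angle.
        orb(rgbGray, cv::Mat(), keypoints, descriptors);

        for (size_t i = 0; i < keypoints.size(); i++) {
            cv::Point pt = keypoints[i].pt;
            if (pt.x < 0 || pt.y < 0 ||
                pt.x >= registeredDepth.cols || pt.y >= registeredDepth.rows)
                continue;
            unsigned short d = registeredDepth.at<unsigned short>(pt.y, pt.x);
            // d is the depth (mm) at this feature, 0 if the sensor had no reading
        }
    }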

SidJaw
  • Here are the rest of the images: http://www.hostingpics.net/viewer.php?id=764686img3.jpg (this is the depth image). – SidJaw Jul 22 '14 at 08:46
  • This is the adaptive thresholded image: http://www.hostingpics.net/viewer.php?id=468584adapthresh.png, and this is the approxPolyDP image: http://www.hostingpics.net/viewer.php?id=196889coo.png. – SidJaw Jul 22 '14 at 08:48
  • Why don't you try template matching, since you already know the shapes of the holes the robot has to fit into? Note: template matching will not work well if the size of the object varies, i.e. if the size of the rectangle varies between boards, or when zooming in/out, etc. – Darshan Jul 22 '14 at 14:59
  • You could try [this](https://github.com/bsdnoobz/opencv-code/blob/master/shape-detect.cpp) code for shape detection – Darshan Jul 22 '14 at 15:00
  • @Darshan: Thanks for your reply, but I want something that is both scale- and rotation-invariant. You see, that board is actually fixed to a motor which makes it rotate, so the robot has to track the movement and then place the object. I am actually using an industrial robot (a Staubli RX90). I'll look into the link anyway. – SidJaw Jul 22 '14 at 15:07

2 Answers


Compensating for the difference between the image output and world coordinates:

You should use the good old camera calibration approach to calibrate the camera response and possibly generate a correction matrix for the camera output (in order to convert it into real scales).

It's not that complicated once you have printed out a checkerboard template and captured various shots. (For this application you don't need to worry about rotation invariance; just calibrate the world view against the image array.)

You can find more information here: http://www.vision.caltech.edu/bouguetj/calib_doc/htmls/own_calib.html
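To make it concrete, here is a minimal sketch of that procedure (it assumes a 9x6 inner-corner checkerboard with 25 mm squares; adjust those to your printed template):

    #include <opencv2/calib3d/calib3d.hpp>
    #include <opencv2/imgproc/imgproc.hpp>
    #include <vector>

    void calibrate(const std::vector<cv::Mat>& shots)  // grayscale views
    {
        cv::Size board(9, 6);                          // inner corners
        std::vector<std::vector<cv::Point3f> > objectPts;
        std::vector<std::vector<cv::Point2f> > imagePts;

        // The checkerboard corners in world units (metres), on the z = 0 plane.
        std::vector<cv::Point3f> grid;
        for (int y = 0; y < board.height; y++)
            for (int x = 0; x < board.width; x++)
                grid.push_back(cv::Point3f(x * 0.025f, y * 0.025f, 0));

        for (size_t i = 0; i < shots.size(); i++) {
            std::vector<cv::Point2f> corners;
            if (cv::findChessboardCorners(shots[i], board, corners)) {
                cv::cornerSubPix(shots[i], corners, cv::Size(11, 11), cv::Size(-1, -1),
                                 cv::TermCriteria(CV_TERMCRIT_EPS + CV_TERMCRIT_ITER, 30, 0.1));
                imagePts.push_back(corners);
                objectPts.push_back(grid);
            }
        }

        cv::Mat K, distCoeffs;                         // intrinsics and distortion
        std::vector<cv::Mat> rvecs, tvecs;
        cv::calibrateCamera(objectPts, imagePts, shots[0].size(), K, distCoeffs, rvecs, tvecs);
    }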

--

Now, since I can't seem to comment on the question, I'd like to ask whether your specific application requires the machine to find out the shape of the hole on the fly. If there is only a finite set of hole shapes, you can model them mathematically and look for the pixels that support the predefined models in the B/W edge image.

For example, x^2 + y^2 - r^2 = 0 for a circle of radius r centred at the origin, where x and y are the pixel coordinates.
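For the circle case, the Hough transform is the classic way to collect the pixels that support such a model; a sketch (the thresholds and radius range below are placeholders to be tuned):

    #include <opencv2/imgproc/imgproc.hpp>
    #include <vector>

    void findCircularHoles(const cv::Mat& gray)  // 8-bit single-channel image
    {
        std::vector<cv::Vec3f> circles;          // (centre x, centre y, radius)
        cv::HoughCircles(gray, circles, CV_HOUGH_GRADIENT,
                         1,       // accumulator resolution == image resolution
                         30,      // minimum distance between detected centres
                         100,     // upper threshold of the internal Canny stage
                         20,      // accumulator votes needed to accept a circle
                         5, 50);  // min/max radius in pixels
    }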

That being said, I believe more clarification is needed regarding the requirements of the application (shape detection).

ahmet
  • Hi, thanks for your reply, but I have already obtained the camera intrinsic and distortion coefficient matrices through camera calibration. I am lost as to what to do next. – SidJaw Jul 27 '14 at 20:24
  • Hello, I don't know if it's appropriate to ask for updates here, but have you progressed so far? – ahmet Jul 18 '15 at 13:20

If you're going to detect specific shapes such as the ones in your provided image, then you're better off using a classifier. Delve into Haar classifiers or, better still, look into Bag of Words (BoW).

Using BoW, you'll need to train on a dataset consisting of positive and negative samples. The positive set should contain N unique samples of each shape you want to detect; N should be greater than 10, ideally greater than 100, with highly variant and unique samples, for good, robust classifier training.

Negative samples would, obviously, contain stuff that does not represent your shapes in any way; they are used to check the accuracy of the classifier.

Also, once you have your classifier trained, you can distribute the classifier data (say, a trained SVM model).
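As a rough sketch of what that pipeline can look like with OpenCV's built-in BoW classes (2.4-style API with SURF from the nonfree module; the vocabulary size and labels layout are placeholders):

    #include <opencv2/features2d/features2d.hpp>
    #include <opencv2/nonfree/nonfree.hpp>
    #include <opencv2/ml/ml.hpp>
    #include <vector>

    void trainBow(const std::vector<cv::Mat>& trainImgs, const cv::Mat& labels)
    {
        cv::SurfFeatureDetector detector(400);
        cv::Ptr<cv::DescriptorExtractor> extractor(new cv::SurfDescriptorExtractor());
        cv::Ptr<cv::DescriptorMatcher> matcher(new cv::FlannBasedMatcher());

        // 1. Cluster all training descriptors into a visual vocabulary.
        cv::BOWKMeansTrainer bowTrainer(100);        // 100 visual words
        for (size_t i = 0; i < trainImgs.size(); i++) {
            std::vector<cv::KeyPoint> kp;
            cv::Mat desc;
            detector.detect(trainImgs[i], kp);
            extractor->compute(trainImgs[i], kp, desc);
            if (!desc.empty()) bowTrainer.add(desc);
        }
        cv::Mat vocabulary = bowTrainer.cluster();

        // 2. Encode every image as a histogram over the vocabulary.
        cv::BOWImgDescriptorExtractor bowExtractor(extractor, matcher);
        bowExtractor.setVocabulary(vocabulary);
        cv::Mat trainData;
        for (size_t i = 0; i < trainImgs.size(); i++) {
            std::vector<cv::KeyPoint> kp;
            cv::Mat hist;
            detector.detect(trainImgs[i], kp);
            bowExtractor.compute(trainImgs[i], kp, hist);
            trainData.push_back(hist);
        }

        // 3. Train an SVM on the histograms (labels: one class id per image, CV_32F).
        CvSVM svm;
        svm.train(trainData, labels);
        svm.save("shape_classifier.xml");
    }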

Here are some links to get you started with Bag of Words: https://gilscvblog.wordpress.com/2013/08/23/bag-of-words-models-for-visual-categorization/

Sample code: http://answers.opencv.org/question/43237/pyopencv_from-and-pyopencv_to-for-keypoint-class/

bad_keypoints