
What approach would you recommend for finding obstacles in a 2D image?

Here are the key points I have come up with so far:

I doubt I can use object recognition based on a "database of obstacles" search, since I don't know what the obstruction might look like. I assume color recognition would be problematic if the path does not differ much in color from the obstacle itself.

Possibly, adding a second camera and computing a 3D image (as a Kinect does) would work, but that would not run as smoothly as I require.

To illustrate the problem: the robot can drive on either the left or the right side of the pavement. In the following picture, the left side is the correct choice: [image]

Mikulas Dite
  • The Kinect has an IR sensor and one RGB camera. Since you're developing a robot, I guess it's better to equip it with a sensor too, isn't it? – Andrey Sboev May 15 '11 at 17:17
  • @Andrey: The Kinect has an IR sensor and an IR texture projector (which combined allow it to compute depth information) and an RGB camera that it can sync with the depth information so that you know how far in front of the Kinect each RGB pixel sensed is. Just adding an IR sensor isn't quite enough to get a Kinect-like setup. – Eric Perko May 16 '11 at 04:13
  • @Eric, thanks for your note. I think it's not necessary to get a Kinect-like setup but only to detect obstacles. Many cheap and simple robots do it using IR. Am I right? – Andrey Sboev May 16 '11 at 05:07
  • @Andrey: Yup they do do it with IR, though that is with very simple IR rangers that send out one "ping". These IR rangers generally have a pretty wide field of view (and may not return off darker objects or materials that absorb IR), so they can't give you the same information density that a Kinect can. For the price, the Kinect has an amazing amount of information available. One plus for the simpler rangers is that you can use them easily with a microprocessor like the Arduino since they often have simple analog or digital outputs vs. the USB 2.0 on the Kinect. – Eric Perko May 18 '11 at 09:17

1 Answer


If you know what the path looks like, this is largely a classification problem. Acquire a set of images of the path at different distances, under different illumination, etc., and manually label the ground in each image. Use this labeled data to train a classifier that labels each pixel as either "road" or "not road." Depending upon the texture of the road, this could be as simple as classifying each pixel's RGB (or HSV) values or using OpenCV's built-in histogram back-projection (e.g. cv::calcBackProject(), or cvCalcBackProjectPatch() in the legacy C API).
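To make the back-projection idea concrete, here is a minimal sketch in plain Python rather than OpenCV. It assumes the image has already been converted to a grid of hue values in the 0-179 range that OpenCV uses for 8-bit HSV images. The function names, the bin count, the 0.5 threshold, and the sample hues are all hypothetical choices for illustration, not something taken from the answer or from OpenCV's API.

```python
from collections import Counter


def hue_histogram(road_hues, bins=18, max_hue=180):
    """Build a peak-normalized hue histogram from hand-labeled road pixels."""
    if not road_hues:
        raise ValueError("need at least one labeled road pixel")
    width = max_hue / bins
    counts = Counter(min(int(h / width), bins - 1) for h in road_hues)
    peak = max(counts.values())
    return [counts.get(b, 0) / peak for b in range(bins)]


def back_project(hue_image, hist, max_hue=180):
    """Replace each pixel's hue with the histogram score of its bin."""
    bins = len(hist)
    width = max_hue / bins
    return [[hist[min(int(h / width), bins - 1)] for h in row]
            for row in hue_image]


def road_mask(hue_image, hist, threshold=0.5):
    """True where the back-projected score says 'looks like road'."""
    return [[score >= threshold for score in row]
            for row in back_project(hue_image, hist)]


# Hypothetical hues sampled from manually labeled road regions.
road_samples = [20, 22, 25, 21, 23, 24, 26, 20]
hist = hue_histogram(road_samples)

# Tiny 2x3 test "image": hues near 20 are road, hues near 110 are not.
image = [[21, 23, 110],
         [22, 115, 24]]
mask = road_mask(image, hist)
print(mask)
```

On a real frame you would let OpenCV do this work on whole arrays; the loops here only show what the back-projection computes per pixel.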

I suggest beginning with manual thresholds, moving on to histogram-based matching, and only using a full-fledged machine learning classifier (such as a Naive Bayes classifier or an SVM) if the simpler techniques fail. Once the entire image is classified, all pixels identified as "not road" are obstacles. By classifying the road instead of the obstacles, you completely avoid building a "database of objects".
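As a rough illustration of the first step (manual thresholds), the sketch below marks every pixel outside a hand-picked HSV box as an obstacle and then compares the two halves of the image. The threshold values, the sample pixels, and the "pick the half with fewer obstacle pixels" rule are assumptions made up for this example; the rule is just one simple way to turn the mask into the left/right decision from the question.

```python
def is_road(pixel, lower, upper):
    """True when every HSV channel lies inside the manual [lower, upper] box."""
    return all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper))


def obstacle_mask(hsv_image, lower, upper):
    """Everything that is not classified as road counts as an obstacle."""
    return [[not is_road(p, lower, upper) for p in row] for row in hsv_image]


def clearer_side(mask):
    """Compare obstacle-pixel counts in the left and right image halves."""
    mid = len(mask[0]) // 2
    left = sum(sum(row[:mid]) for row in mask)
    right = sum(sum(row[mid:]) for row in mask)
    return "left" if left <= right else "right"


# Hypothetical thresholds for grey pavement: any hue, low saturation,
# medium brightness. Tune these by hand on your own images.
ROAD_LOWER = (0, 0, 80)
ROAD_UPPER = (179, 60, 200)

GREY = (0, 10, 128)   # looks like pavement
RED = (0, 220, 180)   # strongly saturated, so "not road"

frame = [
    [GREY, GREY, GREY, RED],
    [GREY, GREY, RED, RED],
]
mask = obstacle_mask(frame, ROAD_LOWER, ROAD_UPPER)
side = clearer_side(mask)
print(side)
```

If thresholds like these turn out to be too brittle under changing light, that is the point at which histogram matching or a trained classifier becomes worth the effort.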


Somewhat outside the scope of the question: the easiest solution is to add sensors ("throw more hardware at the problem!") and directly measure the three-dimensional position of obstacles. In order of preference:

  1. Microsoft Kinect: Cheap, easy, and effective. Because ambient IR light (e.g. sunlight) interferes with its projected IR pattern, it only works reliably indoors.
  2. Scanning Laser Rangefinder: Extremely accurate, easy to set up, and works outside. Also very expensive (~$1200-10,000 depending upon maximum range and sample rate).
  3. Stereo Camera: Not as good as a Kinect, but it works outside. If you cannot afford a pre-made stereo camera (~$1800), you can make a decent custom stereo camera using USB webcams.

Note that professional stereo vision cameras can be very fast by using custom hardware (Stereo On-Chip, STOC). Software-based stereo is also reasonably fast (10-20 Hz) on a modern computer.
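For the stereo option, the step that turns a matched pixel pair into a distance is the standard pinhole relation Z = f * B / d (focal length in pixels, baseline in metres, disparity in pixels). The sketch below assumes a calibrated, rectified camera pair and that the disparities have already been computed by some stereo matcher; the numbers and function names are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Distance in metres to a point, from its stereo disparity in pixels."""
    if disparity_px <= 0:
        # No valid match, or the point is effectively at infinity.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


def nearest_obstacle_m(disparities_px, focal_length_px, baseline_m):
    """Smallest depth among a list of per-pixel disparities."""
    return min(depth_from_disparity(d, focal_length_px, baseline_m)
               for d in disparities_px)


# Hypothetical rig: 800 px focal length, webcams mounted 12.5 cm apart.
FOCAL_PX = 800
BASELINE_M = 0.125

print(depth_from_disparity(50, FOCAL_PX, BASELINE_M))
print(nearest_obstacle_m([50, 25, 0], FOCAL_PX, BASELINE_M))
```

Larger disparities mean closer objects, so a robot can stop or steer away whenever the nearest depth in its path drops below some safety distance.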

Michael Koval