Step by step object detection with ORB

Question

I must create an Android app that recognizes some objects from the camera (car steering wheel, car wheel). I tried a Haar classifier but without success, and I'm running out of time (it's a school project), so I decided to look for another way. I found another method that could work for my goal: ORB. I found what I should do in this answer. My problem is that things are mixed up in my head. Can you give me a step-by-step explanation of how to implement the answer from the question in the link I gave:

From extracting the feature points to training the KD tree and using it for every frame from the camera.

Bonus questions: Can you give a definition of a feature point? It's something I couldn't quite understand. Will detection be slow using ORB? I know OpenCV can be used in native Android code; wouldn't that make things faster?

I need to create this app as soon as possible. Please help!

a feature point is a "point of interest" of sorts, http://en.wikipedia.org/wiki/Feature_detection_%28computer_vision%29#Definition_of_a_feature - could be equivalent to "pixel" in your case. — zapl, Dec 05 '14 at 00:44
Also, the answer to [this question](http://stackoverflow.com/questions/14808429/classification-of-detectors-extractors-and-matchers/14912160#14912160) provides a pretty good summary of descriptors, matchers and the like. It also gives you some good combos to use if you ever feel like trying something other than ORB. — TheOmegaPostulate, Dec 09 '14 at 04:43

score 7 · Accepted Answer · answered Dec 05 '14 at 00:47

I am currently developing a similar application. I would recommend getting something working with a single reference image first for a couple of reasons:

1. It's easier to do and understand if you're just starting out, and you can change it later.
2. Android devices have limited processing power, so more reference images means lower fps.

You should have a look at the OpenCV tutorials which are quite helpful. Once you go through the “OpenCV for Android SDK” section and understand the three tutorials you can pretty easily add in functionality that will allow you to analyse the video feed.

The basic logic path I'd recommend following when making the app is:

1. Read in the reference image.
2. Create your FeatureDetector, DescriptorExtractor and DescriptorMatcher.
3. Use the first two to detect keypoints and then describe them (don't forget to convert the image to a Mat and then to greyscale first).
4. Every time you get a frame from your camera, repeat step 3 on it and then compare the keypoint descriptors of the two images (with the DescriptorMatcher from step 2).
5. Use the result to determine if there is a match (if there is, draw a box around it or something).
6. Get a new frame.
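
To make the matching and decision steps above more concrete, here is a minimal, library-free Python sketch of roughly what a brute-force matcher does with ORB descriptors. It is not the OpenCV Android API. ORB descriptors are binary strings (32 bytes each by default in OpenCV), so they are compared by Hamming distance. The function names, the ratio test, and the `ratio` and `min_matches` values are all assumptions made for illustration; they do not come from the answer and would need tuning.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def good_matches(ref_desc, frame_desc, ratio=0.75):
    """Match each reference descriptor to its nearest frame descriptor.

    A match is kept only if the nearest neighbour is clearly closer than the
    second nearest (a ratio test, added here as an assumption to filter out
    ambiguous matches). Returns (ref_index, frame_index, distance) tuples.
    """
    matches = []
    for i, d in enumerate(ref_desc):
        # Distances to every frame descriptor, nearest first.
        ranked = sorted((hamming(d, f), j) for j, f in enumerate(frame_desc))
        if len(ranked) < 2:
            continue  # the ratio test needs two candidates
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((i, j, best))
    return matches


def object_present(ref_desc, frame_desc, min_matches=10, ratio=0.75):
    """Decide "match / no match" for one frame; min_matches is a hypothetical threshold."""
    return len(good_matches(ref_desc, frame_desc, ratio)) >= min_matches
```

In the real app the descriptors would come from the DescriptorExtractor rather than being written by hand, and the reference descriptors only need to be computed once, outside the per-frame loop.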

Try getting it to work for a single object first and then add others later. Another thing you could add is a screen at the start that lets users pick what they want to search for.

Also, ORB is reasonably fast, especially compared to SIFT and SURF. I get about 3 fps on an HTC One with a single reference image.

TheOmegaPostulate


Thank you very much for your detailed answer. I have to ask: isn't 3 fps too low a frame rate? Also, if I write the code in native, wouldn't it be faster? – dephinera Dec 05 '14 at 16:37
It's not great, but if you're just doing it for a school project it should be sufficient. Also note that this is the lowest frame rate I'm getting; I have gotten up to around 20 fps. It depends very much on both the processing capabilities of your device and how many features are detected (it's really fast if there aren't many). Re writing in native: I haven't run any sort of experiment on this so I can't say for sure, but I imagine it probably would be faster (although by how much I don't know). – TheOmegaPostulate Dec 07 '14 at 22:42
Thank you. I have one more question. How do I tell the difference between a steering wheel and a car tire? Steering wheels differ from one another, so will I be able to detect most of them? I only need to tell whether it is a steering wheel, a car tire, or neither. https://www.youtube.com/watch?v=h2KHje-Pf10 This guy here can detect different faces which aren't very different, so this gives me hope. Would you (if you can) tell me how to distinguish between my objects, please? – dephinera Dec 08 '14 at 22:49
I've never really tried to pick up all of a certain type of object before, only specific ones. I think the face detection sample code in the link I provided above **might** search for any face and then see if any of the found faces are known, but I can't remember. So you might be able to do it with that, or by creating a library of sample images for both the steering wheel and the tire and doing steps 3-5 for each of them. I've never had to try, but [this](http://stackoverflow.com/questions/21849938/how-to-save-opencv-keypoint-features-to-database) seems to step you through it reasonably well. – TheOmegaPostulate Dec 09 '14 at 04:40
Sorry I keep asking, but I'm new to this stuff. Can I use the features extracted with ORB to train an SVM and then detect my object with the SVM? Do you have any references to share? I keep searching but haven't found a good explanation. – dephinera Dec 10 '14 at 13:41
It's all good, that's what this site is for. I think you could, but my knowledge in that area isn't fantastic. My current research is focused on developing new methods of object detection and I'm looking at using Sparse Representation. I'd recommend asking a new question on StackOverflow specifically about what you just asked, to see if someone with greater knowledge in this area than me is able to help. Sorry I can't be of more use... – TheOmegaPostulate Dec 11 '14 at 03:01
No problem. Thank you very much for your help and attention. Good luck with your research! – dephinera Dec 11 '14 at 09:14
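
For reference, the "library of sample images" idea from the comments above could be sketched roughly as follows, again in plain Python rather than the OpenCV Android API. Every frame is scored against each reference image under each label, and the label of the best-scoring image is returned if it clears a threshold. The `library` layout, the function names, and the `max_distance` and `min_matches` values are hypothetical, and this only recognizes objects that resemble the stored reference images; it is not claimed to generalize to every steering wheel or tire.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def count_good(ref_desc, frame_desc, max_distance=40):
    """Count reference descriptors whose nearest frame descriptor is close enough."""
    if not frame_desc:
        return 0
    return sum(
        1
        for d in ref_desc
        if min(hamming(d, f) for f in frame_desc) <= max_distance
    )


def classify(frame_desc, library, max_distance=40, min_matches=10):
    """Return the best-matching label, or None if nothing matches well enough.

    library maps a label (e.g. "steering wheel") to a list of reference
    images, each given as a list of binary descriptors.
    """
    best_label, best_score = None, 0
    for label, images in library.items():
        for ref_desc in images:
            score = count_good(ref_desc, frame_desc, max_distance)
            if score > best_score:
                best_label, best_score = label, score
    return best_label if best_score >= min_matches else None
```

Each extra reference image adds one more matching pass per frame, which is the fps cost the answer warns about.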
