CBIR indexing strategy

Question

I'm developing a CBIR solution to be integrated into a license plate recognition application. The image matching algorithm is very robust, but as you can imagine the database is huge, and retrieving images from it for matching is really slow. I've tried to quantize each image into something like a small local feature vector, or even a single numerical value, but without success. The idea is to index such a value so that retrieval is really fast while greatly reducing the number of matching candidates. I've read a lot of papers on the subject, but most of them propose classification and machine learning as the solution. I don't see how classification can be useful here, because all the images are pretty similar to each other (license plate pictures). I would like to discuss ideas with someone who's had a similar problem in the past, or who has some clue on how I can solve this. I've been trying to engineer my way out of this performance issue for a long time, but without much success.

No, it's actually an image fingerprinting application...given two images, we determine if it's the same license plate. — Rafael Matos, Feb 25 '13 at 15:47

score 1 · Answer 1 · answered Feb 25 '13 at 17:47

Given the additional information in the comments, I would solve the problem in the following way:

1. Detect and segment the plate in the image.
2. Apply OCR to extract a string with the letters and numbers from the plate.
3. To verify whether two images correspond to the same license plate, compare the two strings. Note also that strings are much easier and more efficient to index than multi-dimensional feature vectors.
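The comparison and indexing steps above could be sketched roughly as follows. This is only a minimal illustration that assumes the OCR step has already produced a raw string for each image; `normalize_plate`, `PlateIndex`, and `same_plate` are hypothetical names made up for this example, and the normalization rule (uppercase, letters and digits only) is an assumption that would need adjusting to the real plate formats.

```python
import re
from collections import defaultdict
from typing import List


def normalize_plate(text: str) -> str:
    """Uppercase the OCR output and drop everything except A-Z and 0-9.

    Assumption: separators, spaces and case carry no identifying information.
    """
    return re.sub(r"[^A-Z0-9]", "", text.upper())


def same_plate(ocr_a: str, ocr_b: str) -> bool:
    """Two images show the same plate if their normalized strings are equal."""
    return normalize_plate(ocr_a) == normalize_plate(ocr_b)


class PlateIndex:
    """Hypothetical in-memory index: normalized plate string -> image ids."""

    def __init__(self) -> None:
        self._by_plate = defaultdict(list)

    def add(self, image_id: str, ocr_text: str) -> None:
        # Store the image under its normalized plate string.
        self._by_plate[normalize_plate(ocr_text)].append(image_id)

    def candidates(self, ocr_text: str) -> List[str]:
        # .get() avoids creating empty entries for plates never seen before.
        return list(self._by_plate.get(normalize_plate(ocr_text), []))


if __name__ == "__main__":
    index = PlateIndex()
    index.add("img1", "abc-1234")
    index.add("img2", "ABC 1234")
    index.add("img3", "XYZ999")
    print(index.candidates("Abc1234"))
```

In a real database the same idea would presumably become an ordinary index on a string column holding the normalized plate, so a lookup is an exact-key search rather than a scan over feature vectors.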

Alceu Costa

That is a very good suggestion and one we have already thought of. It would indeed solve all our problems, but the thing is, because of business model related stuff, the fingerprinting has to be totally independent from any OCR module :\ But again, a very good suggestion. – Rafael Matos Feb 25 '13 at 17:53
Why does it have to be independent of an OCR module? Is it because you cannot use an external library? If the OCR algorithm were implemented by yourselves, would it be a valid solution? – Alceu Costa Feb 25 '13 at 18:10
The point is that any feature that you extract from the images (i.e. feature vector) will not be as robust for your task as the license plate characters. – Alceu Costa Feb 25 '13 at 18:12
Exactly! The reason we can't use it is some business logic imposed by my boss :\ Besides, OCR algorithms wouldn't work that well on U.S. vanity plates. I'm currently researching image shape quantization to see if I can find something. – Rafael Matos Feb 25 '13 at 18:14
