8

I need advice on how to approach this problem. I have some image files (*.jpg, *.bmp, ...) and I need to extract data from them. The data is alphanumeric text. I work in Delphi.

TheCodeArtist
dzibul

3 Answers

13

You will need an OCR (Optical Character Recognition) library. OCR is a pretty complex procedure; I believe you wouldn't be asking this question if you knew a way to implement it yourself.

A quick Google search yielded this result; maybe it will help you: http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=1623&lngWId=7

pdinklag
  • Yes, you are right, the easier way is to find a tool that can do this work for me, but if anyone can point me to where to start solving this problem manually in code, I'd also be grateful. – dzibul Feb 07 '11 at 08:16
  • 4
    @dzibul Are you serious? This is a frightfully hard problem that huge armies of exceedingly clever people have been trying to solve since computers were invented. – David Heffernan Feb 07 '11 at 09:27
  • 2
    @dzibul if you have several man-years of free time and solid background in programming *and* academic knowledge of math, then you will find plenty of information about writing your own recognizer. Otherwise take an existing solution. – Eugene Mayevski 'Callback Feb 07 '11 at 09:32
  • 1
    @david & @eugene: Yes, I know that it is a big problem. I wondered if I could do it by cutting the letters out of the picture, then comparing the pixels of each one with those of a sample letter. The right letter would be the one with the most identical pixels. Since the text in the picture is machine text (not handwriting), I figured that it wouldn't be so hard (but I appreciate the suggestion that I'm aiming high). – dzibul Feb 07 '11 at 09:44
  • 3
    @dzibul - Maybe you'd like to have a look at 'SubRip's [source](http://sourceforge.net/projects/subrip/files/subrip/SubRip%201.50%20beta%204/). Developed with Delphi, it's a program that extracts subtitles (converts to text) from video streams. Since your letters are not handwritten a similar approach could help. – Sertac Akyuz Feb 07 '11 at 13:53
  • If the image is auto-generated, there is a solution that doesn't require OCR. I just need to know what you are recognising first. – Misha Feb 08 '11 at 01:34
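
The pixel-comparison idea raised in the comments above (cut each letter out, compare it with sample letters, keep the sample with the most identical pixels) can be sketched roughly as follows. This is only an illustration of the logic, written in Python rather than Delphi, and it rests on strong assumptions that are not from the thread: the image is already binarized into rows of `#` (ink) and `.` (background), letters are separated by at least one blank column, and every letter has exactly the same size as the templates. The names `TEMPLATES`, `IMAGE`, `split_glyphs`, `count_matches` and `recognize` are hypothetical.

```python
# Hypothetical 5x3 sample letters: '#' is ink, '.' is background.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}

# Hypothetical binarized line reading "LIT"; one ink pixel of the
# "I" (middle row) is missing to simulate noise.
IMAGE = [
    "#...###.###",
    "#....#...#.",
    "#........#.",
    "#....#...#.",
    "###.###..#.",
]


def split_glyphs(rows):
    """Cut a binarized text line into letters at fully blank columns."""
    width = len(rows[0])
    glyphs, start = [], None
    for x in range(width + 1):
        # Treat the position past the right edge as a blank column.
        blank = x == width or all(row[x] == "." for row in rows)
        if not blank and start is None:
            start = x  # a letter begins here
        elif blank and start is not None:
            glyphs.append([row[start:x] for row in rows])
            start = None  # the letter ended at column x - 1
    return glyphs


def count_matches(glyph, template):
    """Count pixel positions where the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(rows, templates=TEMPLATES):
    """Label each letter with the template sharing the most pixels."""
    text = ""
    for glyph in split_glyphs(rows):
        text += max(templates, key=lambda ch: count_matches(glyph, templates[ch]))
    return text


print(recognize(IMAGE))
```

Scanned or photographed text would additionally need thresholding, scaling to the template size, and tolerance for shifted, touching or differently rendered letters, which is where the difficulty described in the other comments comes in.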
4

Look here:

https://forums.embarcadero.com/message.jspa?messageID=29331

RBA
1

Take a look at my answer about NeuroVCL OCR here. It contains a lot of useful information and sample Delphi OCR DCU components.

Community
avra