21

The problem I am working on is extracting text from an image, and for this I am using Tesseract v3.02. The sample images I have to extract text from are meter readings. Some have a solid sheet background and some have an LED display. I have trained a dataset for the solid sheet background and the results are somewhat effective.

The major problem I have now is with text images on an LED/LCD background, which Tesseract does not recognize, so no training set is generated for them.

Can anyone point me in the right direction on how to use Tesseract with a seven-segment display (LCD/LED background), or is there an alternative I can use instead of Tesseract?

[Images: LED background image 1, LED background image 2, Meter 1 with solid sheet background]

yunas
  • 4,143
  • 1
  • 32
  • 38

3 Answers

6

https://github.com/upupnaway/digital-display-character-rec/blob/master/digital_display_ocr.py

Did this using OpenCV and Tesseract and the "letsgodigital" trained data.

Steps include edge detection and extracting the display using the largest contour, then thresholding the image with Otsu's method (or simple binarization) and passing it through pytesseract's image_to_string function.
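The Otsu thresholding step mentioned above can be sketched in plain Python. This is an illustrative implementation, not the code from the linked script (OpenCV's `cv2.threshold` with `cv2.THRESH_OTSU` does the same in one call); `pixels` is assumed to be a flat list of 0-255 grayscale values.

```python
def otsu_threshold(pixels):
    """Return the intensity t that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)

    sum_all = sum(i * h for i, h in enumerate(hist))
    sum_bg = 0.0       # running sum of intensities in the background class
    weight_bg = 0      # running count of background pixels
    best_t, best_var = 0, -1.0

    for t in range(256):
        weight_bg += hist[t]
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Unnormalized between-class variance; larger means a cleaner split.
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

Pixels above the returned threshold are foreground; the binarized image is what gets passed to `image_to_string`.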

Raymond Ma
  • 71
  • 1
  • 3
5

This seems like an image preprocessing task. Tesseract would really prefer its images to all be white-on-black text in bitmap format. If you give it something that isn't that, it will do its best to convert it to that format. It is not very smart about how to do this. Using some image manipulation tool (I happen to like imagemagick), you need to make the images more to tesseract's satisfaction. An easy first pass might be to do a small-radius gaussian blur, threshold at a pretty low value (you're trying to keep only black, so 15% seems right), and then invert the image.
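With ImageMagick, that first pass might look like the following commands (the blur radius and the 15% threshold are just the starting values suggested above; tune them per image set):

```shell
# Small-radius gaussian blur, low threshold to keep only the dark text,
# then invert so Tesseract sees white-on-black.
convert meter.png -gaussian-blur 0x2 -threshold 15% -negate preprocessed.png
tesseract preprocessed.png out
```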

The hard part then becomes knowing which preprocessing task to do. If you have metadata telling you what sort of display you're dealing with, great. If not, I suspect you could look at image color histograms to at least figure out whether your text is white-on-black or black-on-color. If these are the only scenarios, white-on-black is always solid background, and black-on-color is always seven-segment-display, then you're done. If not, you'll have to be clever. Good luck, and please let us know what you come up with.

Mongoose1021
  • 268
  • 1
  • 6
  • http://stackoverflow.com/questions/9361213/7-segment-display-ocr?rq=1 this stackoverflow question has a link to a c script for reading seven-segment independent of OCR. Probably also worth a look. – Mongoose1021 Jul 17 '13 at 16:30
  • I am using GPUImageLibrary https://github.com/BradLarson/GPUImage. I did exactly as you suggested: I applied a gaussian blur, but instead of inverting I sharpened the blurred image. It worked to some extent, but it fails for the fourth image I added to the question. What sort of filters should be applied? – yunas Jul 23 '13 at 06:43
  • Is it possible to remove the LED background? – yunas Jul 24 '13 at 04:23
  • 1
    The difficult thing about the fourth image is that the background brightness decreases from left to right. I was able to solve this using local adaptive thresholding, called in imagemagick by the function -lat. The idea is to average the pixels in the surrounding area and construct a local threshold value that will separate the foreground from the background. If GPUImageLibrary doesn't have that, it shouldn't be too hard to write yourself. It has the added benefit of still working on flat-background images. On that image, a local adaptive threshold of radius 60-80 pixels worked well. – Mongoose1021 Jul 24 '13 at 16:36
  • Yes, you are right. I applied a Gaussian blur on the image and then an adaptive threshold, and the grain/background is removed. – yunas Jul 25 '13 at 06:42
  • But now I am facing a strange problem: wrong recognition of the characters. For instance, the 5th image I attached is recognized wrongly by tesseract; it returns "n n g g g q" and I can't map it to "0 0 0 3 8 9" because of the repeated "g". Any idea how I can fix this? – yunas Jul 25 '13 at 07:23
  • How are you calling tesseract? From the command line, there are two things you can do. First, make sure it knows there is only one line of text by setting pagesegmode to 7. Second, tell it that every character is a digit by including the config file "digits." The command should look like this: `tesseract img.png out -psm 7 digits` – Mongoose1021 Jul 25 '13 at 16:26
  • Is this working out ok? I'm worried that Tesseract might still not recognize the split zeroes, but I have a few ideas for how to deal with that, if you need them. – Mongoose1021 Jul 31 '13 at 16:08
  • I am sorry I didn't reply for a while; I was facing a problem with the Leptonica installation... Unfortunately all of the above SS images return wrong results... – yunas Jul 31 '13 at 20:03
  • btw, tesseract-ocr-3.02.eng.tar.gz is what I am using... is there another one for SS images? – yunas Jul 31 '13 at 20:19
  • There's not another tesseract install that'll work better, no. Though there's always the c script for reading SSD I mentioned earlier. What are you doing that is giving wrong results for everything? – Mongoose1021 Aug 01 '13 at 06:36
  • I didn't do anything special... this is my current installation: http://tny.cz/a23ea9ff. I run the command `tesseract 000389.png out -psm 7 digits` and the output is 333339, which is strange, and for 004200.png the output is 55 333 – yunas Aug 01 '13 at 07:27
  • http://chat.stackoverflow.com/rooms/34599/tesseract please join so we can chat there... – yunas Aug 01 '13 at 08:53
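The local adaptive thresholding described in the comments above (ImageMagick's `-lat`) can be sketched in plain Python: each pixel is compared against the mean of its surrounding window, which copes with a background whose brightness drifts across the image. The function name and parameters here are illustrative, not taken from any of the linked scripts.

```python
def local_adaptive_threshold(img, radius, offset=0):
    """Binarize a 2D grid of intensities: a pixel is foreground (1)
    when it is at least `offset` below the mean of the (2*radius+1)^2
    window around it (i.e. dark text on a brighter background)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Mean over the neighborhood window, clipped at the borders.
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [img[yy][xx] for yy in ys for xx in xs]
            mean = sum(vals) / len(vals)
            out[y][x] = 1 if img[y][x] <= mean - offset else 0
    return out
```

Because the threshold is computed per pixel, it also still works on flat-background images, as noted in the comments; the 60-80 pixel radius mentioned there corresponds to `radius` here.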
4

Take a look at this project:

https://github.com/arturaugusto/display_ocr

There you can download trained data for a seven-segment font and a Python script with some pre-processing capabilities.

art
  • 181
  • 1
  • 9