
I have clipped a portion of an image which contains some text.

enter image description here

It is visible to the human eye that the text reads CITX 701771.

However, when I zoom into the image, the edges of the characters don't look that distinct anymore. Some of the letters get joined together and it is messing up my OCR results.

enter image description here

What I am trying to do is smooth the edges of the text so that, in the zoomed image, all the letters are clearly separated, i.e. the edges of letters such as I and T become smoother and, hopefully, are no longer joined together.

I tried Gaussian blurring techniques after going through this post, but they just produced an even more blurred image.

The original image for reference

enter image description here

Piyush
  • Can you provide the original image? – Sagar Gautam Mar 06 '17 at 04:32
  • @Sagar: Added to the original post – Piyush Mar 06 '17 at 04:51
  • I'm not sure but enhancement might help you. Some other methods like edge enhancement may help you. – Sagar Gautam Mar 06 '17 at 04:59
  • The symbols are about 8x10 pixels. That's really small -- I can't quote you the paper (it was related to ALPR), but from what I recall successful segmentation and recognition needed at least double that or more. Your camera seems to be way too far away; could you get it closer? Or use a higher resolution: I'd aim for 20-30 pixels in height for the symbols. – Dan Mašek Mar 06 '17 at 05:02
  • @Piyush Basically, what I'm saying is that having input of appropriate quality is essential for any CV system (and there are many factors that influence that). Lower quality input either makes it harder to achieve your goal, or outright impossible. In your case, out of the 720 rows in your image, the rail vehicles can appear only in some 180 rows (25%) -- the rest is wasted on sky and ground. With better positioning or a different lens you could capture your object of interest in more detail. – Dan Mašek Mar 06 '17 at 05:16
  • @DanMašek: I agree about the distance of the camera from the train. However, due to some locational constraints we weren't able to place our cameras any closer, and now I am trying to make the best lemonade out of these lemons. We do plan to use a higher-resolution camera in the future. However, is there no hope for this clipped image, and is there any way to smooth out its edges? My goal is to smooth out the character edges and then feed the image to an OCR system to improve the accuracy of character detection. – Piyush Mar 06 '17 at 05:22
  • @Piyush OK, but also consider choosing an appropriate lens (a longer focal length than what you use now). Usually with higher resolution the price of the camera goes up, and the frame rate goes down. You need to find the right balance of all the parameters to be able to capture what you need. | As to getting the best with what you have -- it might be a lot of work. First of all I'd try to get rid of the perspective distortion and any rotation so that the text is horizontal. Next, scale the ROI (the bit with text) up by some factor (start with 5, experiment) with good interpolation (CUBIC). ... – Dan Mašek Mar 06 '17 at 05:41
  • Normalize the ROI, so the full dynamic range is used. See how the thresholding works then. – Dan Mašek Mar 06 '17 at 05:43
  • @DanMašek: Sorry I am new to CV. What is meant by normalizing the region of interest? I already binarized the image. Is there some other pre-processing that needs to be done? – Piyush Mar 06 '17 at 05:47
  • @Piyush Yes, a fair amount of preprocessing, I'd say. Check out [this answer](http://stackoverflow.com/a/36412223/3962537) for some ideas of some of the preprocessing steps. (Keep in mind, that answer was just a tip of the iceberg in getting a decent enough system on a large input set). You should be able to find tutorials on perspective correction and rotation with OpenCV. – Dan Mašek Mar 06 '17 at 06:13
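
For readers wondering what "normalize the ROI, so the full dynamic range is used" might look like in practice, here is a minimal pure-Python sketch, not the commenter's code. It assumes the ROI is a grayscale image stored as a list of rows of integers in 0..255. The function names and the tiny sample ROI are hypothetical, and Otsu's method is swapped in as one common way to pick a threshold; the thread does not say which thresholding method was used. The sketch does not cover perspective correction or cubic upscaling.

```python
# Hypothetical helpers illustrating normalization followed by thresholding.
# Assumption: `roi` is a grayscale image as a list of rows of ints in 0..255.

def normalize(roi):
    """Linearly stretch pixel values so the darkest maps to 0 and the brightest to 255."""
    lo = min(min(row) for row in roi)
    hi = max(max(row) for row in roi)
    if hi == lo:  # flat image: there is no contrast to stretch
        return [[0 for _ in row] for row in roi]
    scale = 255.0 / (hi - lo)
    return [[int(round((p - lo) * scale)) for p in row] for row in roi]


def otsu_threshold(roi):
    """Return the threshold t that maximizes the between-class variance
    of the two classes 'pixel <= t' and 'pixel > t' (Otsu's method)."""
    hist = [0] * 256
    for row in roi:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0  # pixel count and intensity sum of the dark class
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no dark pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already in the dark class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(roi, t):
    """Map pixels brighter than t to 255 and all others to 0."""
    return [[255 if p > t else 0 for p in row] for row in roi]


# Made-up low-contrast sample: values only span 100..142.
sample_roi = [
    [100, 105, 140],
    [102, 138, 142],
    [101, 104, 139],
]
stretched = normalize(sample_roi)
threshold = otsu_threshold(stretched)
binary = binarize(stretched, threshold)
for row in binary:
    print(row)
```

With OpenCV, the same steps would more likely be done with library routines such as `cv2.resize` (with `cv2.INTER_CUBIC`), `cv2.normalize`, and `cv2.threshold`; the pure-Python version above is only meant to make the idea concrete.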
