
Sample Image

I'm building a specialized OCR system that recognizes dotted numbers like the ones above. (The sample picture may not contain all the special cases; see below.) We decided to separate the number string into individual digits, recognize each digit, and then put the results together to form the final result.
The question is:
How to clearly separate all digits with OpenCV or other image algorithms?
Our difficulty lies in:
1. The image I uploaded is synthesized: it was produced from handpicked digits with slight morphing to simulate anomalies seen in actual use, e.g. some dots are merged into one blob, some dots are eroded, and some dots are displaced. We failed to determine the digit contours using morphology.
2. Sometimes a digit is skewed so much, like italics with kerning, that a "clean and complete" bounding box is impossible.
Some of the ideas we thought of are:
1. Find a way to draw slanted lines to separate the digits instead of the traditional vertical lines. We assume that these dotted numbers were originally upright monospace characters, and that only shear occurs, not rotation.
2. If there is any method better than simple morphology that can link the dots of each digit together while keeping the dots of separate digits apart, it would also be useful.
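The first idea (slanted separators under a shear-only assumption) could be sketched roughly as follows. This is only an illustration, not a tested solution for the real images: it assumes the image has already been binarized into rows of 0/1 values, that a single shear factor applies to the whole string, and that digits are separated by at least `min_gap` empty columns once the shear is undone. The function names and the `min_gap` parameter are made up for this example. It uses plain Python lists to stay self-contained; real code would more likely work on NumPy arrays from OpenCV.

```python
def foreground_points(image):
    """Return (x, y) for every non-zero pixel of a binary image (list of rows)."""
    return [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]


def sheared_profile(points, shear, height):
    """Project pixels onto the x-axis after undoing a horizontal shear.

    Each pixel is shifted in proportion to its distance from the bottom row,
    so a positive shear straightens digits whose tops lean to the right.
    Returns {column: pixel count} for occupied columns only.
    """
    cols = {}
    for x, y in points:
        col = round(x + shear * (y - (height - 1)))
        cols[col] = cols.get(col, 0) + 1
    return cols


def empty_columns(cols):
    """Count empty columns between the leftmost and rightmost occupied ones."""
    lo, hi = min(cols), max(cols)
    return sum(1 for c in range(lo, hi + 1) if c not in cols)


def best_shear(points, height, candidates):
    """Pick the candidate shear that leaves the most empty columns (assumed heuristic)."""
    return max(candidates,
               key=lambda s: empty_columns(sheared_profile(points, s, height)))


def split_digits(points, shear, height, min_gap=2):
    """Group occupied columns into (start, end) ranges separated by >= min_gap empty columns."""
    cols = sorted(sheared_profile(points, shear, height))
    groups = [[cols[0], cols[0]]]
    for c in cols[1:]:
        if c - groups[-1][1] - 1 >= min_gap:
            groups.append([c, c])   # gap wide enough: start a new digit
        else:
            groups[-1][1] = c       # small gap (e.g. between dots): same digit
    return [tuple(g) for g in groups]
```

The returned ranges are in the de-sheared frame; a boundary at column `c` there corresponds to the slanted line `x = c - shear * (y - (height - 1))` in the original image. The `min_gap` threshold has to exceed the spacing between dot columns inside a digit, otherwise a single digit is split apart. Lens or perspective distortion would violate the single-shear assumption.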
EDIT: Please don't comment below the original question. Just submit your answer. I appreciate all help, no matter how simple your answer may seem.
EDIT: Since the image I provided is somewhat idealized compared to real situations, a simple morphological operation won't solve the problem. Also, I'm looking for a solution that separates the characters; linking the dots together is not the only option.

Aurus Huang
  • Morph and project to x-axis. https://i.stack.imgur.com/feLk8.png – Kinght 金 Dec 29 '17 at 05:12
  • Can you try dilation followed by connected components, and then mask each connected component individually? – flamelite Dec 29 '17 at 05:29
  • @Silencer Your result seems promising, but it won't always be that promising in real situations. As I said in the main question, this is just a synthesized image describing a general case. We have many uncommon cases where such a simple solution just doesn't work. For example, the leg of the number 7 extends so far that it intrudes into the blank space below the previous 9. Or lens distortion and perspective distortion make the dots in the upper part horizontally closer together than those in the lower part. I won't deny that your method should help in some way, but it's not an excellent solution. – Aurus Huang Dec 29 '17 at 05:59
  • @flamelite Theoretically plausible, but in practice it faces much difficulty. The example I gave may be too ideal. – Aurus Huang Dec 29 '17 at 06:00
  • Possible duplicate of [Connect close-by dots for OCR (some hints asked, e.g. using morphological operations)](https://stackoverflow.com/questions/47391811/connect-close-by-dots-for-ocr-some-hints-asked-e-g-using-morphological-operat) – Dmitrii Z. Dec 29 '17 at 08:40
  • @AurusHuang Were you able to figure out a robust solution? – Ganesh Tata Jun 19 '19 at 11:06

0 Answers