
I am new to Python and OpenCV. I am currently working on OCR using Python and OpenCV, without using Tesseract. So far I have been successful in detecting the text (characters and digits), but I am having a problem detecting the spaces between words. For example, if the image says "Hello John", my code detects the characters but not the space between the words, so the output is "HelloJohn". My code for extracting the contours goes like this (I have imported all the required modules; this is the main part that extracts the contours):

 imgGray = cv2.cvtColor(imgTrainingNumbers, cv2.COLOR_BGR2GRAY)
 imgBlurred = cv2.GaussianBlur(imgGray, (5,5), 0)                        


 imgThresh = cv2.adaptiveThreshold(imgBlurred,                           
                                  255,                                  
                                  cv2.ADAPTIVE_THRESH_GAUSSIAN_C,       
                                  cv2.THRESH_BINARY_INV,                
                                  11,                                   
                                  2)                                    

 cv2.imshow("imgThresh", imgThresh)      

 imgThreshCopy = imgThresh.copy()        

 imgContours, npaContours, npaHierarchy = cv2.findContours(imgThreshCopy,        
                                             cv2.RETR_EXTERNAL,                 
                                             cv2.CHAIN_APPROX_SIMPLE)           

After this I classify the extracted contours into digits and characters. Please help me detect the spaces between them. Thank you in advance; your reply would be really helpful.
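Once the per-character contours are classified, one way to recover the spaces, without changing the contour pipeline, is to sort the character bounding boxes left to right and insert a space wherever the horizontal gap between neighbouring boxes is unusually wide. This is an illustrative sketch, not the asker's code: `group_words`, the example boxes, and the threshold value are all made up, and a real threshold would be derived from the character widths in the actual image:

```python
def group_words(boxes, gap_threshold):
    """Split (x, y, w, h) character boxes into words wherever the
    horizontal gap between consecutive boxes exceeds gap_threshold."""
    boxes = sorted(boxes)              # sorts by x first
    words, current = [], [boxes[0]]
    for prev, box in zip(boxes, boxes[1:]):
        if box[0] - (prev[0] + prev[2]) > gap_threshold:
            words.append(current)      # wide gap: start a new word
            current = []
        current.append(box)
    words.append(current)
    return words

# hypothetical boxes for "Hello John": 5 letters, a wide gap, 4 letters
boxes = [(10, 0, 18, 30), (30, 0, 18, 30), (50, 0, 6, 30),
         (60, 0, 6, 30), (70, 0, 18, 30),
         (130, 0, 18, 30), (150, 0, 18, 30), (170, 0, 18, 30),
         (190, 0, 18, 30)]

words = group_words(boxes, gap_threshold=25)
print([len(w) for w in words])  # → [5, 4]
```

With these boxes the intra-word gaps are 2–4 px and the word gap is 42 px, so a threshold of 25 splits the boxes into a 5-letter and a 4-letter group.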

  • So you have your binary image with the letters on it. One approach would be to grow the letters with [dilation](https://docs.opencv.org/3.3.0/d9/d61/tutorial_py_morphological_ops.html) until nearby characters merge, but separate words don't. Then you'll have separate blobs for each word. The contours of those blobs would each be a mask for a single word, and you can mask the original image with each blob individually to do the OCR on separate words. – alkasm Dec 04 '17 at 07:09

1 Answer


Since you did not give any example images, I just generated a simple image to test with:

import cv2
import numpy as np

h, w = 100, 600
img = np.zeros((h, w), dtype=np.uint8)
font = cv2.FONT_HERSHEY_SIMPLEX
cv2.putText(img, 'OCR with OpenCV', (30, h-30), font, 2, 255, 2, cv2.LINE_AA)

Test image

As I mentioned in the comments, if you simply dilate the image, then the white areas will expand. If you do this with a large enough kernel so that nearby letters merge, but small enough that separate words do not, then you'll be able to extract contours of each word and use that to mask one word at a time for OCR purposes.

kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
dilated = cv2.dilate(img, kernel)

Dilated image

To get the mask of each word individually, just find the contours of these larger blobs. You can sort the contours too, vertically, horizontally, or both, so that you get the words in the proper order. Since I have just a single line here, I'll sort only in the x direction:

# index [1] is the contour list in OpenCV 3.x (OpenCV 4.x returns it at [0])
contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[1]
contours = sorted(contours, key=lambda c: c[:, :, 0].min())  # sort left to right

for i in range(len(contours)):

    mask = np.zeros((h, w), dtype=np.uint8)

    # i is the contour to draw, -1 means fill the contours
    mask = cv2.drawContours(mask, contours, i, 255, -1)
    masked_img = cv2.bitwise_and(img, img, mask=mask)

    cv2.imshow('Masked single word', masked_img)
    cv2.waitKey()

    # do your OCR here on the masked image

Word 1

Word 2

Word 3

alkasm
  • Thank you so much, your code worked... you removed a big burden from my head. Again, thank you :) – Rishabh Aggarwal Dec 04 '17 at 09:18
  • Can you help me extend the code to the case where the text spans multiple lines instead of just one, e.g. like this comment? – Rishabh Aggarwal Dec 04 '17 at 12:18
  • I suggest taking a similar approach to what I did above. Try dilating with a long horizontal line kernel: `kernel = np.ones((1, 100), dtype=np.uint8)` and see what you get as a response. This will give you a mask for each line. Simply mask each line and then repeat the above for that line. – alkasm Dec 04 '17 at 13:00
  • Once again, thank you so much... now my project is ready to be submitted, all thanks to you :) – Rishabh Aggarwal Dec 04 '17 at 16:04
  • Sir, we are trying to extend our project to currency detection using the same tools. Is it possible with the same approach? I have tried to extract the edges from the currency by eroding it with a kernel, but due to the background I am not able to detect the contour corresponding to the number, e.g. 100. Can you please help me in any way? – Rishabh Aggarwal Dec 13 '17 at 14:24