
I want to extract the text from the following image (image 1):

[Image 1: the image I want to extract text from]

After I apply Otsu's binarisation and adaptive thresholding, the image is converted to black and white (image 2). I am stuck on how to extract the characters from this black-and-white image.

[Image 2: the result after applying Otsu's method and adaptive thresholding]

Here is the code for the above:

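import cv2
import pytesseract

# Point pytesseract at the Tesseract executable (Windows install path).
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Read the image as grayscale (flag 0).
img = cv2.imread("puberty.jpg", 0)

# Otsu selects the threshold automatically, so the value passed in is ignored; 0 is conventional.
ret, thresh = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print("Threshold selected : ", ret)
cv2.imwrite("./pu.jpg", thresh)

# Run OCR on the saved binary image, treating it as a single uniform block of text (--psm 6).
data = pytesseract.image_to_string("pu.jpg", lang='eng', config='--psm 6')
print(data)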

I am using pytesseract to extract the characters, but it does not work.

  • Have you tried this: https://stackoverflow.com/questions/23506105/extracting-text-opencv/51436780#51436780. I know it's in C++ but generally the API transfers pretty well. – Warpstar22 Jan 14 '20 at 22:53
  • Since the text overlaps, but is in different colors, I would suggest using inRange() to threshold each different color. Then combine the thresholded images or find the text characters for each color separately. – fmw42 Jan 14 '20 at 23:30

0 Answers