
I'm using Tesseract and OpenCV, but Tesseract cannot recognize cursive writing. Is there a better way to do this with Tesseract? Some search results say it needs TensorFlow, but that is too confusing for me. Can anyone point me in the right direction?

Here is my code sample, which uses Tesseract and OpenCV:

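import cv2
import math
import numpy as np
import pytesseract

# Path to the Tesseract executable (Windows install)
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files/Tesseract-OCR/tesseract.exe'
img = cv2.imread('C:/PythonItems/PythonAI/California.jpg')
# OpenCV loads images as BGR; Tesseract expects RGB, so convert a copy for OCR
# and keep img in BGR for drawing and display.
imgRgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

print(pytesseract.image_to_string(imgRgb))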

# Detecting characters
# print(pytesseract.image_to_boxes(img))
offset = 20
imgSize = 800

hImg,wImg,_ = img.shape

imgWhite = np.ones((imgSize,imgSize,3),np.uint8) * 255

imgResize = cv2.resize(img,(imgSize,imgSize))
imgResizeShape = imgResize.shape
wGap = math.ceil((imgSize-400)/2)

# imgWhite[:,wGap:imgSize + wGap] = imgResize

# boxes = pytesseract.image_to_boxes(img)
boxes = pytesseract.image_to_data(imgResize)

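for i, row in enumerate(boxes.splitlines()):
    if i == 0:
        continue  # skip the TSV header row
    fields = row.split()
    print(fields)
    # Rows with recognized text have 12 fields:
    # ..., left, top, width, height, conf, text
    if len(fields) == 12:
        x, y, w, h = int(fields[6]), int(fields[7]), int(fields[8]), int(fields[9])

        cv2.rectangle(imgResize, (x, y), (x + w, y + h), (0, 0, 255), 1)
        cv2.putText(imgResize, fields[11], (x, y), cv2.FONT_HERSHEY_COMPLEX, 1, (50, 50, 255), 2)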

cv2.imshow('Result',imgResize)
cv2.waitKey(0)
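
For reference, pytesseract.image_to_data returns a tab-separated table by default, so splitting each row on whitespace and counting fields is fragile. Below is a minimal sketch of parsing that table by column name and dropping low-confidence words. The function name parse_tesseract_tsv, the min_conf parameter, and the sample row are all made up for illustration; the sample is hand-written in the same column layout and is not real OCR output.

```python
import csv
import io


def parse_tesseract_tsv(tsv_text, min_conf=0.0):
    """Return word boxes from Tesseract TSV output as a list of dicts."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        conf = float(row["conf"])
        # Page, block, and line rows carry conf -1 and no text: skip them.
        if not text or conf < min_conf:
            continue
        words.append({
            "text": text,
            "conf": conf,
            "box": (int(row["left"]), int(row["top"]),
                    int(row["width"]), int(row["height"])),
        })
    return words


# Hand-written sample in the image_to_data column layout (hypothetical values).
sample = "\n".join([
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num"
    "\tleft\ttop\twidth\theight\tconf\ttext",
    "1\t1\t0\t0\t0\t0\t0\t0\t800\t800\t-1\t",
    "5\t1\t1\t1\t1\t1\t120\t300\t560\t140\t41.5\tCalifornia",
])

for word in parse_tesseract_tsv(sample):
    print(word)
```

In the script above this would be called as parse_tesseract_tsv(boxes). pytesseract can also return the same data as a dictionary via output_type=pytesseract.Output.DICT, which avoids the parsing step entirely.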

Note that the image loaded by img = cv2.imread('C:/PythonItems/PythonAI/California.jpg') is cursive writing, but Tesseract cannot detect its letters.

I expect the output to detect the letters or characters in the picture, but I can't get that with Tesseract. Can anyone help me with this?

Some suggestions say this can be solved with TensorFlow and OpenCV, but I can't find a brief explanation of how, so I'm really confused. I'm looking for help with recognizing cursive writing.
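
Separately from the cursive problem, the script resizes the image to 800x800 regardless of its original shape, which stretches the letters and may make recognition harder. A minimal sketch of computing an aspect-preserving size, plus offsets for centering the result on a white square canvas, is below. The helper name fit_within and the 1600x400 example dimensions are assumptions for illustration, not the real size of California.jpg.

```python
def fit_within(width, height, size):
    """Scale (width, height) to fit inside a size x size square.

    Keeps the aspect ratio and returns (new_width, new_height, x_offset,
    y_offset), where the offsets center the scaled image on the square.
    """
    scale = size / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    x_off = (size - new_w) // 2
    y_off = (size - new_h) // 2
    return new_w, new_h, x_off, y_off


# Hypothetical 1600x400 image fitted into an 800x800 canvas.
print(fit_within(1600, 400, 800))  # -> (800, 200, 0, 300)
```

With OpenCV, the result could be used as cv2.resize(img, (new_w, new_h)) followed by imgWhite[y_off:y_off + new_h, x_off:x_off + new_w] = resized, assuming imgWhite and the resized image have the same number of channels. Whether this improves recognition of cursive text is not guaranteed.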

UPDATE

This is the image:

CALIFORNIA

  • Please share the image with the text that can't be detected by tesseract so others can better understand or reproduce your problem. – Markus Nov 10 '22 at 09:30
  • There it is... I shouldn't have to provide the image, since you can already try it yourself, but fine, there it is. Check it for yourself. – Xenex Ashura Nov 10 '22 at 14:03
  • I think there is no pre-trained model for this font available on the net. You can try to train your own (https://tesseract-ocr.github.io/tessdoc/tess5/TrainingTesseract-5.html), or you can test commercial OCR services as described here: https://stackoverflow.com/a/8766958/18667225 – Markus Nov 10 '22 at 19:02
