I am trying to extract text from a picture using OpenCV in Python, with pytesseract for the OCR. I have an image like this:
I have written some code to extract the text from that picture, but it is not accurate enough to read the text properly.
This is my code:
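import cv2
import pytesseract

# Load the image and convert it to grayscale
img = cv2.imread('photo.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Global binary threshold: pixels above 110 become white, the rest black
_, img = cv2.threshold(img, 110, 255, cv2.THRESH_BINARY)
# OCR engine mode 3 (default), page segmentation mode 6 (single uniform block of text)
custom_config = r'--oem 3 --psm 6'
text = pytesseract.image_to_string(img, config=custom_config)
print(text)
# Show the thresholded image that was passed to Tesseract
cv2.imshow('pic', img)
cv2.waitKey(0)
cv2.destroyAllWindows()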
I have also tried cv2.adaptiveThreshold, but it did not work as well as cv2.threshold.
Finally, this is my result, which does not match the text in the picture:
Color Yellow RBC/hpf 4-6
Appereance Semi Turbid WBC/hpf 2-3
Specific Gravity 1014 Epithelial cells/Lpf 1-2
PH 7 Bacteria (Few)
Protein Pos(+) Casts Negative
Glucose Negative Mucous (Few)
Keton Negative
Blood Pos(+)
Bilirubin Negative
Urobilinogen Negative
Nigitesse 5 ed eg ative
Is there any way to improve the accuracy?