I have been trying to extract the text from a text-based captcha image. I have been able to clean up the image to the point where pytesseract can read it, but I have not been able to automate the last step: rotating the skewed text.

If I manually rotate the image until the text is level, pytesseract reads it correctly.

Here is the image before and after editing:

Before

After

If I manually add

im2 = im.rotate(-8, fillcolor='white')

the text is detected correctly. How can I do this automatically?

Jeru Luke
Ken_Mon
  • You can also run a loop and test `pytesseract` with different angles - `for angle in range(-10, 10, 2): im2 = im.rotate(angle); pytesseract.image_to_string(im2, ...)` – furas Feb 14 '21 at 23:10
  • Will this detect other angles in different captcha images? The other captchas are rotated randomly. I am new to OCR. – Ken_Mon Feb 15 '21 at 08:09
  • If you use `range(0, 360)` it will test the image at every angle, but I think there is no need for more than `range(-45, 45)`. Different angles may give different results, so you can count the results; the one with the highest count is your text. Since it has to check many images, you can use a step, e.g. `range(-45, 45, 3)`, to check fewer angles. – furas Feb 15 '21 at 12:39
  • you can also search questions like [Detect text orientation](https://stackoverflow.com/questions/23783061/detect-text-orientation), or [Using OpenCV, how can I detect text orientation before performing OCR?](https://stackoverflow.com/questions/10122635/using-opencv-how-can-i-detect-text-orientation-before-performing-ocr/10122793) – furas Feb 15 '21 at 12:42
  • OK @furas, I see what you are saying about the range of angles, but when the code runs on other captcha images, how will it know when it has the correct captcha text? Thanks for the links, I will have a read. – Ken_Mon Feb 15 '21 at 13:17
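
The angle sweep with result counting suggested in the comments above could be sketched roughly as follows. This is only a minimal illustration, not a tested solution for these captchas: the function names `vote_on_rotations` and `read_captcha` and the parameter names are made up for this example, and it assumes Pillow 5.2 or newer (for `fillcolor`) and an installed Tesseract. The OCR function is passed in as an argument so the voting logic can be tried without Tesseract.

```python
from collections import Counter


def vote_on_rotations(image, ocr, angles=range(-45, 46, 3), fill="white"):
    """Rotate `image` by each angle, OCR every rotation, and return the
    most frequently seen non-empty text together with all the counts.

    `image` needs a Pillow-style .rotate(angle, fillcolor=...) method;
    `ocr` is any callable that maps an image to a string.
    """
    counts = Counter()
    for angle in angles:
        rotated = image.rotate(angle, fillcolor=fill)
        text = ocr(rotated).strip()
        if text:  # ignore rotations where nothing was recognised
            counts[text] += 1
    if not counts:
        return None, counts
    best, _ = counts.most_common(1)[0]
    return best, counts


def read_captcha(path):
    """Hypothetical helper: run the vote with pytesseract on one file."""
    # Imported here so the voting function above has no hard dependency.
    from PIL import Image
    import pytesseract

    with Image.open(path) as im:
        return vote_on_rotations(im, pytesseract.image_to_string)
```

Calling `read_captcha("captcha.png")` (a placeholder path) would return the winning text and the full `Counter`. The count is only a heuristic answer to "how will it know": a text that shows up at several neighbouring angles is more likely to be right than one that shows up once, but a tie or a consistently misread character would still give a wrong result, so inspecting the counts is advisable.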
