You can set a character whitelist through the config argument to get rid of gibberish characters, and you can also try different psm (page segmentation mode) options to get a better result.
Unfortunately, it is not that easy. I think the only way is to apply some image preprocessing, and this is the best I could get:
- First, I applied some blurring to smooth the image:
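As a rough sketch of what such a config string could look like: the helper names `build_config` and `ocr_with_whitelist` are made up for illustration, and the uppercase-only whitelist is an assumption based on the all-caps text in this image. `tessedit_char_whitelist` is a Tesseract config variable passed with `-c`; as far as I know the LSTM engine in Tesseract 4.0 ignores it, so results may depend on your version.

```python
import string


def build_config(psm=12, whitelist=string.ascii_uppercase):
    """Build a Tesseract config string with a page segmentation mode
    and a character whitelist (hypothetical helper, not part of pytesseract)."""
    # Keep the whitelist free of spaces and quotes: pytesseract splits the
    # config string like a shell command line.
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


def ocr_with_whitelist(image, psm=12):
    """Run OCR restricted to the whitelisted characters."""
    import pytesseract  # imported lazily so build_config works on its own
    return pytesseract.image_to_string(image, lang="eng", config=build_config(psm))


print(build_config())
# --psm 12 -c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ
```

Note that with this whitelist the apostrophe in "IT'S" is not an allowed character, so you may want to try a few psm values and whitelists and compare the output.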
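import cv2
# img is assumed to be the input image already loaded as a BGR array (e.g. with cv2.imread)
blurred = cv2.blur(img, (5, 5))  # 5x5 box filter to smooth out noise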
- Then, to remove everything except the text, I converted the image to grayscale and applied thresholding to keep only the white pixels, since white is the text color. I used inverse thresholding to make the text black, which is the optimal condition for Tesseract OCR:
gray_blurred = cv2.cvtColor(blurred, cv2.COLOR_BGR2GRAY)
ret, th1 = cv2.threshold(gray_blurred, 239, 255, cv2.THRESH_BINARY_INV)

- Finally, I applied OCR and removed the whitespace characters:
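import pytesseract
txt = pytesseract.image_to_string(th1, lang='eng', config='--psm 12')
# replace line breaks with spaces and drop the form feed ("\x0c") in the output
txt = txt.replace("\n", " ").replace("\x0c", "")
print(txt)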
>>>"WINNING'OLYMPIC GOLD MEDAL IT'S MADE OUT OF RECYCLED ELECTRONICS "
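To make the inverse thresholding step above more concrete, here is a small pure-Python illustration of what `cv2.THRESH_BINARY_INV` does with the values I used (threshold 239, max value 255). The function `threshold_binary_inv` and the sample pixel values are made up for the example; in the real code OpenCV does this on the whole image.

```python
def threshold_binary_inv(pixels, thresh=239, maxval=255):
    """Mimic cv2.THRESH_BINARY_INV on a flat list of grayscale values:
    values above thresh become 0, everything else becomes maxval."""
    return [0 if p > thresh else maxval for p in pixels]


# Hypothetical grayscale values: dark background, mid-gray, near-white text
sample = [12, 128, 239, 240, 255]
print(threshold_binary_inv(sample))  # [255, 255, 255, 0, 0]
```

Only pixels brighter than 239 (the near-white text) turn black, and everything else turns white, so Tesseract sees black text on a white background.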
Related topics:
Pytesser set character whitelist
Pytesseract OCR multiple config options
You can also try preprocessing your image to help pytesseract work more accurately, and if you want to recognize meaningful words, you can apply a spell check after OCR:
https://pypi.org/project/pyspellchecker/
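A minimal sketch of how the spell check could be applied to the OCR output, assuming pyspellchecker is installed (`pip install pyspellchecker`). The helpers `tokenize` and `spell_correct` are hypothetical names for this example. Note that it lowercases the text and drops punctuation other than apostrophes, so treat it as a starting point only.

```python
import re


def tokenize(text):
    """Split OCR output into lowercase word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def spell_correct(text):
    """Replace words unknown to the dictionary with the most likely correction."""
    from spellchecker import SpellChecker  # provided by the pyspellchecker package

    spell = SpellChecker()
    words = tokenize(text)
    unknown = spell.unknown(words)  # set of words not found in the dictionary
    corrected = []
    for word in words:
        if word in unknown:
            # correction() may return None when no candidate is found
            corrected.append(spell.correction(word) or word)
        else:
            corrected.append(word)
    return " ".join(corrected)


print(tokenize("GOLD MEDAL\nIT'S"))  # ['gold', 'medal', "it's"]
```

You would then call `spell_correct(txt)` on the string returned by pytesseract.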