2

I'm looking for a simple solution that returns a boolean indicating whether ANY kind of English text is present in an image file. I want to use this to detect memes. For example, the following file should be detected as an image with text.

https://i.pinimg.com/originals/d5/13/4b/d5134b891d3903d0f272f6430014f089.gif

I've come across elaborate machine learning techniques using OpenCV, but I haven't been able to fully implement them. Is there a quicker, simpler, and just as effective solution for this?

I look forward to your valuable feedback!

Nice Guy
  • 1
    Does this answer your question? [Detect text area in an image using python and opencv](https://stackoverflow.com/questions/37771263/detect-text-area-in-an-image-using-python-and-opencv) – Yunus Temurlenk Jun 02 '20 at 11:03

3 Answers

3

There is indeed a simple way, using OpenCV and pytesseract. After installing them, you only need a few lines to get the text. Note that pytesseract is only a wrapper, so the Tesseract OCR engine itself must also be installed.

pip install opencv-python

pip install pytesseract

import cv2
import pytesseract

img = cv2.imread('yourimage.jpeg')   

text = pytesseract.image_to_string(img)

Read Text from Image with One Line of Python Code
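
Since the question asks for a yes/no answer rather than the text itself, the OCR string still has to be reduced to a boolean. Here is a minimal sketch, assuming that "text is present" means Tesseract returned at least a couple of alphabetic words. The helper names `looks_like_text` and `image_has_text` and the `min_words` threshold are hypothetical choices for illustration, not part of pytesseract:

```python
import re


def looks_like_text(ocr_output, min_words=2):
    """Return True if the OCR output contains at least `min_words` alphabetic words."""
    # Count runs of 2+ ASCII letters; stray single characters are often OCR noise.
    words = re.findall(r"[A-Za-z]{2,}", ocr_output)
    return len(words) >= min_words


def image_has_text(path, min_words=2):
    """Run Tesseract on the image at `path` and reduce the result to a boolean."""
    # Imported here so looks_like_text can be used without OpenCV/Tesseract installed.
    import cv2
    import pytesseract

    img = cv2.imread(path)
    if img is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError(path)
    return looks_like_text(pytesseract.image_to_string(img), min_words)
```

Usage would be something like `image_has_text('yourimage.jpeg')`. The example in the question is a GIF, which `cv2.imread` may not be able to load depending on the OpenCV build, so converting a frame to PNG or JPEG first may be necessary.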

If you don't like the first approach, you can use Google Cloud Vision instead. Keep in mind that it returns JSON, from which you will have to extract what you need.

https://cloud.google.com/vision/docs/ocr

Python Client for Google Cloud Vision

Andy
InUser
  • I've already tried these simple solutions with Tesseract. But most of the time the results are incorrect and nowhere near accurate – Nice Guy Jun 02 '20 at 13:32
  • Have you considered cropping just the relevant area? It will help Tesseract. For example, keep just the bottom half if you are looking for memes – InUser Jun 02 '20 at 13:34
  • @SaiChivukula Google Vision gives great results, so you may want to consider it as well – InUser Jun 02 '20 at 13:35
  • 2
    Thanks. I'll try cropping out the irrelevant part. I've looked at Google's Vision as well. It seems quite nice, but it's a paid alternative. – Nice Guy Jun 02 '20 at 15:30
  • Also try running recognition on the inverted picture. This helps with light letters on a dark background, as in your example. – Alex Alex Jun 05 '20 at 18:56
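
The two suggestions in the comments above (cropping to the area where captions usually sit, and also trying the inverted image) can be combined into a small loop over image variants. This is only a sketch under assumptions: `make_variants` and `any_variant_has_text` are hypothetical helper names, the bottom-half crop is just the heuristic from the comment, and the OCR step is passed in as a callable so it can be swapped out.

```python
import re


def make_variants(img):
    """Yield the image, its bottom half, and the inverted version of each."""
    import cv2  # imported lazily; img is a NumPy array as returned by cv2.imread

    height = img.shape[0]
    for region in (img, img[height // 2:]):
        yield region
        yield cv2.bitwise_not(region)  # light-on-dark text becomes dark-on-light


def any_variant_has_text(variants, ocr, min_words=2):
    """Return True as soon as `ocr(variant)` yields enough alphabetic words."""
    for variant in variants:
        words = re.findall(r"[A-Za-z]{2,}", ocr(variant))
        if len(words) >= min_words:
            return True
    return False
```

With pytesseract this might be called as `any_variant_has_text(make_variants(cv2.imread(path)), pytesseract.image_to_string)`. Stopping at the first variant that passes keeps the number of OCR runs down when the plain image already works.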
1

We can use the pytesseract Python package to get text from images. You can install it with pip install pytesseract

Here is the example code:

import cv2
import pytesseract
image = cv2.imread('test.jpeg')
text = pytesseract.image_to_string(image)
print(text)

Here is my sample image: [image]

The output should look like this:

IS BITCOIN
GOING TO
$20.000
BY CHRISTMAS?
Karthick Nagarajan
  • I've already tried these simple solutions with Tesseract. But most of the time the results are incorrect and nowhere near 90% accuracy – Nice Guy Jun 02 '20 at 13:32
0

You can use OpenCV and pytesseract to perform your task.

import cv2
import pytesseract
img = cv2.imread('YOUR_IMAGE_PATH')
text = pytesseract.image_to_string(img)
print(text)
Ransaka Ravihara
  • You can find more details here: https://towardsdatascience.com/how-to-extract-text-from-images-with-python-db9b87fe432b – Ransaka Ravihara Jun 02 '20 at 07:02
  • I've already tried these simple solutions with Tesseract. But most of the time the results are incorrect and nowhere near 90% accuracy – Nice Guy Jun 02 '20 at 13:32
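
For a yes/no decision, the exact transcription matters less than whether Tesseract found any words it is reasonably sure about. One possible approach, sketched here under assumptions, is to use `pytesseract.image_to_data`, which reports a confidence value per detected word, and count only the confident ones. The helper names `confident_words` and `image_has_confident_text` are hypothetical, and the thresholds of 60 and 2 are arbitrary starting values, not tuned ones.

```python
def confident_words(data, min_conf=60.0):
    """Return the words from an image_to_data dict whose confidence is at least min_conf."""
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        # conf may be a string or a number depending on the pytesseract version;
        # entries that are not words carry a confidence of -1 and empty text.
        if text.strip() and float(conf) >= min_conf:
            words.append(text.strip())
    return words


def image_has_confident_text(path, min_conf=60.0, min_words=2):
    """Return True if Tesseract finds at least `min_words` confident words in the image."""
    # Imported here so confident_words can be used without OpenCV/Tesseract installed.
    import cv2
    import pytesseract

    img = cv2.imread(path)
    if img is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError(path)
    data = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)
    return len(confident_words(data, min_conf)) >= min_words
```

Lowering `min_conf` catches more stylized meme fonts at the cost of more false positives on textured photos, so the values are worth checking against a handful of known images.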