I have an image, but Tesseract is unable to read the prices from it. This is what I have:

import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
print(pytesseract.image_to_string("local-filename.jpg"))

output


Nestle Bakers’
Choice Melts
290g/

Choc Bits
200g


Altimate
Salted Caramel
Waffle Cones
12's


~ Seitarium ss, :
et-E Ly y ”.
oss a
=| x
) " 4
oat

.

FruitCo Juice 2 Litres
‘Apple/ Apricot/ Apple, Mange,
‘Banana/ Apple Pea

Cottee’s Jams

Betty Crocker Triple
500g

Sanitarium Weet-bix
750g Chocolate Muffin Mix 500g

 

Ss
>

s

Authentic Thai

; Sweet Chili Sauce
Vanilla em, ‘ 725ml

Dell

cours ® ‘OCOMUT HE


Sandhurst Coconut Milk

Chelsea Berry/ Vanilla
400m!

Icing Sugar 3759

  


Process finished with exit code 0

This is the image I am trying to analyze:

enter image description here

I need the price of each product in the image together with the corresponding product name. I am able to extract the product names but not the prices. How can I achieve this? Any help would be appreciated. Please note that I am very new to image processing.

6563cc10d2
  • Wow, that's a tough problem! My first suggestion would be not to pass the full image to Tesseract, but to divide it into smaller subimages. For example, you can try to find the red circles, cut out a subimage the size of each circle, and pass that to Tesseract to extract the price text. That way you can tune parameters like character size more easily. – Sparkofska Mar 29 '21 at 09:03
  • The problem is that I cannot cut the image by hand: I am fetching it from an online site that updates every week. – 6563cc10d2 Apr 01 '21 at 07:15
  • Try finding circles in the image using OpenCV (refer: https://www.geeksforgeeks.org/circle-detection-using-opencv-python/). You should then be able to get the coordinates of every circle and pass those regions to Tesseract with a character whitelist (digits and currency symbols). – MathanKumar Apr 15 '21 at 10:14
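
The suggestion in the comments above (detect the price circles automatically, then OCR only those regions) could be sketched roughly as follows. This is only a sketch under assumptions: the helper names crop_box and read_prices_in_circles are made up for illustration, every numeric HoughCircles parameter is a guess that would need tuning for the actual catalogue image, and whether tessedit_char_whitelist is honoured may depend on the installed Tesseract version.

```python
def crop_box(cx, cy, r, width, height):
    """Return (x0, y0, x1, y1) of the square around a circle, clamped to the image."""
    return (max(cx - r, 0), max(cy - r, 0), min(cx + r, width), min(cy + r, height))


def read_prices_in_circles(path):
    """Detect circles in the image at `path` and OCR each one (hypothetical helper)."""
    # Imported here so crop_box can be used without OpenCV or pytesseract installed.
    import cv2
    import pytesseract

    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(path)
    gray = cv2.medianBlur(cv2.cvtColor(image, cv2.COLOR_BGR2GRAY), 5)
    height, width = gray.shape

    # All numeric values below are guesses; tune them for the real image.
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2, minDist=80,
                               param1=100, param2=40, minRadius=30, maxRadius=120)
    results = []
    if circles is None:
        return results

    # Restrict Tesseract to digits, '.' and '$'; --psm 6 assumes one block of text.
    config = "--psm 6 -c tessedit_char_whitelist=0123456789.$"
    for cx, cy, r in circles[0].round().astype(int):
        x0, y0, x1, y1 = crop_box(int(cx), int(cy), int(r), width, height)
        text = pytesseract.image_to_string(gray[y0:y1, x0:x1], config=config)
        results.append(((int(cx), int(cy)), text.strip()))
    return results
```

Because the circles are found on every run, this does not require cutting the image by hand, so it should still apply when the source image changes weekly, provided the price badges stay roughly circular.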

2 Answers

I tried two options:

  1. easyocr, installed with !pip install easyocr
  2. tesseract, installed (on macOS) with brew install tesseract

Here are the results:

  1. On the full image, neither easyocr nor tesseract returned the price.
  2. I then cropped only the price circle:
    enter image description here
import cv2
import pytesseract
import easyocr

image = cv2.imread('795.png')
print(pytesseract.image_to_string(image))  # printed only whitespace, i.e. no result

reader = easyocr.Reader(['en'], gpu=False)  # load the model into memory once
result = reader.readtext(image, detail=0)  # result: ['s7.95', 'cach']

easyocr worked better!!

Next, on the image with the product description:
enter image description here

image = cv2.imread('795 Product.png')
reader.readtext(image,detail=0)
'''
['Nestle',
 'eaa',
 'Nestle',
 'RuS',
 'aa',
 'melts',
 'PARKCHOC',
 'chocbts',
 'Nestle Bakers',
 'S',
 'Choice Melts',
 '290g/',
 'cach',
 'Choc Bits',
 '200g',
 'Nestle',
 '"8628',
 'nelts',
 '(Neste)',
 'JTE CHOC',
 '7.95']
'''
print(pytesseract.image_to_string(image))
'''
Nestle Bakers’
Choice Melts
290g/

Choc Bits
200g
'''

easyocr worked better on these images.

You will need to explore which option you want to go forward with. You can also try the recommendation provided by @nathancy in "How to process and extract text from image".
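
To get from these detections to "price with the corresponding name", one possible approach is to use the bounding boxes that easyocr returns when detail is left at its default and attach each price to the nearest longer piece of text. The sketch below is an assumption-heavy illustration, not a tested solution for this catalogue: pair_prices_with_names, PRICE_RE, and the min_name_len threshold are hypothetical, and "nearest text" will pick the wrong label whenever the layout places another product's text closer to the price.

```python
import math
import re

# A leading 's' or 'S' is tolerated because easyocr read '$7.95' as 's7.95' above.
PRICE_RE = re.compile(r"^[sS$]?(\d{1,3}\.\d{2})$")


def center(bbox):
    """Centre point of a bounding box given as a list of [x, y] corners."""
    xs = [p[0] for p in bbox]
    ys = [p[1] for p in bbox]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def pair_prices_with_names(detections, min_name_len=6):
    """Pair each price-like string with the closest text of at least min_name_len chars.

    `detections` is a list of (bbox, text, confidence) items, which is the shape
    reader.readtext(image) returns with the default detail=1.
    """
    prices, names = [], []
    for bbox, text, _conf in detections:
        text = text.strip()
        match = PRICE_RE.match(text)
        if match:
            prices.append((center(bbox), match.group(1)))
        elif len(text) >= min_name_len:  # crude filter against fragments like 'eaa'
            names.append((center(bbox), text))

    pairs = []
    for price_center, price in prices:
        if not names:
            break
        nearest = min(names, key=lambda n: math.dist(price_center, n[0]))
        pairs.append((nearest[1], price))
    return pairs
```

With the reader from above, usage would be pair_prices_with_names(reader.readtext(image)). Requires Python 3.8 or newer for math.dist.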

simpleApp

In my experience, the Google Vision API gives the best results. Google Cloud offers $300 in free credits to new users.

Below is the code snippet for the same.

def detect_text(path):
    """Detects text in the file."""
    from google.cloud import vision
    import io
    client = vision.ImageAnnotatorClient()

    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

    response = client.text_detection(image=image)
    texts = response.text_annotations
    print('Texts:')

    for text in texts:
        print('\n"{}"'.format(text.description))

        vertices = (['({},{})'.format(vertex.x, vertex.y)
                    for vertex in text.bounding_poly.vertices])

        print('bounds: {}'.format(','.join(vertices)))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))
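
The function above only prints every annotation with its bounds. To pull out just the prices, the annotations could be filtered with a regular expression, as in the sketch below. The names find_price_tokens, words_from_response, and PRICE_TOKEN are hypothetical, and the sketch assumes the API returns each price as a single token such as "$7.95"; if it splits the price into several words, they would need to be merged first.

```python
import re

# Matches tokens like '7.95' or '$7.95' (an assumed price format).
PRICE_TOKEN = re.compile(r"^\$?(\d+\.\d{2})$")


def find_price_tokens(words):
    """Return (price, (cx, cy)) for every price-like word.

    `words` is a list of (description, vertices) pairs, where vertices is a
    list of (x, y) corner tuples of the word's bounding polygon.
    """
    found = []
    for description, vertices in words:
        match = PRICE_TOKEN.match(description.strip())
        if match:
            cx = sum(x for x, _ in vertices) / len(vertices)
            cy = sum(y for _, y in vertices) / len(vertices)
            found.append((float(match.group(1)), (cx, cy)))
    return found


def words_from_response(response):
    """Convert a text_detection response into the plain pairs used above."""
    # text_annotations[0] holds the entire detected text; the rest are single words.
    return [(t.description, [(v.x, v.y) for v in t.bounding_poly.vertices])
            for t in response.text_annotations[1:]]
```

The returned centre points can then be compared with the positions of the product-name words to decide which price belongs to which product.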
AVISHEK GARAIN