
I need to read the numbers shown on a digital scale's screen, which look like this, for example: (image of the scale display)

I'm using a webcam, JavaScript, and Tesseract.js for the OCR, but the best result I got was:

Jo | |
I 10)

When the expected result was 123456, obviously.

Here is some of the resulting JSON:

block: {
    paragraphs: Array(1),
    text: "Jo | |↵I 10)↵",
    confidence: 47.748504638671875,
    baseline: {
        …},
    bbox: {
        …},
    …
}

The code that processes the image is the following:

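function recognizeFile(file) {
    // Edge gets the asm.js build of the core; other browsers get the WebAssembly build.
    const corePath = window.navigator.userAgent.indexOf("Edge") > -1
        ? 'js/tesseract-core.asm.js'
        : 'js/tesseract-core.wasm.js';

    const worker = new Tesseract.TesseractWorker({
        corePath,
    });

    // The OCR language comes from the #langsel dropdown.
    worker.recognize(file, $("#langsel").val())
        .progress(function (packet) {
            // Log loading and recognition progress.
            console.info(packet);
        })
        .then(function (data) {
            // Log the full recognition result (text, blocks, confidence, ...).
            console.log(data);
        });
}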

Is there another way to achieve this? Is there a more appropriate library for this case?

Thanks in advance.

Manuel Calles

2 Answers


Have a look at "Text detection on Seven Segment Display via Tesseract OCR" or the trained data at https://github.com/Shreeshrii/tessdata_ssd, or search for "seven segment ocr".

user898678

It looks like you are working in TypeScript/JavaScript. I did this as a pure web app. As the user above mentioned, the SSD trained data worked better. I also tried a few additional techniques, such as cropping to a rectangle and applying a character whitelist.

https://github.com/kiichi/meter-cap
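
For illustration, here is a minimal sketch of the cropping and whitelist ideas. It is not taken from the repository above. The helper names (`clampRect`, `cropToCanvas`, `keepDigits`, `recognizeDigits`) are hypothetical, and it assumes that `worker.recognize` accepts a canvas plus a third options argument carrying Tesseract parameters such as `tessedit_char_whitelist`, and that the result exposes a `text` field; check both against the Tesseract.js version you use. Because some engine modes may ignore the whitelist, the sketch also filters the output afterwards.

```javascript
// Hypothetical helper: clamp a crop rectangle so it stays inside the frame.
function clampRect(rect, imgWidth, imgHeight) {
  const x = Math.max(0, Math.min(rect.x, imgWidth));
  const y = Math.max(0, Math.min(rect.y, imgHeight));
  const width = Math.max(0, Math.min(rect.width, imgWidth - x));
  const height = Math.max(0, Math.min(rect.height, imgHeight - y));
  return { x, y, width, height };
}

// Hypothetical helper (browser only): copy just the display region of an
// <img>, <video>, or <canvas> onto a new canvas, so the OCR never sees
// the rest of the frame.
function cropToCanvas(source, rect) {
  const canvas = document.createElement('canvas');
  canvas.width = rect.width;
  canvas.height = rect.height;
  canvas.getContext('2d').drawImage(
    source,
    rect.x, rect.y, rect.width, rect.height, // region of the source
    0, 0, rect.width, rect.height            // destination on the new canvas
  );
  return canvas;
}

// Hypothetical helper: drop everything that is not a digit, in case the
// whitelist is not honored by the engine.
function keepDigits(text) {
  return text.replace(/[^0-9]/g, '');
}

// Crop, recognize with a digit whitelist, then clean up the text.
function recognizeDigits(worker, source, rect, lang) {
  const fullWidth = source.videoWidth || source.width;
  const fullHeight = source.videoHeight || source.height;
  const canvas = cropToCanvas(source, clampRect(rect, fullWidth, fullHeight));
  // Assumption: the third argument is passed through as Tesseract parameters.
  const options = { tessedit_char_whitelist: '0123456789' };
  return new Promise(function (resolve, reject) {
    worker.recognize(canvas, lang, options).then(function (data) {
      // Assumption: the result object has a top-level `text` field.
      resolve(keepDigits(data.text));
    }, reject);
  });
}
```

With the worker from the question, a call could look like `recognizeDigits(worker, videoElement, { x: 100, y: 80, width: 240, height: 90 }, 'eng')`, where the rectangle values are placeholders for wherever the display sits in your webcam frame.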

Kiichi