I am using this for OCR. It is a wrapper for Tesseract OCR (I previously installed Tesseract itself).

At first I downloaded it via Composer and followed the examples in the repo, as well as several posts on SO itself. All I get is "Failed to load response data" in the browser's Network tab.

As an alternative approach, I downloaded the repo itself and tried to call it from my index.php, which for test purposes sits in the same folder as the TesseractOCR class. I tried the images from the repo and also simple images with black text on a white background.

This SO post looks promising, but I'm unsure where the OP's file with the example code resides...

use thiagoalessio\TesseractOCR\TesseractOCR;
// or: require "TesseractOCR.php"; // if it's in the same dir as test.php
$content = new TesseractOCR('text.png');
$text = $content->run();
echo $text;

Did I miss something obvious? Any help is appreciated.

EDIT1: I tried it from the Windows PowerShell CLI. I put text.png in the directory where Tesseract is installed, opened a shell with administrator privileges, and ran tesseract text.png output, which creates output.txt in the same directory with the text recognized from that image. So Tesseract itself is working; my implementation with the PHP wrapper is not.

EDIT2: Forgot to add, page itself shows:

This page isn’t working
localhost is currently unable to handle this request.
HTTP ERROR 500

Not sure why it happens.

Edit3:

My code:

try{
    
    //use thiagoalessio\TesseractOCR\TesseractOCR;
    require "./vendor/thiagoalessio/tesseract_ocr/src/TesseractOCR.php";
    echo $temp;//It's value is set in TesseractOCR.php
    $content = new TesseractOCR('text1.png');
    $text = $content->run();
    echo $text;
    
}
catch(Exception $e) {
    echo 'Message: ' .$e->getMessage();
}

The value set in the $temp variable is visible through the stated file path, so why isn't the TesseractOCR class itself?

Edit4: Even if I put the absolute path to TesseractOCR.php, which holds the class, in the include statement, it doesn't work. It throws this error:

Fatal error: Uncaught Error: Class 'TesseractOCR' not found in C:\xampp\htdocs\myocr\index.php:10 Stack trace: #0 {main} thrown in C:\xampp\htdocs\myocr\index.php on line 10
This is TesseractOCR.//echoed text from file that holds TesseractOCR class.

Inclusion path:

include ("C:/xampp/htdocs/myocr/vendor/thiagoalessio/tesseract_ocr/src/TesseractOCR.php");

If I use (as suggested in the repo readme) use thiagoalessio\TesseractOCR\TesseractOCR;, then it throws:

Fatal error: Uncaught Error: Class 'thiagoalessio\TesseractOCR\TesseractOCR' not found in C:\xampp\htdocs\myocr\index.php:10 Stack trace: #0 {main} thrown in C:\xampp\htdocs\myocr\index.php on line 10

My question is: how does it reach the test message, but not the TesseractOCR class?

EDIT5: If I require_once "./vendor/autoload.php";, it throws:

Fatal error: Uncaught thiagoalessio\TesseractOCR\TesseractNotFoundException: Error! The command "tesseract" was not found. Make sure you have Tesseract OCR installed on your system: https://github.com/tesseract-ocr/tesseract The current $PATH is C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;C:\WINDOWS\System32\OpenSSH\;C:\Program Files\Git\cmd;C:\Program Files\nodejs\;C:\xampp\php;C:\ProgramData\ComposerSetup\bin;C:\Users\Eddie\AppData\Local\Microsoft\WindowsApps;;C:\Users\Eddie\AppData\Local\Programs\Microsoft VS Code\bin;C:\Users\Eddie\AppData\Roaming\npm;C:\Users\Eddie\AppData\Roaming\Composer\vendor\bin;C:\Program Files\heroku\bin in C:\xampp\htdocs\myocr\vendor\thiagoalessio\tesseract_ocr\src\FriendlyErrors.php:48 Stack trace: #0 C:\xampp\htdocs\myocr\vendor\thiagoalessio\tesseract_ocr\src\TesseractOCR.php(26): thiagoalessio\TesseractOCR\FriendlyErrors::checkTesseractPresence('tesseract') #1 C:\xampp\ht in C:\xampp\htdocs\myocr\vendor\thiagoalessio\tesseract_ocr\src\FriendlyErrors.php on line 48

Btw, I added its path to the env variable: envpath

  • Any error messages, partial results, dumps, anything that would help us identify the problem? Try to surround your code in `try { ... } catch($e) { var_dump($e); }` block and post the error msg here, if any. – ΔO 'delta zero' Nov 11 '20 at 18:52
  • Also, make sure your text.png exists (`file_exists('text.png');`) in your current working dir (check with `getcwd();`) – ΔO 'delta zero' Nov 11 '20 at 18:55
  • @ΔO'deltazero' No. I tried dumping variable that holds `new TesseractOCR('text1.png')`. Nothing, just network message "failed to load response data".. – Veljko Stefanovic Nov 11 '20 at 18:55
  • Try to catch and display error with `try { ... } catch ...` block. See https://www.php.net/manual/en/internals2.opcodes.catch.php – ΔO 'delta zero' Nov 11 '20 at 18:56
  • @ΔO'deltazero' Tried it. Nothing new. I edited post, take a gander at it again. – Veljko Stefanovic Nov 11 '20 at 19:14
  • Hmm, can you check if `thiagoalessio\TesseractOCR\TesseractOCR` class is loaded? Either with composer's `require '/vendor/autoload.php';` or manually. Also, HTTP 500 most probably means your PHP script throws an error somewhere. You need to catch or display it, please see https://stackoverflow.com/a/845025/3290062 – ΔO 'delta zero' Nov 11 '20 at 19:44
  • @ΔO'deltazero' Display_errors is on and it says: `Fatal error: Uncaught Error: Class 'TesseractOCR' not found in C:\xampp\htdocs\testocr\index.php:7 Stack trace: #0 {main} thrown in C:\xampp\htdocs\testocr\index.php on line 7` which is surprising considering that I put the whole path in the require("./vendor/thiagoalessio/tesseract_ocr/src/TesseractOCR.php") in my index.php – Veljko Stefanovic Nov 11 '20 at 19:50
  • Now we're getting somewhere :-) I'd recommend installing the composer package and using composer's autoloading (`require './vendor/autoload.php';`) – ΔO 'delta zero' Nov 11 '20 at 19:52
  • @ΔO'deltazero' As I said, I already did it. – Veljko Stefanovic Nov 11 '20 at 19:58
  • If it still throws the same fatal error, then you must've done it wrong somehow. You may try checking your relative paths exist under your cwd. – ΔO 'delta zero' Nov 11 '20 at 20:00
  • @ΔO'deltazero' When I put `getcwd()` in a variable and then echo that variable in index.php, which is well outside of the `TesseractOCR` class, it says: `c/xampp/htdocs/myocr` i.e. my project directory. I previously put `require "./vendor/thiagoalessio/tesseract_ocr/src/TesseractOCR.php";`. – Veljko Stefanovic Nov 11 '20 at 20:09
  • Does `c:/xampp/htdocs/myocr/vendor/autoload.php` exist? – ΔO 'delta zero' Nov 11 '20 at 21:35
  • @ΔO'deltazero' Yes. – Veljko Stefanovic Nov 11 '20 at 21:41
  • Can you `require` this file instead of TesseractOCR.php and see if the error message changes? – ΔO 'delta zero' Nov 12 '20 at 00:34
  • @ΔO'deltazero' Yes. See edit. – Veljko Stefanovic Nov 12 '20 at 00:46
  • From the last error message, C:\Program Files\Tesseract-OCR is still missing from the $PATH environment variable. You might wanna restart your XAMPP or maybe the whole system, since it's Windows :-) Then see whether it works or the error changes. – ΔO 'delta zero' Nov 12 '20 at 19:32

1 Answer

I solved it! My problem was that I didn't know about autoloaders in PHP. The link that helped me is this.
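For anyone else who hasn't met autoloaders before, here is a minimal sketch of the idea, not Composer's actual implementation. It writes a small namespaced class to a temp directory and registers a loader that maps the class name to a file path. The names Demo\Ocr\Reader and $baseDir are made up for illustration. As far as I understand it, a `use` statement only creates an alias and loads nothing by itself; something has to `require` the file, and that is what the autoloader does.

```php
<?php
// Hypothetical demo: create a namespaced class file in a temp directory.
// Demo\Ocr\Reader and $baseDir are illustration names, not part of any library.
$baseDir = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($baseDir . '/Demo/Ocr', 0777, true);
file_put_contents(
    $baseDir . '/Demo/Ocr/Reader.php',
    "<?php\nnamespace Demo\\Ocr;\nclass Reader { public function run() { return 'ok'; } }\n"
);

// Register a loader that turns a fully qualified class name into a file path
// (namespace separators become directory separators).
spl_autoload_register(function (string $class) use ($baseDir): void {
    $file = $baseDir . '/' . str_replace('\\', '/', $class) . '.php';
    if (is_file($file)) {
        require $file;
    }
});

// First use of the class triggers the loader, which requires Reader.php.
$reader = new \Demo\Ocr\Reader();
echo $reader->run(), PHP_EOL;
```

Composer's vendor/autoload.php registers a loader of this general kind for every installed package, which is why requiring that one file is enough.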

My project setup is this:

  1. Created the project folder myocr.
  2. After that, downloaded the latest stable version of Tesseract and installed it.
  3. Depending on your system, you may need to add Tesseract's install directory to your system PATH environment variable. You need to do that here:

[Screenshots: panel, then p1, then p2, then p3]

I'm assuming it's self-explanatory with the images provided.

  4. Next, get TesseractOCR via Composer:
    composer require thiagoalessio/tesseract_ocr
  5. Finally, before using the code sample from the repo, you need to load the autoloader. index.php:
    require_once('./vendor/autoload.php');//<-This!
    use thiagoalessio\TesseractOCR\TesseractOCR;
    
    $content = new TesseractOCR('text1.png');
    $text = $content->run();
    echo $text;
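If the script still reports that the "tesseract" command was not found, it may help to check what PATH the PHP process itself sees, since a running web server can hold on to the old value until it is restarted. The following is a rough sketch; findOnPath is a helper name I made up, not part of the wrapper.

```php
<?php
// Return the first directory on PHP's PATH that contains the named executable,
// or null if none does. findOnPath is a hypothetical helper for illustration.
function findOnPath(string $name): ?string
{
    $isWindows = DIRECTORY_SEPARATOR === '\\';
    // On Windows the binary normally carries an .exe extension.
    $candidates = $isWindows ? [$name . '.exe', $name] : [$name];

    $path = getenv('PATH');
    if ($path === false) {
        return null;
    }

    foreach (explode(PATH_SEPARATOR, $path) as $dir) {
        $dir = rtrim($dir, '/\\');
        if ($dir === '') {
            continue; // skip empty entries such as the ";;" seen in the error above
        }
        foreach ($candidates as $candidate) {
            if (is_file($dir . DIRECTORY_SEPARATOR . $candidate)) {
                return $dir;
            }
        }
    }
    return null;
}

$dir = findOnPath('tesseract');
echo $dir === null
    ? "tesseract is not on the PATH that PHP sees"
    : "tesseract found in $dir", PHP_EOL;
```

Run it through the web server (not only from the CLI), because the two can see different environments.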

This worked for me. The crucial thing to watch is the directory structure: localhost > myocr > index.php with its code, and after running Composer you'll get the vendor dir and its contents. The image path in new TesseractOCR('text1.png'); is relative to the directory where both index.php and the image are located.
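To rule out path mix-ups, one option is to build the image path from the script's own directory instead of relying on the current working directory. This is only a sketch; besideScript is a made-up helper name, and text1.png is the file name from the example above.

```php
<?php
// Build an absolute path to a file that sits next to this script.
// besideScript is a hypothetical helper for illustration.
function besideScript(string $fileName): string
{
    return __DIR__ . DIRECTORY_SEPARATOR . $fileName;
}

$imagePath = besideScript('text1.png');

// Report clearly whether the image is where the script expects it.
if (is_file($imagePath)) {
    echo "Image found: $imagePath", PHP_EOL;
} else {
    echo "Image not found: $imagePath (cwd is " . getcwd() . ")", PHP_EOL;
}
```

The resulting $imagePath can then be passed to new TesseractOCR(...) in place of the bare file name.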