My project is built on the Laravel framework. The site offers translation of documents, books, etc. The customer uploads the source file as a PDF, and on the backend the words in the PDF must be counted by OCR to determine the final price, so an accurate word count is very important. The main issue is that OCR tools have problems with Persian characters. How can I solve this problem?

  • Check this out. It's not a Laravel-specific package, but a pure PHP library that you can import into your project and use to read the contents of a file and then calculate the word count. `https://stackoverflow.com/questions/1004478/read-pdf-files-with-php` `http://www.fpdf.org/?lang=en` – Tohid Dadashnezhad Dec 03 '19 at 11:19
  • Word count in javascript is easy for you – A.A Noman Dec 03 '19 at 11:53

1 Answer

Try the following method; I hope it gives you the result you want:

Add PDFParser to your composer.json file and then run composer update:

{
    "require": {
        "smalot/pdfparser": "*"
    }
}

Use the code below in your controller to count the words:

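$parser = new \Smalot\PdfParser\Parser();
$pdf = $parser->parseFile("../public/1.pdf");

// Extract the text of the PDF and strip surrounding whitespace
$text = trim( $pdf->getText() );

// Do not remove the spaces: they separate the words.
// str_word_count() is not multibyte-aware, so for Persian text
// count runs of Unicode letters, marks and digits instead.
echo preg_match_all( '/[\p{L}\p{M}\p{N}]+/u', $text );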

Note: Put your PDF file in the public folder for testing.
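
Persian words can contain a zero-width non-joiner (U+200C), which a plain letter-based match would treat as a word boundary, and PHP's str_word_count() does not support multibyte text at all. The following is a minimal sketch of a counter that accounts for both, assuming the text has already been extracted as valid UTF-8. The function name countWords is hypothetical and not part of smalot/pdfparser.

```php
<?php
// Hypothetical helper for illustration; not part of any library.
function countWords(string $text): int
{
    // A word starts with a letter or digit and continues with letters,
    // combining marks, digits, or the zero-width non-joiner (U+200C)
    // that Persian uses inside a single word.
    $pattern = '/[\p{L}\p{N}][\p{L}\p{M}\p{N}\x{200C}]*/u';

    $count = preg_match_all($pattern, $text);

    // preg_match_all() returns false on failure, e.g. invalid UTF-8 input.
    return $count === false ? 0 : $count;
}

echo countWords("این یک متن آزمایشی است"), PHP_EOL;
```

It could be called as countWords($pdf->getText()). This only counts what getText() returns, so a scanned PDF without a text layer would most likely produce an empty string and a count of 0.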