My project is built on the Laravel framework. My site offers translation of documents, books, etc. The customer uploads the source file as a PDF; on the backend, the words in the PDF should be counted by OCR to determine the final price, so the word count is very important. The main issue is that OCR engines have problems with Persian characters. How can you help me with this problem?
1
-
Check this out. It's not a Laravel-specific package, but it is a pure PHP library that you can import into your project and use to get the content of a file and then calculate the word count. `https://stackoverflow.com/questions/1004478/read-pdf-files-with-php` `http://www.fpdf.org/?lang=en` – Tohid Dadashnezhad Dec 03 '19 at 11:19
-
Counting words in JavaScript would be easy for you – A.A Noman Dec 03 '19 at 11:53
1 Answer
1
Try the following approach; I hope it gives you the result you want:
Add PDFParser to your composer.json file and then run composer update:
{
    "require": {
        "smalot/pdfparser": "*"
    }
}
Use the code below in your controller to count the words:
$parser = new \Smalot\PdfParser\Parser();
// Parse the PDF and extract its plain text
$pdf = $parser->parseFile("../public/1.pdf");
$text = trim($pdf->getText());
// Keep the spaces in place: they are what separates one word from the next
echo str_word_count($text);
Note: Put your PDF file in the public folder for testing.
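One thing to keep in mind: PHP's str_word_count is not multibyte-aware, so it can miscount Persian text even when the text is extracted correctly. Here is a minimal sketch of a Unicode-aware count, assuming the extracted text is valid UTF-8. countWords is a hypothetical helper I made up for illustration; it is not part of smalot/pdfparser.

```php
<?php
// Hypothetical helper (not part of smalot/pdfparser): counts the
// whitespace-separated tokens that contain at least one letter or digit.
function countWords(string $text): int
{
    // Split on runs of whitespace; the /u modifier makes PCRE treat the
    // pattern and the subject as UTF-8.
    $tokens = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    if ($tokens === false) {
        return 0; // preg_split failed, e.g. on invalid UTF-8 input
    }

    $count = 0;
    foreach ($tokens as $token) {
        // Skip tokens made only of punctuation or symbols (e.g. a lone "-").
        if (preg_match('/[\p{L}\p{N}]/u', $token) === 1) {
            $count++;
        }
    }
    return $count;
}
```

In the controller you would then call countWords($pdf->getText()) instead of str_word_count. Because the split is on whitespace only, a Persian word containing a zero-width non-joiner stays one word. This only helps when the PDF has a real text layer; a scanned, image-only PDF has no text for the parser to extract.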

Mohammadreza Yektamaram
-
I tested it. It works for English but not for Persian files. How can I fix this problem? – Sanaz Movahed Dec 07 '19 at 12:36
-
I've tested it with a Persian PDF, and it was okay. Can you test with another PDF file and share your results? @sanaz – Mohammadreza Yektamaram Dec 07 '19 at 13:53