12

I'm trying to install Tesseract-OCR on my server. I have added what I believe to be the correct repos, but when I try to install it, the package is not found.

I tried adding rpmforge, but to no avail. Any ideas from somebody who has done this before or is familiar with adding and searching through repos?

William

6 Answers

12

I used these instructions, which worked correctly on CentOS.

Install Tesseract OCR libs from sources on CentOS

Download Leptonica and Tesseract sources:

$ wget http://www.leptonica.org/source/leptonica-1.69.tar.gz
$ wget https://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.02.tar.gz
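Before compiling, it can help to confirm that the usual build tools are on the PATH. The following is only a minimal sketch: the helper name missing_tools and the list of tools are assumptions for illustration, not something the original instructions specify.

```shell
#!/bin/sh
# Hypothetical helper: print every tool from the argument list
# that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed list of tools that autogen.sh, configure and make typically need.
missing=$(missing_tools gcc g++ make autoconf automake libtool)
if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
else
    echo "All listed build tools were found."
fi
```

On CentOS, anything reported as missing can usually be installed with sudo yum install gcc gcc-c++ make autoconf automake libtool.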

Configure, compile, install libs:

$ tar xzvf leptonica-1.69.tar.gz
$ cd leptonica-1.69
$ ./configure
$ make
$ sudo make install
$ cd ..

$ tar xzf tesseract-ocr-3.02.02.tar.gz
$ cd tesseract-ocr
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install
$ sudo ldconfig

Download languages (english) and copy to tessdata folder:

$ wget http://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.eng.tar.gz       
$ tar xzf tesseract-ocr-3.02.eng.tar.gz       
$ sudo cp tesseract-ocr/tessdata/* /usr/local/share/tessdata

and enjoy it ;)
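To double-check that the language data ended up where Tesseract looks for it, a small test like the one below may help. It is a sketch only: has_traineddata is a hypothetical helper name, and the default directory is simply the /usr/local/share/tessdata path used in the copy step above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the traineddata file for language $1
# exists in directory $2 (default: /usr/local/share/tessdata).
has_traineddata() {
    lang=$1
    dir=${2:-/usr/local/share/tessdata}
    [ -f "$dir/$lang.traineddata" ]
}

if has_traineddata eng; then
    echo "eng.traineddata is in place"
else
    echo "eng.traineddata is missing from /usr/local/share/tessdata"
fi
```

Recent Tesseract versions also provide tesseract --list-langs, which should report the same languages once the binary is on the PATH.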

mij
Yuseferi
  • Getting this on Red Hat Linux: ./configure $ make $ sudo make install configure: WARNING: you should use --build, --host, --target configure: WARNING: invalid host type: $ configure: WARNING: you should use --build, --host, --target configure: WARNING: you should use --build, --host, --target configure: WARNING: invalid host type: $ checking build system type... Invalid configuration `$': machine `$' not recognized configure: error: /bin/sh config/config.sub $ failed – Aadam Oct 06 '16 at 13:11
  • The links https://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.02.tar.gz and http://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.eng.tar.gz are giving a 404 – Adarsh Tiwari Jul 26 '17 at 15:25
  • For Tesseract releases links you may use this page: https://github.com/tesseract-ocr/tesseract/releases – Maksym Ganenko Sep 15 '17 at 07:01
  • Just an addition: updated releases can be found at http://www.leptonica.org/download.html/ (leptonica-1.76.0.tar.gz) and https://github.com/tesseract-ocr/tesseract/releases/ (tesseract-4.0.0-beta.3.tar.gz) – Anu Jul 23 '18 at 19:32
6

I recommend installing from an RPM, available here: http://pkgs.org/download/tesseract There are also several dependencies: libpng-devel, libjpeg-devel, libtiff-devel, zlib and leptonica. The last two can also be found on the same RPM site.

aboev
5

I have written a bash script to install Tesseract 3.05 on CentOS 7. It fetches and installs all dependencies, and also installs language files for English, Hindi, Bengali and Thai.

Code available on GitHub

https://github.com/EisenVault/install-tesseract-redhat-centos

Hope this helps.

Vipul Swarup
4

This worked for me:

/usr/bin/yum --enablerepo epel-testing install tesseract.x86_64 tesseract-langpack-fra.noarch

tesseract is not in the epel repository but in the epel-testing repo, which is not enabled by default.
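To see which repo actually carries the package before installing, something along these lines could be used. This is a rough sketch: pkg_in_repo is a hypothetical helper, and it assumes both repos are defined on the machine (yum reports an error for an unknown repo id, which the sketch treats the same as "not found"). A package that is already installed will not show up under list available either.

```shell
#!/bin/sh
# Hypothetical helper: succeed if yum can see package $2
# once repo $1 is temporarily enabled.
pkg_in_repo() {
    yum --quiet --enablerepo="$1" list available "$2" >/dev/null 2>&1
}

if command -v yum >/dev/null 2>&1; then
    for repo in epel epel-testing; do
        if pkg_in_repo "$repo" tesseract; then
            echo "tesseract is available with --enablerepo=$repo"
        else
            echo "tesseract was not found with --enablerepo=$repo"
        fi
    done
else
    echo "yum not found; this check only applies to yum-based systems"
fi
```

Running yum repolist all additionally shows every configured repo together with its enabled or disabled status.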

Little Gecko
3

Install Tesseract OCR libs from sources (UPDATED as of 14th July 2018)

Download Leptonica and Tesseract sources:

$ wget http://www.leptonica.com/source/leptonica-1.76.0.tar.gz

$ wget https://sourceforge.net/projects/tesseract-ocr-alt/files/tesseract-ocr-3.02.02.tar.gz

Configure, compile, install Leptonica:

$ tar xzvf leptonica-1.76.0.tar.gz
$ cd leptonica-1.76.0
$ ./configure && make && sudo make install
$ cd ..

Configure, compile, install Tesseract:

$ tar xzf tesseract-ocr-3.02.02.tar.gz
$ cd tesseract-ocr
$ ./autogen.sh && ./configure && make && sudo make install && sudo ldconfig

Download language file:

I am downloading the English language file (eng.traineddata) here. You can see the complete list of language files and download the ones you need here: https://github.com/tesseract-ocr/tesseract/wiki/Data-Files#data-files-for-version-302

Download languages (english) and copy to tessdata folder:

$ wget https://sourceforge.net/projects/tesseract-ocr-alt/files/tesseract-ocr-3.02.eng.tar.gz
$ tar xzf tesseract-ocr-3.02.eng.tar.gz
$ sudo cp tesseract-ocr/tessdata/* /usr/local/share/tessdata

Now Tesseract OCR is installed and ready to use! Example (Tesseract appends .txt to the output base name, so this writes /path/to/output/abc.txt):

$ tesseract /path/to/input/test.jpg /path/to/output/abc -l eng

Enjoy!!!
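If the shell reports that the tesseract command cannot be found after the build, it is worth checking whether the binary was installed at all and whether its directory is on the PATH. The sketch below assumes the default autotools prefix, so make install would place the binary in /usr/local/bin; locate_bin is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the path of executable $1, looking on PATH
# first and then in directory $2 (default: /usr/local/bin).
locate_bin() {
    name=$1
    fallback_dir=${2:-/usr/local/bin}
    if command -v "$name" >/dev/null 2>&1; then
        command -v "$name"
    elif [ -x "$fallback_dir/$name" ]; then
        printf '%s\n' "$fallback_dir/$name"
    else
        return 1
    fi
}

if path=$(locate_bin tesseract); then
    echo "tesseract found at: $path"
else
    echo "tesseract not found; check the output of 'make install' for errors"
fi
```

If the binary only turns up in /usr/local/bin, adding that directory to the PATH (for example export PATH=$PATH:/usr/local/bin) should make the command available.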

Neeraj Kumar
  • I ran all the commands, but when I check the Tesseract version afterwards (tesseract -v) it shows bash: tesseract: command not found @Neeraj Kumar – mayur panchal Feb 25 '19 at 11:58
  • Have you completed all these steps successfully, without any error? 1. tar xzf tesseract-ocr-3.02.02.tar.gz 2. cd tesseract-ocr 3. ./autogen.sh && ./configure && make && sudo make install && sudo ldconfig – Neeraj Kumar Feb 26 '19 at 12:07
0


yum install --nogpgcheck tesseract

After installation, run the following command to test it: tesseract --version

iwilldo