-1

I have been trying, unsuccessfully so far, to get Tess4J to work in NetBeans. I am following the tutorial here: http://tess4j.sourceforge.net/tutorial/

I have followed it word for word, but I get this error message:

"Error opening data file ./tessdata/eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. Failed loading language 'eng' Tesseract couldn't load any languages!"

Can someone tell me what this means please, and how to rectify it?

I have a screenshot here of the project in NetBeans:-

[Screenshot: the project in NetBeans]

Jeremy Watts
  • Are you on a windows machine? – sorifiend Aug 26 '18 at 08:25
  • have you considered setting the environment variable that it is complaining about? –  Aug 26 '18 at 09:08
  • @sorifiend Yes, I am – Jeremy Watts Aug 26 '18 at 09:29
  • @feelingunwelcome I have to admit ignorance here. I am doing this whole thing as a project for work. Even though my programming skills aren't bad, I am not the most IT-minded of people. I don't even know what the error is saying, or what it means, let alone how to address it. I have googled the error, and not found much that sheds any light. – Jeremy Watts Aug 26 '18 at 09:35
  • I have an idea, @sorifiend, and I am being serious here. I will pay you to take remote control of my laptop and configure this for me. I am being serious here. I will paypal you the money. Let me know if you are interested. – Jeremy Watts Aug 27 '18 at 12:45
  • 1
    @JeremyWatts I would rather answer on here only so that you can do it yourself. You need to download "eng.traineddata" from here: https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata and then put it into the "tessdata" folder inside your project folder. – sorifiend Aug 28 '18 at 06:37

3 Answers

1

Set the full data path, like below:

process.setDatapath("F:/Jar/Tess4J-3.4.8-src/Tess4J/tessdata");

Or put the tessdata files in the project root directory. It should then work fine.
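Before passing a folder to `setDatapath`, it can help to confirm that the folder really contains `eng.traineddata`. The following is only a minimal sketch using the standard library; the class name `TessdataCheck`, the method `hasLanguageFile` and the default folder name are made up for illustration and are not part of Tess4J.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TessdataCheck {

    /** Returns true if dataPath is a directory that holds <language>.traineddata. */
    static boolean hasLanguageFile(Path dataPath, String language) {
        Path trained = dataPath.resolve(language + ".traineddata");
        return Files.isDirectory(dataPath) && Files.isRegularFile(trained);
    }

    public static void main(String[] args) {
        // Hypothetical default: a "tessdata" folder under the working directory.
        // Pass your own folder as the first argument instead.
        Path dataPath = Paths.get(args.length > 0 ? args[0] : "tessdata").toAbsolutePath();
        System.out.println("Checking " + dataPath);
        if (hasLanguageFile(dataPath, "eng")) {
            System.out.println("Found eng.traineddata; this folder can be given to setDatapath.");
        } else {
            System.out.println("eng.traineddata is missing from this folder.");
        }
    }
}
```

If the check fails, download `eng.traineddata` into that folder, or point the check at the folder where the file actually lives.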

0

The additional required files are missing from your environment variables.

We can see on the instruction page you linked:

Since the DLLs are built using Visual Studio 2015/2017, please ensure you have Visual C++ 2015 Redistributable or VC++ 2017 Redistributable installed.

The fastest way to fix your issues is to make sure you have VC++ 2017 Redistributable installed.

Alternately you could get the required files elsewhere and manually add them to your project location or elsewhere in your classpath.

Edit: If you are not on a windows machine, or simply want to build the library fully, then see here: https://github.com/tesseract-ocr/tesseract/wiki/Compiling

sorifiend
  • Why the downvote? Please add a comment or make a suggestion if there is a problem. – sorifiend Aug 26 '18 at 09:13
  • I've installed the VC++ 2017 Redistributable, and still no luck running it – Jeremy Watts Aug 26 '18 at 11:56
  • 1
    @JeremyWatts Do you know where your "tessdata" folder is located? If you followed the tutorial, it should be in your project's root directory. In NetBeans, select the Files view and it should show up. If it is not there, then that's your problem. If it is there, then make sure that it contains the eng.traineddata file. – sorifiend Aug 26 '18 at 12:22
  • 1
    Ok, yeah you need to grab the language packs. See this question and answers for a whole lot of good info: https://stackoverflow.com/questions/14800730/tesseract-running-error Specifically you can find them all here: https://github.com/tesseract-ocr/tessdata – sorifiend Aug 26 '18 at 12:26
  • The walkthroughs I've seen online for installing this thing have all started by locating the Tess4J project, and then dragging and dropping files from it into a new project called OCR (or some other name). Is it not better to import the Tess4J project into the one you are working on? – Jeremy Watts Aug 27 '18 at 05:18
  • If you are using Tess4J in a linked project instead then that is OK, but you need to include the `eng.traineddata` file in that project, because it is currently missing. – sorifiend Aug 27 '18 at 08:44
  • Ok, I have started the whole thing again, using this walkthrough: http://tess4j.sourceforge.net/tutorial/ I have followed the NetBeans example word for word, yet still no luck getting it to run. The eng.traineddata file is definitely in the tessdata folder, yet I am still getting the error message "Failed loading language 'eng' Tesseract couldn't load any languages!" – Jeremy Watts Sep 16 '18 at 08:38
  • Try putting `eng.traineddata` in the project's root directory then. I can only guess because you said that you "followed all the instructions", but it sounds like your IDE does not know about the tessdata folder for some reason. Also make sure that the tessdata folder is in the same location as the `tess4j.jar` lib file. – sorifiend Sep 16 '18 at 08:42
0

ITesseract instance = new Tesseract();
instance.setDatapath("C:\\Users\\Tux\\Documents\\tessdata");

This worked for me. You can put the language file in the 'tessdata' folder, and you can create the 'tessdata' folder anywhere.
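The error in the question mentions the relative path `./tessdata/eng.traineddata`, so when no explicit data path is set it may help to see which folder that relative path points at. The sketch below is a diagnostic only, and assumes the relative path is resolved against the directory the JVM was started in (`user.dir`); the class name `WhereIsTessdata` and the method `resolveAgainst` are hypothetical and not part of Tess4J.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class WhereIsTessdata {

    /** Resolves a relative path against the given working directory. */
    static Path resolveAgainst(String workingDir, String relative) {
        return Paths.get(workingDir).resolve(relative).normalize();
    }

    public static void main(String[] args) {
        // Directory the JVM was started in; an IDE usually sets this to the project folder.
        String workingDir = System.getProperty("user.dir");
        System.out.println("Working directory: " + workingDir);
        System.out.println("./tessdata resolves to: " + resolveAgainst(workingDir, "./tessdata"));
        // Prints "null" if the variable is not set for this process.
        System.out.println("TESSDATA_PREFIX = " + System.getenv("TESSDATA_PREFIX"));
    }
}
```

If the printed folder is not the one that holds `eng.traineddata`, either move the tessdata folder there or set an absolute path with `setDatapath` as shown in the answers above.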