Questions tagged [tess4j]

Tess4J is a Java JNA wrapper for the Tesseract OCR API.

Description

Tess4J is released and distributed under the Apache License, v2.0.

Release versions:

  • Version 1.3 (released May 31, 2014)
  • Version 2.0 Beta (released June 1, 2014)
  • Version 3.4.3 (released January 14, 2018)

Features:

The library provides optical character recognition (OCR) support for:

  • TIFF, JPEG, GIF, PNG, and BMP image formats
  • Multi-page TIFF images
  • PDF document format
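As a rough illustration of the format list above, the following sketch checks a file's extension before it is handed to an OCR call. The class name `SupportedFormats`, the method `isSupported`, and the extension-to-format mapping are assumptions made for this example; they are not part of Tess4J, which may accept or reject files on other grounds (for instance, PDF handling can depend on additional native tooling).

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class SupportedFormats {

    // Extensions inferred from the feature list (TIFF, JPEG, GIF, PNG, BMP, PDF).
    // This mapping is an assumption for illustration, not a Tess4J constant.
    private static final Set<String> EXTENSIONS = new HashSet<>(
            Arrays.asList("tif", "tiff", "jpg", "jpeg", "gif", "png", "bmp", "pdf"));

    // Returns true when the file name ends in one of the extensions above (case-insensitive).
    public static boolean isSupported(Path file) {
        Path fileName = file.getFileName();
        if (fileName == null) {
            return false; // e.g. a root path has no file name
        }
        String name = fileName.toString();
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return false; // no extension, or a trailing dot
        }
        String ext = name.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTENSIONS.contains(ext);
    }

    public static void main(String[] args) {
        // Print the check result for each path given on the command line.
        for (String arg : args) {
            System.out.println(arg + " -> " + isSupported(Paths.get(arg)));
        }
    }
}
```

A check like this only filters on the file name; it does not verify that the file content actually matches the extension.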

Links

Tess4J homepage
Tess4J GitHub

222 questions
14
votes
3 answers

Image preprocessing with OpenCV before doing character recognition (tesseract)

I'm trying to develop a simple PC application for license plate recognition (Java + OpenCV + Tess4j). The images aren't very good right now (they will be better in the future). I want to preprocess the image for Tesseract, and I'm stuck on detecting the license plate…
11
votes
3 answers

Tesseract - ERROR net.sourceforge.tess4j.Tesseract - null

I created a Java application that uses Tesseract to convert a given image or PDF to a string. When I run it on my machine as a unit test using JUnit it runs great, but when running the full system, which is a RESTful API run by Tomcat…
Adi
  • 2,074
  • 22
  • 26
9
votes
0 answers

Tesseract user-pattern is not applied

I want to do OCR on this image. It has a predefined format: the first five will be characters, the next four will be digits, and the last will be a character. When I execute the command $ tesseract in.png stdout I get the output BDVPD474SQ. So, I went for…
Bhushan
  • 1,489
  • 3
  • 27
  • 45
8
votes
1 answer

Tess4J - Error opening data file ./tessdata/eng.traineddata

I have this problem in my web application in Tomcat 9: Error opening data file ./tessdata/eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. Failed loading language…
kete nefrega
  • 221
  • 2
  • 3
  • 7
6
votes
1 answer

java.lang.UnsatisfiedLinkError: The specified module could not be found

I just downloaded Tess4J from http://tess4j.sourceforge.net/ and imported it into NetBeans. I am following this URL and followed every step properly, but when I try to execute it I get the error below. Error: Exception in thread "main"…
animal
  • 994
  • 3
  • 13
  • 35
5
votes
0 answers

How can I fine tune tesseract on custom dataset?

I know this question may not be a new one, but training or fine-tuning Tesseract is one of the hardest parts, and I could never find any articles that explain it properly. None of the tutorials or docs explain it completely; going through them…
user_12
  • 1,778
  • 7
  • 31
  • 72
5
votes
1 answer

Tess4J: "Invalid calling convention 63" despite correct versions

I try to do OCR and output as PDF using Tess4J and the following code on Linux (Ubuntu 16 Xenial). public void testOcr() throws Exception { File imageFile = new File("/projects/de.conradt.core/tessdata/urkunde.jpg"); ITesseract instance =…
Mathias Conradt
  • 28,420
  • 21
  • 138
  • 192
5
votes
2 answers

Tesseract For Java setting Tessdata_Prefix for executable jar

The ultimate goal of this project is to take the jar and put it in a directory where it uses tesseract and outputs a results directory and the output txt file. I am having some issues with tesseract, though. I am working with tess4j in Java with…
Ian
  • 287
  • 4
  • 17
5
votes
4 answers

Tess4J - Native library (linux-x86-64/libtesseract.so) not found in resource path

I'm using Tess4J (JNA wrapper around tesseract), and trying to call tess.doOCR(myFile) to OCR text from a single-page PDF. I have GhostScript installed (by using yum install ghostscript), gs -h works correctly. My app server is using 64-bit JVM, and…
Don Cheadle
  • 5,224
  • 5
  • 39
  • 54
4
votes
0 answers

Using English with Equation trained data tesseract tess4j

I'm trying to read an image with mathematical equations using tess4j in Java. However, I think it's overlapping the characters and is not able to combine English with equations. Is this a trained data issue? How can I fix this? Below is my code. …
user3310115
  • 1,372
  • 2
  • 18
  • 48
4
votes
1 answer

How can I use Tess4j with IntelliJ?

I would like to do OCR with Java, and I use IntelliJ. But I don't know which files I need for my project. My code is just a simple OCR: import net.sourceforge.tess4j.Tesseract; import net.sourceforge.tess4j.TesseractException; import…
Gabe
  • 624
  • 8
  • 19
4
votes
2 answers

JAVA Tess4j doOCR() not working, Exception "Invalid memory access"

I'm working on a dynamic web project in Eclipse. I made a TesseractOCR class that contains: public class TesseractOCR { public TesseractOCR() { } public String doOCR(String file) { System.setProperty("jna.library.path",…
Sherein
  • 947
  • 1
  • 11
  • 23
4
votes
1 answer

Tess4j on Windows 64-bit: exception on multiple threads

I am using tesseract 3 with Java 8 on Windows 64-bit to OCR scanned PDFs. I have followed the instructions on the Tess4j page and have used the 64-bit versions of the required DLLs, and have installed 64-bit Ghostscript. When I run my unit test…
Markos Fragkakis
  • 7,499
  • 18
  • 65
  • 103
3
votes
1 answer

Detecting white text on a bright background with tesseract

I'm having issues reading white text on a bright background. It finds the text itself but cannot really translate it correctly. The image: The result I keep getting is LanEerus, which is not that far off, to be honest. What I'm wondering is what…
Jonathan
  • 685
  • 1
  • 10
  • 30
3
votes
3 answers

Tess4j - Pdf to Tiff to tesseract - "Warning: Invalid resolution 0 dpi. Using 70 instead."

I am using tess4j (net.sourceforge.tess4j:tess4j:4.4.0) and trying to OCR PDF files. As I understand it, I first have to convert the PDF to TIFF or PNG (is either of those recommended?), which I did like…
timguy
  • 2,063
  • 2
  • 21
  • 40