
My JavaFX program uses PDFBox to process PDFs that embed JPEG2000 images.

I included the required dependencies in the pom.xml file as described on the PDFBox web site. This is an extract of the "dependencies" section (I also tried adding <scope>compile</scope> below each <version> tag, except for pdfbox, which is included):

<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.21</version>
    <type>jar</type>
</dependency>

<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>jbig2-imageio</artifactId>
    <version>3.0.2</version>
</dependency>
<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-core</artifactId>
    <version>1.4.0</version>
    <type>jar</type>
</dependency>  
<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-jpeg2000</artifactId>
    <version>1.4.0</version>
</dependency>

And the build section from pom.xml reads as follows (it uses Maven Assembly Plugin to gather all jars as explained here by Baeldung):

<build>

    <resources>
        <resource>
            <directory>src/main/resources</directory>
            <excludes>
                <exclude>linux-x86-64/**</exclude>
            </excludes>                
        </resource>
    </resources>

    <plugins>

        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-assembly-plugin</artifactId>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                    <configuration>
                        <archive>
                            <manifest>
                                <mainClass>${mainClass}</mainClass>
                            </manifest>
                            <!--Adds custom entries in manifest-->
                            <manifestEntries>

                                <Implementation-Version>${project.version}</Implementation-Version>
                                <Implementation-Title>${project.artifactId}</Implementation-Title>
                                <Built-By>Me</Built-By>
                            </manifestEntries>
                        </archive>
                        <descriptorRefs>
                            <descriptorRef>jar-with-dependencies</descriptorRef>
                        </descriptorRefs>
                    </configuration>
                </execution>
            </executions>
        </plugin>

        <plugin>
            <groupId>org.codehaus.mojo</groupId>
            <artifactId>exec-maven-plugin</artifactId>
            <version>1.6.0</version>
            
            <configuration>
                <mainClass>${mainClass}</mainClass>
            </configuration>
        </plugin>

        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.8.0</version>
            <configuration>

                <source>11</source>
                <target>11</target>                    
            </configuration>
        </plugin>           

    </plugins>

</build>

I tested with JUnit in NetBeans the method that uses Java Advanced Imaging I/O Tools and it works as expected.

However, when running the app either from NetBeans (run triangle which leads to mvn "-Dexec.args=-classpath %classpath my.package.MainApp" -DOMP_THREAD_LIMIT=1 -DskipTests=true exec:java) or from command line I get:

Cannot read JPEG2000 image: Java Advanced Imaging (JAI) Image I/O Tools are not installed

If I look at the project dependency tree in NetBeans, I can see jai-imageio-core-1.4.0.jar and jai-imageio-jpeg2000-1.4.0.jar.

If I look at the generated jar with dependencies, jai-imageio-core and jai-imageio-jpeg2000 are present:

[Screenshot: contents of the jar with dependencies]

I built the program with the standard "Build" menu in NetBeans, which runs mvn clean install under the hood.

I also looked at this question, which looks similar, but what it describes does not differ from what I did.

What is happening? Why are the jars found during tests but not when the program is run?
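One way to narrow this down is to ask ImageIO directly which readers it has registered, once from a test and once from inside the running application. The sketch below is only a diagnostic aid: the class name `ImageReaderProbe` is hypothetical, and the format name "JPEG2000" is assumed from the wording of the error message.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;

// Hypothetical helper, not part of PDFBox or JAI.
public class ImageReaderProbe {

    /** Returns the class names of all ImageIO readers registered for a format name. */
    public static List<String> readerClassNames(String formatName) {
        List<String> names = new ArrayList<>();
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        while (readers.hasNext()) {
            names.add(readers.next().getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Assumption: "JPEG2000" is the format name being looked up.
        System.out.println("JPEG2000 readers: " + readerClassNames("JPEG2000"));
        // The context class loader often differs between a test run and a GUI thread.
        System.out.println("Context class loader: "
                + Thread.currentThread().getContextClassLoader());
    }
}
```

Calling `readerClassNames("JPEG2000")` at the start of the application and again right before the PDF is processed would show whether the reader is missing from the start or only in a particular context.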

Edit: The only difference between the program and the tests is that the program uses JavaFX while the JUnit test doesn't. Does that make sense?

Edit 2: I ran the following experiment: I removed the jai and jpeg2000 dependencies from the parent's pom, ran the test, and it failed. So these dependencies are indeed taken into account, but only during tests.

Edit 3: The error does not show up if I remove the Tika dependency (but then the program does not work as expected). And although the PDFBox 2.0.21 dependency comes before the Tika parser in the pom, the dependency graph shows the warning "An older version of PDFBox (2.0.15) is required by Tika parser".
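When two versions of a library compete, it can help to print where a class was actually loaded from at run time. The following is a minimal sketch, assuming the class names to inspect are passed as program arguments; `ClassOrigin` is a hypothetical name. The reported version comes from the jar manifest and may be null, and inside an assembled jar it reflects the assembled manifest rather than the original library's.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper for inspecting which jar a class comes from.
public class ClassOrigin {

    /** Describes the location and manifest version of the jar that provided a class. */
    public static String describe(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        URL location = (source == null) ? null : source.getLocation();
        Package pkg = type.getPackage();
        String version = (pkg == null) ? null : pkg.getImplementationVersion();
        return type.getName()
                + " from " + (location == null ? "JDK runtime (no code source)" : location.toString())
                + ", Implementation-Version=" + version;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Example argument: org.apache.pdfbox.pdmodel.PDDocument
        for (String className : args) {
            System.out.println(describe(Class.forName(className)));
        }
    }
}
```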

Edit 4: I have now spotted the problem, and it has nothing to do with the initial title. A long time ago I extended PDFStreamEngine with a PrintImageLocations class in order to get the size of the images embedded in a PDF. This class is the parent of a PrintImageLocationsImproved extends PrintImageLocations class (this one is described in a SO answer), which I wrote to add public methods returning the width and height of the image.

I wanted to know whether this was a JavaFX problem (spoiler: it is not), so I called a method involving the custom PrintImageLocations at the very beginning of the program (watch out, this does not exactly follow the SO answer cited above). Automagically, the business logic part of the program then worked as expected (embedded JPEG2000 images could be processed). But if I don't call this custom PrintImageLocations class before the actual business logic code calls it, every subsequent call shows the "JAI I/O Tools missing" error. So I am still missing one piece of the puzzle, but I think it is linked to this custom class not loading the JPEG2000 dependency for whatever reason. The stack trace reads as follows:

at org.apache.pdfbox.filter.Filter.findImageReader(Filter.java:163)
at org.apache.pdfbox.filter.JPXFilter.readJPX(JPXFilter.java:92)
at org.apache.pdfbox.filter.JPXFilter.decode(JPXFilter.java:58)
at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:77)
at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:175)
at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:163)
at org.apache.pdfbox.pdmodel.common.PDStream.createInputStream(PDStream.java:236)
at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.<init>(PDImageXObject.java:140)
at org.apache.pdfbox.pdmodel.graphics.PDXObject.createXObject(PDXObject.java:70)
at org.apache.pdfbox.pdmodel.PDResources.getXObject(PDResources.java:426)
at myPackage.myProg.PrintImageLocations.processOperator(PrintImageLocations.java:71)
at myPackage.myProg.PrintImageLocationsImproved.processOperator(PrintImageLocationImproved.java:69)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:503)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:477)
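The trace ends in a reader lookup, and the fact that an earlier call changes the outcome hints that plug-in registration may depend on which thread or class loader triggers it first. The JDK offers `ImageIO.scanForPlugins()`, which rescans the application class path for ImageIO plug-ins. The sketch below (with the hypothetical class name `ImageIoPlugins`) forces such a rescan and then reports whether a reader is available. Whether this resolves the problem here is an assumption that would need to be tested, for example by calling it at the start of the application and again on the thread that processes the PDF.

```java
import javax.imageio.ImageIO;

// Hypothetical helper; only standard javax.imageio calls are used.
public class ImageIoPlugins {

    /** Rescans for ImageIO plug-ins, then reports whether a reader for the format exists. */
    public static boolean rescanAndCheck(String formatName) {
        // Registers any service providers visible from the current context.
        ImageIO.scanForPlugins();
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // Assumption: "JPEG2000" is the format name being looked up.
        System.out.println("JPEG2000 reader available after rescan: "
                + rescanAndCheck("JPEG2000"));
    }
}
```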

Edit 5: I removed JAI imageio core from the parent pom.xml, so it is now provided by tika-parser, and it works in tests and in NetBeans. But it still does not work from the CLI. What is weird is that -verbose:class in java -Dprism.order=sw --module-path /home/user/pathTo/lib/javafx-sdk-11.0.2/lib --add-modules ALL-MODULE-PATH -verbose:class -jar target/myBig-jar-with-dependencies.jar shows the JPEG2000 classes being loaded from the jar:

[7,434s][info][class,load] com.github.jaiimageio.jpeg2000.impl.J2KImageReader source: file://path/to/project/target/myJar_with_dependencies.jar
[7,434s][info][class,load] javax.imageio.ImageReadParam source: jrt:/java.desktop
[7,435s][info][class,load] com.github.jaiimageio.jpeg2000.J2KImageReadParam source: file://path/to/project/target/myJar_with_dependencies.jar
[7,435s][info][class,load] com.github.jaiimageio.jpeg2000.impl.J2KImageReadParamJava source: file://path/to/project/target/myJar_with_dependencies.jar
[7,435s][info][class,load] com.github.jaiimageio.jpeg2000.impl.J2KMetadata source: file://path/to/project/target/myJar_with_dependencies.jar

Final Edit:

Now I got it: to make it run from tests, inside NetBeans, and from the CLI, I have to remove the direct dependency on jai-imageio-core in pom.xml. If I then run it with Maven, /pathToNetbeans/mvn "-Dexec.args=-classpath %classpath my.package.MainApp" -DOMP_THREAD_LIMIT=1 -DskipTests=true exec:java, it works as expected (it works even without replacing my.package.MainApp, probably because the main entry point is already written in the pom.xml). But running it with java -Dprism.order=sw --module-path /home/user/pathTo/lib/javafx-sdk-11.0.2/lib --add-modules ALL-MODULE-PATH -verbose:class -jar target/myBig-jar-with-dependencies.jar instead of Maven no longer works, unless I put the "JPEG2000" and "JAI ImageIO core" jars inside the lib path given to --module-path (but then it is harder to maintain, with some dependencies in the fat jar and other ones on disk).
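Since the classes are in the fat jar but no reader is found, one thing that may be worth checking is the service registration file inside that jar. ImageIO plug-ins are declared in `META-INF/services/javax.imageio.spi.ImageReaderSpi`, and a single jar can hold only one file at that path, so what survives depends on how the assembly handled duplicates. The sketch below lists the providers named in that file. `ServiceFileCheck` is a hypothetical helper, and it is an assumption that several of the bundled jars each ship their own copy of this file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.zip.ZipEntry;

// Hypothetical diagnostic helper.
public class ServiceFileCheck {

    static final String READER_SPI = "META-INF/services/javax.imageio.spi.ImageReaderSpi";

    /** Returns the provider class names listed in a service file inside a jar. */
    public static List<String> listedProviders(String jarPath, String entryName)
            throws IOException {
        List<String> providers = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            ZipEntry entry = jar.getEntry(entryName);
            if (entry == null) {
                return providers; // no service file at all
            }
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(jar.getInputStream(entry), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    int hash = line.indexOf('#'); // strip comments
                    if (hash >= 0) {
                        line = line.substring(0, hash);
                    }
                    line = line.trim();
                    if (!line.isEmpty()) {
                        providers.add(line);
                    }
                }
            }
        }
        return providers;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the assembled jar
        System.out.println(listedProviders(args[0], READER_SPI));
    }
}
```

If the printed list lacks the JPEG2000 reader provider while the reader classes themselves are in the jar, the registration file was probably overwritten during assembly rather than merged.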

HelloWorld
  • Do you run it in NetBeans or from the command line? If it is from the command line, then it is because NetBeans adds dependencies to the classpath automatically. How do you build the final jar? Are you using maven/gradle to build it? – pringi Feb 17 '22 at 15:22
  • @pringi thanks for your comment. I tried both and both methods result in the same behavior. I use maven to build it. I will update my question. – HelloWorld Feb 17 '22 at 15:45
  • @TilmanHausherr thanks for your comment. I will update my question. – HelloWorld Feb 17 '22 at 15:48
  • Some of your dependencies have "jar" and some don't; I don't know if this is relevant. What I'm missing in your pom is what I do: using the maven-dependency-plugin to copy the relevant jar files into a "lib" subdirectory. – Tilman Hausherr Feb 18 '22 at 03:57
  • @TilmanHausherr I saw that too for the jar, but it was not relevant so I left it as it was (it's an "old" project; maybe it was needed at some point). Are you implying I should have a maven-dependency-plugin section in the plugins section? – HelloWorld Feb 18 '22 at 04:14
  • You should at least try. And then include the lib directory in your classpath. I'm not a maven expert; it took me a lot of time to get a working pom.xml 10 years ago and I have been using that one ever since (with occasional improvements). See https://pastebin.com/4maHgz4E and adjust. – Tilman Hausherr Feb 18 '22 at 08:07
  • *"Edit 3 : Error does not show up if I remove tika dependency."* - I'm not really into JAI. But could it be that there is some jar manifest entry or some other META-INF file required for a JAI module, and that Tika also has that entry / file and overrides the JPEG2000 one? – mkl Feb 18 '22 at 10:30
  • @TilmanHausherr mine is not as old as yours but I haven't touched it since then! Thanks for sharing yours. @mkl it must be something along these lines. I will look at Tika more closely. Though it's weird because Tika comes after pdfbox and JAI in pom.xml, so as far as I've always understood it should not take precedence ("the first one wins"). – HelloWorld Feb 18 '22 at 14:00
  • @mkl The solution is near. Indeed the [Tika 2.0.0 doc](https://tika.apache.org/2.0.0/index.html) reads "This parser no longer warns if the jpeg2000 dependency is not included. Tika now relies on PDFBox to log an error if a jpeg2000 image should be processed but can't because the required external dependency is not available." and I am using 1.18. So Tika may be involved! – HelloWorld Feb 18 '22 at 14:47
  • @TilmanHausherr do you have any idea where I could look to find out why the tests pass whereas the same code in the JavaFX GUI doesn't work? Because it is misleading (tests pass / production fails), I would prefer the tests to fail too. – HelloWorld Feb 19 '22 at 06:22
  • A test can't predict whether the content of your production classpath is missing something. Tika and PDFBox don't distribute the JPEG2000 decoder for license reasons but can use it in a test, because "using in a build" isn't distributing. – Tilman Hausherr Feb 19 '22 at 12:11
  • @TilmanHausherr You mean that the JPEG2000 dependency could be integrated to PDFBox with a test. But then I am using the same parent pom.xml with a SpringBoot app (Spring framework takes care of building the jar), and there it works as expected. So maybe I should look at how I handle PDF in this other project. What a mess! – HelloWorld Feb 19 '22 at 12:49
  • @mkl A long time ago I got inspired by [your solution](https://stackoverflow.com/a/50588214/6351897) and then created a local `PrintImageLocations` class which offered public methods for getWidth and getHeight (see my Edit 4). Now I think the problem comes from that! – HelloWorld Feb 20 '22 at 04:47
  • This is more and more confusing. I recommend you create a minimal project with the problem that you can share in full and delete this question and create a new one. – Tilman Hausherr Feb 20 '22 at 14:39
  • Hhmmmm, now this sounds like some class loading issue may be involved. Like the default class loader used when running `PrintImageLocations` can find the jpeg2000 jai module and register it while the business logic part has a different class loader that cannot see the module. So if the module is not already registered when executing the business logic, an attempt to find the module fails. And if it already is registered, the registered module can be used without further ado. – mkl Feb 20 '22 at 15:39
  • @TilmanHausherr I will try and create one if it is not a silly configuration issue! – HelloWorld Feb 21 '22 at 03:59
  • @mkl I tried printing the class path with `System.out.println(System.getProperty("java.class.path").replace(":","\n"));` when `PrintImageLocationsImproved` is called. Surprisingly, during tests all the jars (even JPEG2000) appear. On the contrary, within the GUI app it only prints a single jar: `/path/to/netbeans/java/maven/boot/plexus-classworlds-2.5.2.jar`, which takes care of [the class loading mechanism](https://codehaus-plexus.github.io/plexus-classworlds/index.html). – HelloWorld Feb 21 '22 at 04:07
  • @TilmanHausherr I also tried your suggestion (copying dependencies with maven plugin) without success. The behavior remains the same. – HelloWorld Feb 21 '22 at 05:46
  • @HelloWorld *"plexus-classworlds-2.5.2.jar"* - Have you made sure that the JPEG2000 module is available in classworlds realm of your business logic? – mkl Feb 21 '22 at 08:33
  • @mkl no I haven't, because I have never touched it and could not locate it (it being the classworlds config file)! But you're right, it would surely help. Do you know where to put the config file for classworlds? – HelloWorld Feb 21 '22 at 10:01
  • @mkl I tried to put `classworlds.conf` in /src/main/resources and wrote the following contents `main is my.package.myProgram from app [app] load ${app.home}/lib/jai/*.jar` And I launched my program with `mvn "-Dexec.args=-classpath %classpath myPackage.myProgram.MyMain"` **-Dclassworlds.conf=$APP_HOME/resources/classworlds.conf** `-DskipTests=true org.codehaus.mojo:exec-maven-plugin:1.5.0:java`. But I did not notice any change. – HelloWorld Feb 21 '22 at 13:56
  • @mkl I am not sure classworlds is the way to go, since following their [tutorial](https://codehaus-plexus.github.io/plexus-classworlds/apiusage.html) produces deprecation warnings. – HelloWorld Feb 21 '22 at 14:13
  • Well, I hadn't known _classworlds_ before reading your comment about the class paths you printed out. Thus, no, I have no idea how to best make it work for you. And looking at its source, it appears to not have developed considerably for a number of years... – mkl Feb 21 '22 at 16:50
  • @mkl I don't know where it comes from, do you see it too in your projects ? I don't know if it's a Netbeans thing (I moved to NB 12 and it's still there) because I haven't set it anywhere in my code. – HelloWorld Feb 22 '22 at 05:07
  • Maybe from netbeans (I use eclipse). Have you tried running your code using the assembled big jar from the command line? What happens there? – mkl Feb 22 '22 at 06:52
  • Probably related: https://stackoverflow.com/q/28529414/1729265 – mkl Feb 22 '22 at 14:32
  • The jpeg2000 jar needs the related core imaging jar to work. – Tilman Hausherr Feb 25 '22 at 07:12
  • Right @TilmanHausherr, I was expecting to find jaiimageio-**core** (like the name of the jar)! I also tried what you suggested moons ago `System.out.println(Arrays.toString(ImageIO.getReaderFileSuffixes()));` and it showed "[jpg, tiff, bmp, pcx, gif, png, raw, ppm, **jp2**, JBIG2, tif, pgm, wbmp, jpeg, pbm, jb2, JB2, jbig2]" – HelloWorld Feb 25 '22 at 13:47
  • @mkl I found something interesting (see my final Edit) : if the assembled big jar is launched from maven it works as expected (from Netbeans or from CLI). However launching it with `java` does not work as expected ("JAI not installed"). I have to find the equivalent java command to this mvn command. – HelloWorld Feb 25 '22 at 14:27
  • It's a bit of hell here... – mkl Feb 25 '22 at 18:18

1 Answer

By default, building a jar file with Maven does not include the dependencies in the target jar. This is typically what you want when creating a library to be included in a bigger program. Search for "Maven uber jar" to find out how to build a runnable jar.

kiwiron
  • Thank you kiwiron for your answer. But the other jars (PDFBox, Tika, ...) are included in the target jar, and even these ones are too, aren't they (see the screen capture above)? Only these ones are missing. Anyway, I will look at Uber Jar. – HelloWorld Feb 17 '22 at 19:25
  • My pom.xml already features the Maven Assembly Plugin, which ["allows users to aggregate the project output along with its dependencies, modules, site documentation, and other files into a single, runnable package"](https://www.baeldung.com/executable-jar-with-maven#thymeleaf-1). So it should already work, shouldn't it? – HelloWorld Feb 18 '22 at 03:51