4

I'm using JSoup in my project and I've declared the dependency in my POM file. It compiles and runs just fine, but only when I use the jar with all dependencies and set the scope of the dependency to compile.

If I change the scope to provided, it still compiles just fine, but it fails at run time with a ClassNotFoundException. I have added the necessary JAR file to the CLASSPATH and PATH variables, but I'm still facing this problem.

I can get it working with the compile scope, but it's really irking me that I can't get it running with the provided scope, and I would really appreciate it if someone could help me figure out why.

Following is the error I am seeing:

java.lang.NoClassDefFoundError: Lorg/jsoup/nodes/Document;
    at java.lang.Class.getDeclaredFields0(Native Method)
    at java.lang.Class.privateGetDeclaredFields(Class.java:2300)
    at java.lang.Class.getDeclaredField(Class.java:1882)
    at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1605)
    at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:50)
    at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:423)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:411)
    at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:308)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1114)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:330)
    at backtype.storm.utils.Utils.serialize(Utils.java:52)
    at backtype.storm.topology.TopologyBuilder.createTopology(TopologyBuilder.java:94)
    at com.yahoo.amit.wordstorm.WordStormTopology.main(WordStormTopology.java:25)
Caused by: java.lang.ClassNotFoundException: org.jsoup.nodes.Document
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 14 more

Following is my POM file:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.yahoo.amit.wordstorm</groupId>
  <artifactId>wordstorm</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>wordstorm</name>
  <url>http://maven.apache.org</url>

    <repositories>
        <repository>
            <id>clojars.org</id>
            <url>http://clojars.org/repo</url>
        </repository>
    </repositories>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>storm</groupId>
      <artifactId>storm</artifactId>
      <version>0.8.2</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>

    <dependency>
      <groupId>org.jsoup</groupId>
      <artifactId>jsoup</artifactId>
      <version>1.7.2</version>
      <scope>provided</scope>
    </dependency>

  </dependencies>
  <build>
    <plugins>
            <!--
            bind the maven-assembly-plugin to the package phase
            this will create a jar file without the storm dependencies
            suitable for deployment to a cluster.
             -->
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <archive>
                        <manifest>
                            <mainClass></mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>

            </plugin>
        </plugins>
        </build>
</project>

Following are my system variables:

echo $PATH

/Users/programmerman/Summer Project/apache-maven-3.0.5/bin/:/Users/programmerman/Summer Project/storm-0.8.2/bin/:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/Users/programmerman/Summer Project/CLASSPATH/jsoup-1.7.2.jar:/Users/programmerman/Summer Project/CLASSPATH/*

echo $CLASSPATH

/Users/programmerman/Summer Project/storm-0.8.2/storm-0.8.2.jar:/Users/programmerman/Summer Project/storm-0.8.2/lib/*:/Users/programmerman/Summer Project/storm-0.8.2/conf/storm.yaml:/Users/programmerman/SummerProject/storm-starter-masterPOM/target/storm-starter-0.0.1-SNAPSHOT-jar-with-dependencies.jar:/Users/programmerman/Summer Project/CLASSPATH/jsoup-1.7.2.jar:/Users/programmerman/Summer Project/CLASSPATH/*
Matthias J. Sax
Programming Noob
  • This question and answer might help you: http://stackoverflow.com/questions/12859787/noclassdeffounderror-after-converting-simple-java-project-to-maven-project-in-ec –  Jun 12 '13 at 01:45
  • How are you actually starting the application? – noahlz Jun 12 '13 at 03:17
  • @noahlz This is actually a project for an open source tool called Storm, so I'm using this command: storm jar ./target/wordstorm-1.0-SNAPSHOT.jar com.mypack.MainClass – Programming Noob Jun 13 '13 at 00:33
  • What's your actual command line? You're missing the `java` command. – noahlz Jun 13 '13 at 01:08
  • The storm command basically spits out the java command. This is what it spits out: http://pastebin.com/cqDHpRA0 – Programming Noob Jun 13 '13 at 01:25
  • Got it. missed the `storm` part of the command – noahlz Jun 13 '13 at 11:28

3 Answers

3

This is as much a question about Maven as it is about Storm and its deployment model. You have to check out what the storm command actually does. First of all, it's actually a Python script that ultimately calls java.

If you look at the function get_classpath(extrajars), you'll note that it does not use the $CLASSPATH environment variable at all. Rather, it loads the core Storm jars and any jars that you have under a directory lib/ relative to your working directory, as well as config files under ~/.storm.

(You will find that ignoring $CLASSPATH is very common in many Java applications. Usually the first thing a "launch script" does is overwrite the CLASSPATH or not use it at all. This is to prevent unknown / unsupported / earlier versions of your jars from causing problems in your application).

As to why your application fails when jsoup is declared "provided": when you declare the jar as a provided dependency, it will not be packaged in your "jar with dependencies" assembly. See this question for a good explanation: Difference between maven scope compile and provided for JAR packaging

The tl;dr explanation is that compile scope is shipped with your uber-jar, provided scope isn't, because it's expected to be "provided" by the container you are deploying to. Typically, the "container" is a Java web server, like Tomcat (hence, you should never have to ship JSP or Servlet jars with your web apps in Java). In this case, the "container" that you are expecting to "provide" classes is Storm. However, jsoup is not provided by Storm, hence your error.
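One way to confirm this is to look inside the assembled jar and see whether the jsoup class is there. The following is a minimal sketch, not part of Storm or Maven: the helper name jar_contains_class is made up for illustration, and it only relies on a jar being an ordinary zip archive.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar holds a .class entry for the fully qualified class name.

    Hypothetical helper for illustration; not part of Storm or Maven.
    """
    # A class such as org.jsoup.nodes.Document is stored in the jar
    # as the entry org/jsoup/nodes/Document.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

Assuming the assembly ends up at target/wordstorm-1.0-SNAPSHOT-jar-with-dependencies.jar (a guess based on the POM above), calling jar_contains_class on that path with "org.jsoup.nodes.Document" should return False for a build with provided scope and True for a build with compile scope.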

Compile-scope classes still need to be shipped with your application because your application will be instantiating / using interfaces, enums, etc.

My recommendation is to just declare jsoup "compile" scope and move on. The alternative will be to write your own bespoke deployment script and/or assembly that puts jsoup under lib/ - essentially the same thing in the end.

noahlz
  • Great. Thanks for digging, I really appreciate it. Had you worked with Storm before? – Programming Noob Jun 13 '13 at 19:35
  • I ran storm-starter a while back. I mostly am just watching it longingly (no business case to use it). I know more about Maven than is healthy, however. – noahlz Jun 13 '13 at 19:53
2

The storm script doesn't use the CLASSPATH variable; instead it puts all the jars in the STORM_DIR/lib directory on its classpath. So you have 2 choices:

  1. Change the scope of the JSoup dependency to "compile" scope and have it packaged inside the jar with dependencies.
  2. Leave the JSoup dependency in "provided" scope and copy the JSoup jar to STORM_DIR/lib directory so that the storm script will automatically put that jar in its classpath.

I would strongly recommend option 1, which follows the standard Maven approach.

Just FYI, this is how the storm script creates the classpath string:

def get_classpath(extrajars):
    ret = get_jars_full(STORM_DIR)
    ret.extend(get_jars_full(STORM_DIR + "/lib"))
    ret.extend(extrajars)
    return normclasspath(":".join(ret))
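The excerpt above depends on helpers defined elsewhere in the storm script. To see the effect in isolation, here is a rough, self-contained sketch of the same shape. The get_jars_full below is a stand-in written for this example (the real one may differ), build_classpath is a made-up name, and normclasspath is left out.

```python
import glob
import os


def get_jars_full(directory):
    """List the .jar files directly inside directory (stand-in, not Storm's own code)."""
    return sorted(glob.glob(os.path.join(directory, "*.jar")))


def build_classpath(storm_dir, extrajars):
    """Join Storm's jars, the jars under lib/, and any extra jars into one string."""
    ret = get_jars_full(storm_dir)
    ret.extend(get_jars_full(os.path.join(storm_dir, "lib")))
    ret.extend(extrajars)
    # os.environ["CLASSPATH"] is never read here, so exporting it changes nothing.
    return ":".join(ret)
```

Whatever the shell's CLASSPATH holds, the result only contains jars found in the Storm directory, in its lib/ subdirectory, and the jars passed in explicitly, which is why a jsoup jar referenced only through CLASSPATH never reaches the JVM.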
0

Maven's provided scope means the dependency is on the classpath at compile time but is not packaged for run time. The container or run script is expected to supply it explicitly, so I can see you're on the right track.

Other things you can check to fix the problem are:

  1. Check the CLASSPATH environment variable in the shell instance that runs the java program. Although you already have the correct CLASSPATH value in your user shell, a new shell instance is often created (e.g. when running a script) and the CLASSPATH variable is not propagated to it. On UNIX this is typically done using the export command.
  2. Check that every classpath entry is a valid path with the correct filesystem permissions, and that the jars are not corrupted.
  3. Check the java command you used to run the program. If you specify -cp (or -classpath), it overrides the CLASSPATH environment variable.
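The second check can be scripted. The sketch below is only an illustration: missing_classpath_entries is a made-up helper, and it assumes that an entry ending in * should match at least one .jar file in that directory, which is how Java treats classpath wildcards. It does not check permissions or jar integrity.

```python
import glob
import os


def missing_classpath_entries(classpath):
    """Return the classpath entries that do not resolve to anything on disk.

    Hypothetical helper for illustration only.
    """
    missing = []
    for entry in classpath.split(os.pathsep):
        if not entry:
            continue  # skip empty segments such as a trailing separator
        if entry.endswith("*"):
            # A dir/* entry stands for the jars in dir, so require at least one.
            directory = entry[:-1]
            if not glob.glob(os.path.join(directory, "*.jar")):
                missing.append(entry)
        elif not os.path.exists(entry):
            missing.append(entry)
    return missing
```

Passing in the value of os.environ.get("CLASSPATH", "") lists the entries that point nowhere; an empty list means every entry resolved.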
gerrytan