Hadoop 2.7.1. I have an Eclipse (Maven) project. I'm able to run the WordCount Hadoop example, so Hadoop is correctly configured. If I try to instantiate an object that uses the Model class from Apache Jena, the following error is thrown at runtime:
Exception in thread "main" java.lang.NoClassDefFoundError:
com/hp/hpl/jena/rdf/model/Model
at hadoop.wordcount.WordCount.main(WordCount.java:20)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: com.hp.hpl.jena.rdf.model.Model
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 7 more
The same object (the one that uses Jena) works in a standalone (non-Hadoop) project. The same error happens when I try to run the stats Hadoop example. To create the jar I run:
mvn clean package
It seems that if I try to use classes from other libraries, those classes are not "included" in the generated jar. Where am I going wrong? Any suggestions? I'm out of ideas!
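For context, the Jena dependency is declared in the pom.xml more or less like this (the coordinates and version here are illustrative, not copied verbatim; any Jena 2.x build that still ships the com.hp.hpl.jena packages looks the same). As far as I understand, a plain mvn clean package only packages the project's own classes, not the classes of its dependencies:

<dependency>
  <!-- illustrative coordinates: Jena 2.x provides the com.hp.hpl.jena.* packages -->
  <groupId>org.apache.jena</groupId>
  <artifactId>jena-core</artifactId>
  <version>2.13.0</version>
</dependency>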
EDIT #1:
I tried to compile with:
mvn clean compile assembly:single
using this assembly config:
<assembly>
  <id>hadoop-job</id>
  <formats>
    <format>jar</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <dependencySets>
    <dependencySet>
      <unpack>false</unpack>
      <scope>provided</scope>
      <outputDirectory>lib</outputDirectory>
      <excludes>
        <exclude>${groupId}:${artifactId}</exclude>
      </excludes>
    </dependencySet>
    <dependencySet>
      <unpack>true</unpack>
      <includes>
        <include>${groupId}:${artifactId}</include>
      </includes>
    </dependencySet>
  </dependencySets>
</assembly>
but I'm still facing the same problem.
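For completeness, the descriptor above is wired into the build roughly like this (the plugin version and the descriptor path are assumptions, not copied from my actual pom.xml):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <version>2.6</version>
  <configuration>
    <descriptors>
      <!-- descriptor path is an assumption -->
      <descriptor>src/main/assembly/hadoop-job.xml</descriptor>
    </descriptors>
  </configuration>
</plugin>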
EDIT #2: In my case this worked:
In pom.xml, include this plugin:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.4.3</version>
  <configuration>
    <shadedArtifactAttached>true</shadedArtifactAttached>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <filters>
          <filter>
            <artifact>*:*</artifact>
            <excludes>
              <exclude>META-INF/*.SF</exclude>
              <exclude>META-INF/*.DSA</exclude>
              <exclude>META-INF/*.RSA</exclude>
              <exclude>META-INF/LICENSE*</exclude>
              <exclude>license/*</exclude>
            </excludes>
          </filter>
        </filters>
      </configuration>
    </execution>
  </executions>
</plugin>
And then run:
mvn clean compile package
(solution taken from @Garry's answer in the post: Hadoop java.io.IOException: Mkdirs failed to create /some/path)
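Since shadedArtifactAttached is true, the shaded jar is attached with the "shaded" classifier, so I submit the job with something like this (the jar name below is a placeholder for my actual artifact; the main class comes from the stack trace above):

# jar name is a placeholder; input/output paths are examples
hadoop jar target/my-project-1.0-SNAPSHOT-shaded.jar hadoop.wordcount.WordCount /input /output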
Now it works, but it takes a lot of time in the "unzip" step. Any known workaround?
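One idea I haven't verified yet: maven-shade-plugin has a minimizeJar option that drops classes not reachable from the project's own code, which might shrink the shaded jar and speed up the repackaging. It can break libraries that are loaded via reflection, so it may not be safe here:

<configuration>
  <shadedArtifactAttached>true</shadedArtifactAttached>
  <!-- untested idea: remove unreferenced classes; may break reflection-based loading -->
  <minimizeJar>true</minimizeJar>
</configuration>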