0

I am trying to run a Hadoop job on AWS Elastic MapReduce using a JAR file. I am using a library called EJML (https://code.google.com/p/efficient-java-matrix-library/wiki/EjmlManual). I included it in my project as an external library in Eclipse via Project --> Build Path --> Configure Build Path --> Add External JARs. When I run the project on my local computer everything is fine. However, on AWS I get the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/ejml/simple/SimpleBase
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.util.RunJar.main(RunJar.java:180)
Caused by: java.lang.ClassNotFoundException: org.ejml.simple.SimpleBase
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 3 more

I am wondering what could be going wrong. I had to rebuild the library to target Java 6 instead of 7 because Hadoop on AWS only runs on Java 6. Any help or suggestions would be appreciated. Thanks.

EDIT: An easy way to solve the problem in Eclipse is to choose the "Runnable JAR file" option when exporting the project into a JAR.

Timnit Gebru
  • Unzip your jar and see if EJML is in there, and also check the manifest file to see that it is included in the classpath. – Amar Jun 07 '13 at 08:18
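
The check suggested in the comment above could be sketched roughly as follows. This is only an illustration: `listing_has_class` is a hypothetical helper name, `myJar.jar` is a placeholder for the exported job jar, and it assumes the `unzip` tool is available on the machine.

```shell
# Read a jar listing on stdin; succeed if it contains the given class.
# (listing_has_class is a hypothetical helper, not part of Hadoop or EJML.)
listing_has_class() {
    # org.ejml.simple.SimpleBase -> org/ejml/simple/SimpleBase.class
    entry="$(printf '%s' "$1" | tr '.' '/').class"
    grep -qF "$entry"
}

# Typical use (myJar.jar is a placeholder for your job jar):
#   unzip -l myJar.jar | listing_has_class org.ejml.simple.SimpleBase && echo "bundled"
#   unzip -p myJar.jar META-INF/MANIFEST.MF   # inspect Class-Path, if present
```

If the class is not listed, the EJML classes were not packaged into the job jar, which would be consistent with the ClassNotFoundException in the trace.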

2 Answers

2

A third-party dependency isn't included in the job jar by default, hence the error you are seeing. It works in Eclipse standalone mode because Eclipse adds the jar to the classpath at execution time.

You have two choices:

  1. Unpack this jar and repack your classes and the third-party dependency jars into a single 'uber' (monolithic) jar. Maven has a jar-with-dependencies assembly for doing this (if you're using Maven, which I'd personally recommend).
  2. Use the -libjars argument (a comma-separated list of jars) combined with the ToolRunner method for submitting jobs. This ensures your third-party jars are submitted with your job:

    hadoop jar myJar.jar MainClass -libjars ejml.jar

Chris White
0

You need to add your jars to the Hadoop classpath in the AWS environment before running your Hadoop job.

In the terminal, do this before running your job:

export EJML_JARS=<your jars here, separated by colons ':'>
export HADOOP_CLASSPATH=$EJML_JARS

e.g.

export EJML_JARS=name1.jar:name2.jar:name3.jar
export HADOOP_CLASSPATH=$EJML_JARS

Then, launch your job.
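
If there are several jars, typing the list by hand gets error-prone. A minimal sketch of building it from a directory is shown below; `build_classpath` and the `EJML_LIB_DIR` variable are hypothetical names used for illustration, and the `./lib` default is only an assumption about where the jars were copied.

```shell
# Join every .jar in a directory into a colon-separated classpath string.
# (build_classpath is a hypothetical helper, not a Hadoop command.)
build_classpath() {
    dir="$1"
    cp=""
    for jar in "$dir"/*.jar; do
        [ -e "$jar" ] || continue      # skip the unexpanded pattern when no jars exist
        cp="${cp:+$cp:}$jar"           # add ':' only between entries
    done
    printf '%s\n' "$cp"
}

# EJML_LIB_DIR is an assumed variable; point it at the directory holding the jars.
export HADOOP_CLASSPATH="$(build_classpath "${EJML_LIB_DIR:-./lib}")"
```

Note that this overwrites any existing HADOOP_CLASSPATH value, just like the export lines above; prepend the old value if you need to keep it.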

SSaikia_JtheRocker