
I am new to Apache Spark and Hadoop, and I'm having trouble getting the mongo-hadoop connector working.

I have not done anything else besides install JDK 7, Apache Maven, Scala, and Apache Spark.

This is what's in my .bashrc:

JAVA_HOME='/usr/java/jdk1.7.0_75'
export PATH=$PATH:/usr/local/apache-maven/apache-maven-3.2.5/bin/
MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
export PATH=$PATH:/usr/local/spark/sbin/
export SCALA_HOME='/usr/share/java/scala'

I used this command to build Apache Spark. The Spark shell worked, and I was able to run basic examples using the SparkContext:

mvn -Pyarn -Phadoop-2.4 -Phive -Phive-thriftserver -Dhadoop.version=2.4.0 -DskipTests clean package
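By "basic examples" I mean small snippets like the following, pasted directly into the shell (a minimal sanity check; the numbers are just placeholder data):

// Run inside spark-shell, which already provides `sc` as a SparkContext.
// Sum the squares of a small local collection.
val nums = sc.parallelize(1 to 100)
val sumOfSquares = nums.map(n => n * n).reduce(_ + _)
println(sumOfSquares)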

And this command to install the mongo-hadoop connector into my local Maven repository (run from my home directory); I pretty much followed https://github.com/crcsmnky/mongodb-spark-demo:

mvn install:install-file \
    -Dfile=core/build/libs/mongo-hadoop-core-1.3.3-SNAPSHOT.jar \
    -DgroupId=com.mongodb \
    -DartifactId=hadoop \
    -Dversion=1.2.1-SNAPSHOT \
    -Dpackaging=jar

Now I get this error every time I try to start the Spark shell:

Successfully started service 'HTTP file server' on port 36427.
java.lang.NoClassDefFoundError: javax/servlet/FilterRegistration

and no SparkContext instance is created. I would like to know how I can resolve this issue, and whether I can run code like this example https://github.com/plaa/mongo-spark/blob/master/src/main/scala/ScalaWordCount.scala from the Spark shell, or whether I have to build it with Gradle and somehow have Spark call it.
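For reference, this is roughly the kind of code I would hope to paste into the shell, adapted from that ScalaWordCount example (a sketch only; the mongodb://localhost:27017/test.words URI and the "text" field name are placeholders for my setup):

import org.apache.hadoop.conf.Configuration
import com.mongodb.hadoop.MongoInputFormat
import org.bson.BSONObject

// Point the connector at a MongoDB collection (placeholder URI).
val config = new Configuration()
config.set("mongo.input.uri", "mongodb://localhost:27017/test.words")

// Load the collection as an RDD of (id, BSON document) pairs.
val mongoRDD = sc.newAPIHadoopRDD(config, classOf[MongoInputFormat],
  classOf[Object], classOf[BSONObject])

// Count words across the "text" field of each document.
val counts = mongoRDD
  .flatMap { case (_, doc) => doc.get("text").toString.split(" ") }
  .map(word => (word, 1))
  .reduceByKey(_ + _)
counts.take(10).foreach(println)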
