
I have a small Spark project using a JAR file, which I am packaging with Maven.

I am using GSON to work with JSON files, and I specifically need the JsonReader class.

I added the GSON dependency, and when I run it as a standard Java project it works fine. However, when I package it and run it in Spark it complains:

Exception in thread "main" java.lang.NoClassDefFoundError: com/google/gson/stream/JsonReader

Here is the line in question:

reader = new JsonReader(new InputStreamReader(new FileInputStream(
                jsonfile)));

I did import it:

import com.google.gson.*;
import com.google.gson.stream.JsonReader;

Here is my POM file:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>SWETesting</groupId>
  <artifactId>SWETesting</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <build>
    <sourceDirectory>src</sourceDirectory>
    <plugins>
      <plugin>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.1</version>
        <configuration>
          <source/>
          <target/>
        </configuration>
      </plugin>
    </plugins>
  </build>
  <dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>1.2.1</version>
    </dependency>
    <dependency>
        <groupId>org.jsoup</groupId>
        <artifactId>jsoup</artifactId>
        <version>1.8.1</version>
    </dependency>
    <dependency>
        <groupId>com.google.code.gson</groupId>
        <artifactId>gson</artifactId>
        <version>2.3.1</version>
    </dependency>
  </dependencies>
</project>

I even installed libgoogle-gson in my distribution, but that didn't help.

What is the problem here?

Edit: I am submitting the job in standalone mode after building it with Maven:

mvn package
/home/chris/Tools/spark-1.2.1/bin/spark-submit --class PlagiarismCheck --master local[6] target/SWETesting-0.0.1-SNAPSHOT.jar > output.txt
Chris Chambers
  • How do you submit your spark job? Could you show us your code completely? – jarandaf Mar 09 '15 at 22:30
  • Added the command that launches Spark. Not sure how showing more code will help as it's not an issue in Eclipse itself, just when I try to launch it in Spark. I don't have Eclipse set up to run it on the Spark server directly. – Chris Chambers Mar 09 '15 at 22:32
  • I think the problem is that your project dependencies aren't included within `SWETesting-0.0.1-SNAPSHOT.jar`, hence the error. I would try generating a fat jar using `maven-assembly` plugin or similar. – jarandaf Mar 09 '15 at 22:41
  • I was hoping to avoid that, I'm very new to Maven and can't do much more than mvn package at the moment. – Chris Chambers Mar 09 '15 at 22:43
  • @jarandaf I got it working by following your advice, thanks. You mind phrasing that as an answer so I can accept it? – Chris Chambers Mar 09 '15 at 22:49

2 Answers


The problem is that your project's dependencies aren't included in SWETesting-0.0.1-SNAPSHOT.jar: by default, mvn package bundles only your own classes, so GSON is missing from the classpath when Spark runs the JAR. Generating a fat JAR with the maven-assembly plugin or similar should fix the issue. You can find an example here.
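
For illustration, here is a minimal sketch of what the fat-JAR setup could look like, assuming the maven-assembly-plugin with its built-in jar-with-dependencies descriptor. The execution id make-assembly is an arbitrary name, and the plugin version is left out here, so you may want to pin one. The fragment would go inside the existing <plugins> element of the POM:

```xml
<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <!-- built-in descriptor: unpacks all dependencies into a single JAR -->
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
  <executions>
    <execution>
      <!-- arbitrary id; binds the assembly to the package phase -->
      <id>make-assembly</id>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

With this in place, mvn package should produce target/SWETesting-0.0.1-SNAPSHOT-jar-with-dependencies.jar alongside the regular JAR, and that is the file to pass to spark-submit.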

jarandaf

It looks like it is picking up an old version of the GSON JAR. The old 1.5 version of GSON doesn't have the stream package. Check the Maven dependency tree for any older versions of the GSON JAR. If there are any, clean the project or remove those JARs from the repository.

Command to check the Maven dependency tree:

C:\Projects\Project_name>mvn dependency:tree
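
The dependency tree only shows what Maven resolves at build time. To see what is actually visible at runtime, a small check like the one below may help. It is only a sketch: ClasspathCheck and describe are hypothetical names, not part of GSON or Spark. It tries to locate a class without initializing it and reports which JAR it came from.

```java
import java.security.CodeSource;

class ClasspathCheck {
    // Returns the location the named class was loaded from,
    // or a short note if it cannot be found or has no code source.
    static String describe(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class<?> c = Class.forName(className, false,
                    ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "no code source (likely a JDK class)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Class name taken from the stack trace in the question
        System.out.println(describe("com.google.gson.stream.JsonReader"));
    }
}
```

Calling describe from the job's own main method, when launched through spark-submit, would show whether the class is missing entirely or is coming from an unexpected JAR.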
Swathi
  • It says: com.google.code.gson:gson:jar:2.3.1:compile. So it's a new version. I also cleaned it with mvn clean, didn't help. – Chris Chambers Mar 09 '15 at 22:39