
I'm creating a Java REST API Spring Boot application that uses Spark to get some data from the server. When I try to convert from a Dataset to a List, it fails.

I've tried JDK 8 and JDK 11 to compile and execute the code, but I get the same 'java.lang.IllegalArgumentException: Unsupported class file major version 55'. In the past, I've solved this issue by updating the Java version, but that's not working here.

I'm using:

  • JDK 11.0.2

  • Spring Boot 2.1.4

  • Spark 2.4.2

This is the code I'm executing:

Dataset<Row> dataFrame = sparkSession.read().json("/home/data/*.json");
dataFrame.createOrReplaceTempView("events");
Dataset<Row> resultDataFrame = sparkSession.sql("SELECT * FROM events WHERE " + predicate);
Dataset<Event> eventDataSet = resultDataFrame.as(Encoders.bean(Event.class));
return eventDataSet.collectAsList();

The query works; while debugging, I can see data in both resultDataFrame and eventDataSet.

I expect the output to be a proper list of Event objects, but instead I'm getting this exception:

[http-nio-8080-exec-2] ERROR org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/].[dispatcherServlet] - Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is java.lang.IllegalArgumentException: Unsupported class file major version 55] with root cause
java.lang.IllegalArgumentException: Unsupported class file major version 55
    at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:166)
    at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:148)
    at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:136)
    at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:237)
    at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:49)
    at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:517)
    at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:500)
    at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
    at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
    at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
    at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:134)
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
    at org.apache.spark.util.FieldAccessFinder$$anon$3.visitMethodInsn(ClosureCleaner.scala:500)
.....

UPDATE FROM THE COMMENTS: For Java 8, I changed the pom to target Java 8:

<java.version>1.8</java.version>

and then updated the project, ran maven clean and maven install, and ran the application again. I'm still getting the same 'version 55' error.
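One way to sanity-check what the classes are actually compiled to (a rough sketch; the target/classes path and the com.example package below are just placeholders for my local layout) is to read the class file header, where major version 52 means Java 8 and 55 means Java 11:

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class ClassFileVersionCheck {
    public static void main(String[] args) throws IOException {
        // Any compiled class from the build output; adjust the path to your own project
        String path = "target/classes/com/example/Event.class";
        try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
            int magic = in.readInt();           // should be 0xCAFEBABE
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort(); // 52 = Java 8, 55 = Java 11
            System.out.printf("magic=%08X minor=%d major=%d%n", magic, minor, major);
        }
    }
}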

frm
  • This suggests the ASM version used by Apache XBean (see the stack trace) doesn't support Java 11. – Mark Rotteveel Apr 30 '19 at 15:19
  • It doesn't work with Java 8 either, and I don't see anything in the stack trace... another weird thing is that it only happens with some of the methods, not all of them – frm Apr 30 '19 at 15:28
  • 1
    To run on Java 8 you will need to compile with Java 8 or at least targeting Java 8. – Mark Rotteveel Apr 30 '19 at 15:32
  • I've already tried with Java 8 and it doesn't work – frm Apr 30 '19 at 15:51
  • 1
    If you used Java 8, then it wouldn't be saying "version 55" anymore, so can you descibe how you "tried" to use it? – OneCricketeer Apr 30 '19 at 16:01
  • I've edited the question :) – frm Apr 30 '19 at 16:25
  • look here: https://stackoverflow.com/questions/19654557/how-to-set-specific-java-version-to-maven/19654699 – matanper Apr 30 '19 at 16:30
  • In Spring Boot applications you change the Java version in the pom with a property; that should be enough (at least it always has been) – frm Apr 30 '19 at 16:35
  • I'm running it with Eclipse; the JDK configured for this project is 1.8. Is there something else that I'm missing? – frm Apr 30 '19 at 16:44
  • Are you sure your POM file is correct? Please look at the comment from @matanper, which shows how to set the Java version in Maven – andrew-g-za Apr 30 '19 at 16:47
  • First of all, it's not recommended to convert from a Dataset to a List (heap is very limited). Then you need to be sure that your classes are compiled with the same version as the objects in the Spark cluster, so you need an appropriate version of the JDK on your machine (not only as a Maven property), and you need that JDK to compile your classes – GJCode Apr 30 '19 at 16:51
  • I fixed it. As someone mentioned (I think that comment has been deleted now), the issue was in my local path: I had JAVA_HOME pointing at a symbolic link that I forgot I had changed to another version. It's fixed now. Thanks everyone for the help – frm Apr 30 '19 at 16:53
  • @GJCode, I know it's not recommended, but it needs to return the data to the request. The results have been filtered previously, so it shouldn't give any issues, but thanks for the heads up! – frm Apr 30 '19 at 16:54

3 Answers


Excluding the default XBean artifact from the spark-core dependency and adding the latest version of the XBean artifact worked for me.

<dependencies>
    <dependency>
        <groupId>org.apache.xbean</groupId>
        <artifactId>xbean-asm6-shaded</artifactId>
        <version>4.10</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.4.1</version>
        <exclusions>
            <exclusion>
                <groupId>org.apache.xbean</groupId>
                <artifactId>xbean-asm6-shaded</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
ManoshP

The root cause of the issue was a symbolic link pointing at the wrong JDK, and that's why it wasn't working. JAVA_HOME was pointing at a JDK 11, and Eclipse was running with that.
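To confirm which JDK the application is actually running on (rather than what the pom or the IDE claims), a minimal sketch like this can be dropped into any code that runs at startup:

public class JvmInfo {
    public static void main(String[] args) {
        // The JVM that is actually executing the code, regardless of build settings
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.home    = " + System.getProperty("java.home"));
        System.out.println("JAVA_HOME    = " + System.getenv("JAVA_HOME"));
    }
}

If java.version reports 11 while the pom targets 1.8, the runtime is the problem, exactly as with the symbolic link above.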

frm

Since most Python developers spin up a virtualenv for the project, you could use the snippet below to check the versions of the different components required for PySpark to work. The reason for the error is an incompatible Java version: PySpark expects Java 1.8+, not JDK 11. Class file major version 55 corresponds to JDK 11, as you can see here.

Check only the official Spark documentation for version compatibility.

import logging
import os
import subprocess

# Make sure INFO-level messages are actually printed
logging.basicConfig(level=logging.INFO)

# Use subprocess to find the Java, Scala and Python versions
# (note: `java -version` writes to stderr, hence both streams are logged)
cmd1 = "java -version"
cmd2 = "scala -version"
cmd3 = "python --version"
cmd4 = "whoami"

arr = [cmd1, cmd2, cmd3, cmd4]

for cmd in arr:
    process = subprocess.Popen(cmd.split(" "), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    logging.info(stdout.decode("utf-8") + " | " + stderr.decode("utf-8"))

logging.info(os.getenv("JAVA_HOME"))
logging.info(os.getenv("HOME"))

You will get the below output:

INFO:root: | openjdk version "1.8.0_252"
OpenJDK Runtime Environment (build 1.8.0_252-8u252-b09-1~18.04-b09)
OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)

INFO:root: | Scala code runner version 2.12.2 -- Copyright 2002-2017, LAMP/EPFL and Lightbend, Inc.

INFO:root:Python 3.6.9

INFO:root:training
ForeverLearner