
A similar question was asked at Running unit tests with Spark 3.3.0 on Java 17 fails with IllegalAccessError: class StorageUtils cannot access class sun.nio.ch.DirectBuffer, but that question (and its solution) was only about unit tests. For me, Spark breaks when actually running the program.

According to the Spark overview, Spark works with Java 17. I'm using Temurin-17.0.4+8 (build 17.0.4+8) on Windows 10, including Spark 3.3.0 in Maven like this:

<scala.version>2.13</scala.version>
<spark.version>3.3.0</spark.version>
...
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_${scala.version}</artifactId>
  <version>${spark.version}</version>
</dependency>

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_${scala.version}</artifactId>
  <version>${spark.version}</version>
</dependency>

I try to run a simple program:

final SparkSession spark = SparkSession.builder().appName("Foo Bar").master("local").getOrCreate();
final Dataset<Row> df = spark.read().format("csv").option("header", "false").load("/path/to/file.csv");
df.show(5);

That breaks all over the place:

Caused by: java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x59d016c9) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x59d016c9
    at org.apache.spark.storage.StorageUtils$.<clinit>(StorageUtils.scala:213)
    at org.apache.spark.storage.BlockManagerMasterEndpoint.<init>(BlockManagerMasterEndpoint.scala:114)
    at org.apache.spark.SparkEnv$.$anonfun$create$9(SparkEnv.scala:353)
    at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:290)
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:339)
    at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:194)
    at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:279)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:464)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2704)
    at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
    at scala.Option.getOrElse(Option.scala:201)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947)

Spark is obviously doing things one is not supposed to do in Java 17.

Disappointing. How do I get around this?

Garret Wilson
  • Not much of a choice: you need to add the `--add-opens` options cited in the linked post to your program launch command. I find it strange that Spark has not already addressed such a problem, though. – amanin Aug 26 '22 at 06:35
  • IMO it would be better for you to downgrade to JDK 8 or JDK 11 if you can. JDK 17 support was just recently added so this might not be your last issue with that... – Guy Melul Aug 29 '22 at 11:13
  • FWIW, it actually broke for me in 3.2.3 and appeared fixed in 3.3.1. – combinatorist Jan 09 '23 at 19:20
  • it happens on 3.2.2 too; i have to use 3.2.2 due to spark-excel dependency – soMuchToLearnAndShare Mar 02 '23 at 14:30

7 Answers


Solution


Consider adding the appropriate Java Virtual Machine (JVM) command-line options.
The exact way to add them depends on how you run the program: from the command line, from an IDE, etc.

Examples

The command-line options have been taken from the JavaModuleOptions class: spark/JavaModuleOptions.java at v3.3.0 · apache/spark.

Command line

For example, to run the program (the .jar file) from the command line:

java \
    --add-opens=java.base/java.lang=ALL-UNNAMED \
    --add-opens=java.base/java.lang.invoke=ALL-UNNAMED \
    --add-opens=java.base/java.lang.reflect=ALL-UNNAMED \
    --add-opens=java.base/java.io=ALL-UNNAMED \
    --add-opens=java.base/java.net=ALL-UNNAMED \
    --add-opens=java.base/java.nio=ALL-UNNAMED \
    --add-opens=java.base/java.util=ALL-UNNAMED \
    --add-opens=java.base/java.util.concurrent=ALL-UNNAMED \
    --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED \
    --add-opens=java.base/sun.nio.ch=ALL-UNNAMED \
    --add-opens=java.base/sun.nio.cs=ALL-UNNAMED \
    --add-opens=java.base/sun.security.action=ALL-UNNAMED \
    --add-opens=java.base/sun.util.calendar=ALL-UNNAMED \
    --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED \
    -jar <JAR_FILE_PATH>
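
Since the question mentions Windows 10: the trailing backslashes above are Unix shell line continuations. A sketch of the same command for cmd.exe, where the continuation character is ^ instead (most of the --add-opens options are elided here; they follow the exact same pattern as above):

java ^
    --add-opens=java.base/java.lang=ALL-UNNAMED ^
    --add-opens=java.base/sun.nio.ch=ALL-UNNAMED ^
    --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED ^
    -jar <JAR_FILE_PATH>

(In PowerShell, the continuation character is a backtick, or you can put everything on one line.)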

IDE: IntelliJ IDEA

In IntelliJ IDEA, add the same options to the run configuration's VM options field: Run → Edit Configurations… → Modify options → Add VM options, then paste the options into the field that appears.

  • Thanks for the response, but it's a pity no one investigates this further. Surely those options (copied from an email thread) are overkill. I imagine most of the options would work with `--add-exports` instead of `--add-opens` (see [docs](https://docs.oracle.com/en/java/javase/17/migrate/migrating-jdk-8-later-jdk-releases.html)), because surely Spark isn't using reflection on all those packages. For a simple use case of reading CSV files and saving to JSON locally, just `--add-exports java.base/sun.nio.ch=ALL-UNNAMED` is working for me. – Garret Wilson Aug 30 '22 at 16:47
  • Does anyone intend to fix this? Is there a Spark ticket filed? – Garret Wilson Aug 30 '22 at 16:48
  • Dear @GarretWilson, I have updated the answer to specify that the command-line options have been taken from the `JavaModuleOptions` class: [spark/JavaModuleOptions.java at v3.3.0 · apache/spark](https://github.com/apache/spark/blob/v3.3.0/launcher/src/main/java/org/apache/spark/launcher/JavaModuleOptions.java). – Sergey Vyacheslavovich Brunov Aug 30 '22 at 23:39
  • @GarretWilson, I don't have such information. I did a quick search and found a related ticket: [\[SPARK-35557\] Adapt uses of JDK 17 Internal APIs - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-35557). Please, note the [solution comment](https://issues.apache.org/jira/browse/SPARK-35557?focusedCommentId=17441883&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17441883): `--add-opens` is mentioned as the solution. Maybe, it is worth reopening the ticket or opening a new one. – Sergey Vyacheslavovich Brunov Aug 31 '22 at 00:42
  • I'm going to assign the bounty to this answer as you put a lot of work into it and it gives some good references. Still, it doesn't provide a sufficient solution or more in-depth tests for me to consider it the accepted answer. Sure, I know I can cram in a lot of coarse, brute-force exceptions and figure one of them will cover the Spark limitations. I'm looking for something more finely tuned, and a path forward for getting this fixed in Spark. – Garret Wilson Sep 02 '22 at 13:09
  • Thanks for the response, it worked from IntelliJ, which launches the app using `java -jar`. If I am working in a cluster, do you know if I should set these options as driver and executor extra Java options? I tried it in the code (on the Spark application builder) but there it does not seem to be working. – dhalfageme Sep 12 '22 at 07:34
  • You are a lifesaver. This is necessary for sparkNLP if you are trying to use it with Java, and I had to add it to the VM arguments of the run configuration in Eclipse. – thekevshow Feb 01 '23 at 06:22

The following step helped me unblock the issue.

If you are running the application from an IDE (IntelliJ IDEA), follow the instructions below.

Add the JVM option "--add-exports java.base/sun.nio.ch=ALL-UNNAMED"

(screenshot: the IntelliJ IDEA run configuration with the VM option added)

source: https://arrow.apache.org/docs/java/install.html#java-compatibility

Anil Reddaboina

These three methods work for me on a project using

  • Spark 3.3.2
  • Scala 2.13.10
  • Java 17.0.6 (my project is small; it even works on Java 19.0.1. However, if your project is big, it is better to wait until Spark officially supports it)

Method 1

export JAVA_OPTS='--add-exports java.base/sun.nio.ch=ALL-UNNAMED'
sbt run

Method 2

Create a .jvmopts file in your project folder, with content:

--add-exports java.base/sun.nio.ch=ALL-UNNAMED

Then you can run

sbt run

Method 3

If you are using IntelliJ IDEA, this is based on @Anil Reddaboina's answer (thanks!).

It adds more info, as I didn't have that "VM Options" field by default.

Follow this:

(screenshot: adding the "VM Options" field via "Modify options" in the IntelliJ IDEA run configuration)

Then you should be able to add --add-exports java.base/sun.nio.ch=ALL-UNNAMED to the "VM Options" field,

or add the full set of necessary VM options:

--add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED

(screenshot: the full set of VM options pasted into the "VM Options" field)

Hongbo Miao

Add this as an explicit dependency in the pom.xml file. Don't use a version other than 3.0.16.

<dependency>
    <groupId>org.codehaus.janino</groupId>
    <artifactId>janino</artifactId>
    <version>3.0.16</version>
</dependency>

and then add the command-line arguments. If you use VS Code, add

"vmArgs": "--add-exports java.base/sun.nio.ch=ALL-UNNAMED"

in the configurations section of the launch.json file under the .vscode folder in your project.
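
For reference, a minimal sketch of where that line sits, assuming a standard VS Code Java launch configuration (the "name" and "mainClass" values here are placeholders):

{
    "version": "0.2.0",
    "configurations": [
        {
            "type": "java",
            "name": "Launch Main (example)",
            "request": "launch",
            "mainClass": "com.example.Main",
            "vmArgs": "--add-exports java.base/sun.nio.ch=ALL-UNNAMED"
        }
    ]
}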

balu mahendran
  • `vmArgs` param should go in `launch.json` as per the [docs](https://code.visualstudio.com/docs/java/java-debugging#_launch) – CᴴᴀZ Nov 23 '22 at 09:04

You could use JDK 8. Maybe you really should.

But if you can't, you might try adding these Java options to your build.sbt file. For me they were needed for tests, so I put them into:

val projectSettings = Seq(
  // ...
  Test / javaOptions ++= Seq(
    "base/java.lang", "base/java.lang.invoke", "base/java.lang.reflect", "base/java.io", "base/java.net", "base/java.nio",
    "base/java.util", "base/java.util.concurrent", "base/java.util.concurrent.atomic",
    "base/sun.nio.ch", "base/sun.nio.cs", "base/sun.security.action",
    "base/sun.util.calendar", "security.jgss/sun.security.krb5",
  ).map("--add-opens=java." + _ + "=ALL-UNNAMED"),
  // ...
)
dlamblin
  • I'm really curious about this, because it's the only answer targeted toward tests, but it didn't work for me. Would you be willing to link a minimum working example or the rest of your sparkConf/sparkSession.builder command or something? – combinatorist Feb 02 '23 at 22:04
  • Wow - after a ton of work, I figured out how to fix this by following your first suggestion to just use Java 8. I'll try to post more tips for others soon. Thanks for that tip! – combinatorist Feb 13 '23 at 19:46
  • @combinatorist yes, I think the Java options for the tests were specific to our setup at the time. Generally, sticking to JDK 8 is more of a broad-stroke workaround. I'm sorry the specific options didn't work for your case. – dlamblin Mar 07 '23 at 00:49

Simply upgrading to Spark 3.3.2 solved my problem.

I use Java 17 and pyspark on the command line.
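
With the Maven setup from the question, that upgrade is just a bump of the version property (a minimal sketch, assuming nothing else in the pom needs to change):

<spark.version>3.3.2</spark.version>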

Lynne
  • Just to add more info: in my case, for PySpark, I never had this issue no matter which Spark version. For Scala projects, I am on Spark 3.3.2; unfortunately, this does not help, and I still need `--add-exports java.base/sun.nio.ch=ALL-UNNAMED`. – Hongbo Miao Mar 20 '23 at 22:05

For those using Gradle to run unit tests for Spark, apply this in build.gradle.kts:

tasks.test {
    useJUnitPlatform()
 
    val sparkJava17CompatibleJvmArgs = listOf(
        "--add-opens=java.base/java.lang=ALL-UNNAMED",
        "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED",
        "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED",
        "--add-opens=java.base/java.io=ALL-UNNAMED",
        "--add-opens=java.base/java.net=ALL-UNNAMED",
        "--add-opens=java.base/java.nio=ALL-UNNAMED",
        "--add-opens=java.base/java.util=ALL-UNNAMED",
        "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED",
        "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED",
        "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
        "--add-opens=java.base/sun.nio.cs=ALL-UNNAMED",
        "--add-opens=java.base/sun.security.action=ALL-UNNAMED",
        "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED",
        "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED",
    )
    jvmArgs = sparkJava17CompatibleJvmArgs
}
chehsunliu