
According to the release notes, and specifically the ticket Build and Run Spark on Java 17 (SPARK-33772), Spark now supports running on Java 17.

However, using Java 17 (Temurin-17.0.3+7) with Maven (3.8.6) and maven-surefire-plugin (3.0.0-M7), running a unit test that uses Spark (3.3.0) fails with:

java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x1e7ba8d9) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x1e7ba8d9

The stack is:

java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x1e7ba8d9) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x1e7ba8d9
  at org.apache.spark.storage.StorageUtils$.<init>(StorageUtils.scala:213)
  at org.apache.spark.storage.StorageUtils$.<clinit>(StorageUtils.scala)
  at org.apache.spark.storage.BlockManagerMasterEndpoint.<init>(BlockManagerMasterEndpoint.scala:114)
  at org.apache.spark.SparkEnv$.$anonfun$create$9(SparkEnv.scala:353)
  at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:290)
  at org.apache.spark.SparkEnv$.create(SparkEnv.scala:339)
  at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:194)
  at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:279)
  at org.apache.spark.SparkContext.<init>(SparkContext.scala:464)
  at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2704)
  at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
  at scala.Option.getOrElse(Option.scala:189)
  at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947)
  [...]

The question Java 17 solution for Spark - java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.storage.StorageUtils was asked only 2 months ago, but it predated the Spark 3.3.0 release and thus predated official support for Java 17.

Why can't I run my Spark 3.3.0 test with Java 17, and how can I fix it?
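
For reference, the kind of test that fails is simply one that starts a local SparkSession, since the StorageUtils initializer runs while the driver's SparkEnv is being created (see the stack trace above). A minimal sketch (class and test names are illustrative; it assumes JUnit 4 on the test classpath):

  import org.apache.spark.sql.SparkSession
  import org.junit.Test

  class SparkJava17Test {

    // Merely building a local SparkSession initialises StorageUtils$ and,
    // under Java 17 without extra JVM flags, throws the IllegalAccessError
    // shown above.
    @Test
    def sparkSessionStarts(): Unit = {
      val spark = SparkSession.builder()
        .master("local[*]")
        .appName("java-17-repro")
        .getOrCreate()
      try {
        assert(spark.range(10).count() == 10L)
      } finally {
        spark.stop()
      }
    }
  }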

Greg Kopff

3 Answers


Even though Spark now supports Java 17, it still references the JDK-internal class sun.nio.ch.DirectBuffer:

  // In Java 8, the type of DirectBuffer.cleaner() was sun.misc.Cleaner, and it was possible
  // to access the method sun.misc.Cleaner.clean() to invoke it. The type changed to
  // jdk.internal.ref.Cleaner in later JDKs, and the .clean() method is not accessible even with
  // reflection. However sun.misc.Unsafe added a invokeCleaner() method in JDK 9+ and this is
  // still accessible with reflection.
  private val bufferCleaner: DirectBuffer => Unit = [...]
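
For orientation, the JDK 9+ pattern that comment describes looks roughly like this (an illustrative sketch, not Spark's actual implementation; Spark's real bufferCleaner is typed over sun.nio.ch.DirectBuffer, and that reference to the JDK-internal type is what the module system rejects when the class initialises):

  import java.nio.ByteBuffer

  // Illustrative only: look up sun.misc.Unsafe reflectively and call its public
  // invokeCleaner(ByteBuffer) method (available since JDK 9) to release a
  // direct buffer's native memory eagerly.
  object DirectBufferCleaner {

    private val cleaner: ByteBuffer => Unit = {
      val unsafeClass   = Class.forName("sun.misc.Unsafe")
      val theUnsafe     = unsafeClass.getDeclaredField("theUnsafe")
      theUnsafe.setAccessible(true)
      val unsafe        = theUnsafe.get(null)
      val invokeCleaner = unsafeClass.getMethod("invokeCleaner", classOf[ByteBuffer])
      (buffer: ByteBuffer) => { invokeCleaner.invoke(unsafe, buffer); () }
    }

    def free(buffer: ByteBuffer): Unit =
      if (buffer.isDirect) cleaner(buffer)
  }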

Under the Java module system, access to this class is restricted. The Java 9 migration guide says:

If you must use an internal API that has been made inaccessible by default, then you can break encapsulation using the --add-exports command-line option.

We need to export this package to the unnamed module that our test code runs in. To do this for Surefire, we add this configuration to the plugin:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <version>3.0.0-M7</version>
  <configuration>
    <argLine>--add-exports java.base/sun.nio.ch=ALL-UNNAMED</argLine>
  </configuration>
</plugin>

Based on a discussion with one of the Spark developers, Spark adds the following in order to execute all of its internal unit tests.

These options are used to pass all Spark UTs, but maybe you don't need all.

--add-opens=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang.invoke=ALL-UNNAMED
--add-opens=java.base/java.lang.reflect=ALL-UNNAMED
--add-opens=java.base/java.io=ALL-UNNAMED
--add-opens=java.base/java.net=ALL-UNNAMED
--add-opens=java.base/java.nio=ALL-UNNAMED
--add-opens=java.base/java.util=ALL-UNNAMED
--add-opens=java.base/java.util.concurrent=ALL-UNNAMED
--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED
--add-opens=java.base/sun.nio.ch=ALL-UNNAMED
--add-opens=java.base/sun.nio.cs=ALL-UNNAMED
--add-opens=java.base/sun.security.action=ALL-UNNAMED
--add-opens=java.base/sun.util.calendar=ALL-UNNAMED

It was also noted in that discussion that these options do not need to be added explicitly when using spark-shell, spark-sql, or spark-submit.
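
Whichever subset of these options you end up needing, it is worth confirming that they actually reach the forked JVM that Surefire starts for the tests. A small diagnostic sketch (assuming Scala 2.13 for scala.jdk.CollectionConverters; on 2.12 use scala.collection.JavaConverters instead):

  import java.lang.management.ManagementFactory
  import scala.jdk.CollectionConverters._

  // Print the --add-opens/--add-exports flags the current JVM was started with,
  // for example from inside a test, to check that the Surefire argLine took effect.
  object JvmFlagCheck {
    def main(args: Array[String]): Unit = {
      ManagementFactory.getRuntimeMXBean.getInputArguments.asScala
        .filter(arg => arg.startsWith("--add-opens") || arg.startsWith("--add-exports"))
        .foreach(println)
    }
  }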

Greg Kopff
  • Is there a Spark ticket open to fix this? – Garret Wilson Aug 28 '22 at 21:06
  • Surely those options are overkill. Does someone want to investigate further. I imagine most of options would work with `--add-exports` instead of `--add-opens` (see [docs](https://docs.oracle.com/en/java/javase/17/migrate/migrating-jdk-8-later-jdk-releases.html)), because surely Spark isn't using reflection on all those packages. For a simple use case of reading CSV files and saving to JSON locally, just `--add-exports java.base/sun.nio.ch=ALL-UNNAMED` is working for me. Still we shouldn't have this problem in the first place. Does Spark intend to fix this? – Garret Wilson Aug 28 '22 at 22:33
  • @GarretWilson The link to the `user@spark.apache.org` [mailing list discussion](https://lists.apache.org/thread/814cpb1rpp73zkhtv9t4mkzzrznl82yn) is in the answer if you want to take up the discussion with Yang Jie further. – Greg Kopff Aug 28 '22 at 22:45
  • This would be for a spark-submit, but can we fix this in scala code/testing (kicked off with sbt)? – combinatorist Feb 02 '23 at 21:00

Based on the discussions above, I am using:

%SPARK_HOME%\bin\spark-submit.cmd --driver-java-options "--add-exports java.base/sun.nio.ch=ALL-UNNAMED" spark_ml_heart.py

with a single --add-exports to run a Python script on Spark 3.2.1 on Java 17.

You may need the full version with all the --add-opens options:

%SPARK_HOME%\bin\spark-submit.cmd --driver-java-options "--add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED" spark_ml_heart.py
Anton Andreev
  • How can I add the options inside spark-submit.cmd so that I can use pyspark inside my IDE (PyCharm)? – adranale Jan 03 '23 at 15:42
  • The single adjustment (first option) worked for me. Thanks! – combinatorist Jan 09 '23 at 19:05
  • To clarify, it worked for me for a spark-submit with spark 3.2, but I'm still not seeing anywhere in this answer how to run scala tests (like with sbt) on a spark project. Even with spark 3.3, I still get the OP's error, so something is configured differently with `sbt test` than a `spark-submit --master=local[*]`. – combinatorist Feb 02 '23 at 21:33

After fixing some of these errors, I got an error with the KryoSerializer:

java.lang.IllegalArgumentException: Unable to create serializer "com.esotericsoftware.kryo.serializers.FieldSerializer" for class: java.nio.HeapByteBuffer

I got around this issue by adding some of the JVM arguments mentioned by @Greg Kopff to my pom.xml (I am using Maven):

    <plugin>
        <groupId>org.scalatest</groupId>
        <artifactId>scalatest-maven-plugin</artifactId>
        <version>${scalatest-maven-plugin.version}</version>
        <configuration>
            <argLine>
                --add-opens=java.base/java.lang.invoke=ALL-UNNAMED
                --add-opens=java.base/java.nio=ALL-UNNAMED
                --add-opens=java.base/java.util=ALL-UNNAMED
                --add-opens=java.base/sun.nio.ch=ALL-UNNAMED
            </argLine>
        </configuration>
    </plugin>
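
For context, this is the kind of suite that runs under scalatest-maven-plugin and therefore picks up the argLine above (a sketch only; the class name is illustrative and it assumes ScalaTest 3.2.x with AnyFunSuite):

  import org.apache.spark.sql.SparkSession
  import org.scalatest.funsuite.AnyFunSuite

  class KryoRoundTripSuite extends AnyFunSuite {

    test("a simple job runs with the Kryo serializer on Java 17") {
      // Kryo needs deep reflective access to some java.* classes (for example
      // the java.nio buffers), which is why the --add-opens flags above are
      // needed on Java 17.
      val spark = SparkSession.builder()
        .master("local[*]")
        .appName("kryo-java-17")
        .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .getOrCreate()
      try {
        import spark.implicits._
        val doubled = Seq(1, 2, 3).toDS().map(_ * 2).collect().toSeq
        assert(doubled == Seq(2, 4, 6))
      } finally {
        spark.stop()
      }
    }
  }
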
Oscar Drai