
I am new to Spark, and I keep running into various "module java.base does not export XXX" errors. I keep adding more --add-opens options to the JVM. There are a lot of SO posts about these issues.

This post has a pretty long list.

I am now at these options:

--add-opens=java.base/java.nio=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang.invoke=ALL-UNNAMED
--add-opens=java.base/java.lang.reflect=ALL-UNNAMED
--add-opens=java.base/java.io=ALL-UNNAMED
--add-opens=java.base/java.net=ALL-UNNAMED
--add-opens=java.base/java.util=ALL-UNNAMED
--add-opens=java.base/java.util.concurrent=ALL-UNNAMED
--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED
--add-opens=java.base/sun.nio.ch=ALL-UNNAMED
--add-opens=java.base/sun.nio.cs=ALL-UNNAMED
--add-opens=java.base/sun.security.action=ALL-UNNAMED
--add-opens=java.base/sun.util.calendar=ALL-UNNAMED
--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED

and I don't seem to have any issues anymore. But this is disturbing: these options are not documented, AFAIK. You just have to keep guessing at options until the errors stop.
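For completeness: if you go through spark-submit (I'm launching a bare JVM from the IDE, so this doesn't help me directly), the flags can at least live in one place in conf/spark-defaults.conf. A minimal sketch using the standard spark.driver.extraJavaOptions and spark.executor.extraJavaOptions properties, with the list abbreviated to two flags for readability:

spark.driver.extraJavaOptions   --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED
spark.executor.extraJavaOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED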

So my question is: which JDK is recommended for Spark? The release notes for 3.4.0 rather sound like Java 8 is on its way to being deprecated. And I would like to use Java 17, both for its new language features and because I expect my project dependencies will some day no longer be available for Java 8.

Perhaps a better way to think of this is: where on Spark's roadmap, if anywhere, does the requirement to add all of these undocumented options go away? Is there a timeline for the end of JDK 8 support?
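(Partially answering my own roadmap question, based on a read of the Spark sources, so treat this as an assumption and verify for your version: since Spark 3.3 the required flags appear to ship with Spark itself in org.apache.spark.launcher.JavaModuleOptions, and spark-submit injects them automatically; the manual list is mainly needed when you start the JVM yourself, as in IDE Run Configurations. A minimal sketch that prints Spark's bundled list, assuming spark-launcher is on the classpath:)

import org.apache.spark.launcher.JavaModuleOptions;

public class PrintSparkModuleOptions {
    public static void main(String[] args) {
        // defaultModuleOptions() returns the space-separated --add-opens /
        // --add-exports flags that Spark's launcher passes to the JVMs it starts.
        System.out.println(JavaModuleOptions.defaultModuleOptions());
    }
}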

PS: This is a real pain in the IntelliJ IDEA IDE, because these options have to be pasted into every Run Configuration. Kind of a side question, but can these options be put somewhere global in IDEA so that all Run Configurations pick them up?
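(Answering my own side question, an assumption based on recent IDEA versions: Run | Edit Configurations... has an "Edit configuration templates..." entry, and VM options set on the Application template are inherited by newly created Run Configurations. Combined with the @argfile trick from the update below, the template then needs just one line; the path here is a placeholder:)

VM options field of the Application template:
@/absolute/path/to/spark-add-opens.txt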

PPS: I am not using Hadoop; does that make a smaller set of options possible by somehow excluding that support from Spark?

UPDATE: A colleague told me to put these in a file and use @filepath in the JVM options, which makes things somewhat easier.
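Concretely, that looks like this (a sketch; the file name and main class are placeholders, the file holds the full flag list from above one option per line, and @argfiles are supported by the java launcher since JDK 9):

spark-add-opens.txt:
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/sun.nio.ch=ALL-UNNAMED
(...rest of the list from above...)

java @spark-add-opens.txt -cp my-spark-app.jar com.example.Main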


1 Answer


Spark 3.4.0 runs on Java 8/11/17, Scala 2.12/2.13, Python 3.7+, and R 3.5+.

Support for Java 8 prior to version 8u362 is deprecated as of Spark 3.4.0.

Source: Spark 3.4.0 official documentation

shalnarkftw
  • Yes, I have read that too. But it does not answer my question of "which is recommended". – Wheezil Jun 08 '23 at 13:41
  • There is no recommendation; it depends on your cluster configuration, installed tools, etc. The docs tell you what the possibilities are, and now you need to make the choice and start developing. – shalnarkftw Jun 08 '23 at 14:23