6

I'm trying to run Spark locally on my Mac. This is what I have so far:

from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName('test').setMaster('local[*]')
sc = SparkContext(conf=conf)

I know I have JAVA_HOME set correctly (to /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home) and SPARK_HOME as well (to /Users/myname/spark-3.1.1-amzn-0-bin-3.2.1-amzn-3, which is where I installed Spark and where the bin directory with spark-submit lives; I can run spark-submit from my terminal, too). I am getting this:

py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NoSuchMethodError: io.netty.util.concurrent.SingleThreadEventExecutor.<init>(Lio/netty/util/concurrent/EventExecutorGroup;Ljava/util/concurrent/Executor;ZLjava/util/Queue;Lio/netty/util/concurrent/RejectedExecutionHandler;)V
    at io.netty.channel.SingleThreadEventLoop.<init>(SingleThreadEventLoop.java:65)
    at io.netty.channel.nio.NioEventLoop.<init>(NioEventLoop.java:138)
    at io.netty.channel.nio.NioEventLoopGroup.newChild(NioEventLoopGroup.java:146)
    at io.netty.channel.nio.NioEventLoopGroup.newChild(NioEventLoopGroup.java:37)
    at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:84)
    at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:58)
    at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:47)
    at io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:59)
    at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:86)
    at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:81)
    at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:68)
    at org.apache.spark.network.util.NettyUtils.createEventLoop(NettyUtils.java:66)
    at org.apache.spark.network.client.TransportClientFactory.<init>(TransportClientFactory.java:106)
    at org.apache.spark.network.TransportContext.createClientFactory(TransportContext.java:142)
    at org.apache.spark.rpc.netty.NettyRpcEnv.<init>(NettyRpcEnv.scala:77)
    at org.apache.spark.rpc.netty.NettyRpcEnvFactory.create(NettyRpcEnv.scala:493)
    at org.apache.spark.rpc.RpcEnv$.create(RpcEnv.scala:57)
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:266)
    at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:189)
    at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:277)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:458)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:238)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:748)

I am attempting to run this from a Python 3.8 venv in PyCharm, in case that helps.
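
In case it matters, here is the quick sanity check I ran to see which pyspark the venv actually imports; presumably the version should line up with the 3.1.1 build in SPARK_HOME (just a diagnostic sketch, nothing here is specific to my setup):

import pyspark

print(pyspark.__version__)  # ideally matches the Spark build in SPARK_HOME (3.1.1 here)
print(pyspark.__file__)     # shows which installation is actually on sys.path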

I have seen similar errors elsewhere but haven't found any solutions that work for me:

  • Getting Py4JavaError: Calling None.org.apache.spark.api.java.JavaSparkContext
  • https://superuser.com/questions/1436855/port-binding-error-in-pyspark
  • PySpark SparkContext Error "error occurred while calling None.org.apache.spark.api.java.JavaSparkContext."
  • py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext
  • Spark2-submit failing with pyspark

Thoughts?

rodrigocf

2 Answers

4

Try changing the Java version to JDK 8. I faced the same error while my JAVA_HOME was mapped to JDK 16; pointing JAVA_HOME back to JDK 8 fixed it.
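
If the wrong JDK keeps getting picked up (for example from a PyCharm run configuration), a minimal sketch is to pin JAVA_HOME from the script itself before the SparkContext launches the JVM. The adoptopenjdk path below is taken from the question and is an assumption; substitute your own JDK 8 location:

import os

# Must be set before SparkContext starts the JVM; path is an example JDK 8 install.
os.environ["JAVA_HOME"] = "/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home"

from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName('test').setMaster('local[*]')
sc = SparkContext(conf=conf)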

Gyan Rath
2

I faced a similar problem when attempting to use the amazon/aws-glue-libs:glue_libs_2.0.0_image_01 image. In my case the problem was resolved by re-installing pyspark.

The java.lang.NoSuchMethodError hints at conflicting versions of the same dependency (here, netty) ending up on the classpath, which is what causes this error to be thrown.
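
A quick way to spot such a conflict is to list every netty JAR the driver can see. This is just a diagnostic sketch, assuming SPARK_HOME points at your Spark distribution:

import glob
import os

# List the netty JARs bundled with the Spark distribution; duplicate or
# mismatched versions here (or in extra jar directories) are the usual culprit.
spark_home = os.environ["SPARK_HOME"]
for jar in sorted(glob.glob(os.path.join(spark_home, "jars", "*netty*"))):
    print(os.path.basename(jar))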

If you are interested in an image with AWS Glue, take a look at the Dockerfile provided at https://github.com/DNXLabs/docker-glue-libs/blob/master/Dockerfile, though it is just as easy to remove the glue-libs from the Dockerfile. The image can then be used as a remote interpreter in PyCharm (see, for example: https://aws.amazon.com/blogs/big-data/developing-aws-glue-etl-jobs-locally-using-a-container/).

Edit: In my case the problem was caused by the following JARs that were loaded when using the aws-glue-libs:

  • aws-glue-libs/jarsv1/javax.servlet-3.*
  • aws-glue-libs/jarsv1/netty-*

Remove these and the problem is gone.
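
A sketch of that cleanup in Python (the jarsv1 location is an assumption; adjust it to wherever aws-glue-libs lives on your machine):

import glob
import os

# Delete the conflicting JARs shipped with aws-glue-libs so Spark's own
# netty/servlet versions are the only ones left on the classpath.
jarsv1 = os.path.expanduser("~/aws-glue-libs/jarsv1")
for pattern in ("javax.servlet-3.*", "netty-*"):
    for jar in glob.glob(os.path.join(jarsv1, pattern)):
        print("removing", jar)
        os.remove(jar)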

Emptyless