I'm trying to use Spark 3 from a Spring Boot 2.7.3 application. I'm working in a Docker Compose environment on Windows 10 with Docker Desktop.
Here is my docker-compose.yml:
version: '3'
services:
  spark-master:
    image: bde2020/spark-master:3.3.0-hadoop3.3
    container_name: spark-master
    ports:
      - "8088:8080"
      - "7077:7077"
    environment:
      - INIT_DAEMON_STEP=setup_spark
  spark-worker-1:
    image: bde2020/spark-worker:3.3.0-hadoop3.3
    container_name: spark-worker-1
    depends_on:
      - spark-master
    ports:
      - "8081:8081"
    environment:
      - "SPARK_MASTER=spark://spark-master:7077"
  spark-worker-2:
    image: bde2020/spark-worker:3.3.0-hadoop3.3
    container_name: spark-worker-2
    depends_on:
      - spark-master
    ports:
      - "8082:8081"
    environment:
      - "SPARK_MASTER=spark://spark-master:7077"
My Spring server runs directly on my Windows host and is therefore not part of the Compose file. This is how I configure the connection to Spark:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SparkConfig {

    @Bean
    public JavaSparkContext sparkContext() {
        SparkConf sparkConf = new SparkConf()
                .setAppName("SparkSpringBootApplication")
                .setMaster("spark://localhost:7077");
        return new JavaSparkContext(sparkConf);
    }
}
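In case it matters: my understanding from the Spark configuration reference is that executors dial back to the driver, and that the driver's address and port can be pinned via `spark.driver.host`, `spark.driver.port`, and `spark.driver.bindAddress`. I have left all of these at their defaults; the sketch below only lists those properties with example values (the host and port shown are assumptions, not my actual settings), each of which would be applied with `conf.set(key, value)`:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DriverNetworkSettings {
    // Property names come from the Spark configuration reference; the values
    // are examples for a driver running on the Docker host, not my real config.
    public static Map<String, String> driverNetworkProps() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("spark.driver.host", "host.docker.internal"); // address executors connect back to
        props.put("spark.driver.port", "50000");                // fixed port instead of a random one
        props.put("spark.driver.bindAddress", "0.0.0.0");       // listen on all local interfaces
        return props;
    }
}
```
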
Now, when I launch my server, these logs scroll in a loop and prevent me from running any processing:
2023-03-15 13:49:15.433 INFO 15904 --- [rainedScheduler] o.a.s.s.BlockManagerMaster : Removal of executor 323 requested
2023-03-15 13:49:15.433 INFO 15904 --- [ckManagerMaster] o.a.s.s.BlockManagerMasterEndpoint : Trying to remove executor 323 from BlockManagerMaster.
2023-03-15 13:49:15.433 INFO 15904 --- [rainedScheduler] seGrainedSchedulerBackend$DriverEndpoint : Asked to remove non-existent executor 323
2023-03-15 13:49:15.433 INFO 15904 --- [er-event-loop-5] o.a.s.s.c.StandaloneSchedulerBackend : Granted executor ID app-20230315123815-0009/325 on hostPort 172.18.0.5:38035 with 8 core(s), 1024.0 MiB RAM
2023-03-15 13:49:15.455 INFO 15904 --- [er-event-loop-2] s.d.c.StandaloneAppClient$ClientEndpoint : Executor updated: app-20230315123815-0009/325 is now RUNNING
2023-03-15 13:49:19.539 INFO 15904 --- [er-event-loop-1] s.d.c.StandaloneAppClient$ClientEndpoint : Executor updated: app-20230315123815-0009/324 is now EXITED (Command exited with code 1)
2023-03-15 13:49:19.539 INFO 15904 --- [er-event-loop-1] o.a.s.s.c.StandaloneSchedulerBackend : Executor app-20230315123815-0009/324 removed: Command exited with code 1
2023-03-15 13:49:19.540 INFO 15904 --- [er-event-loop-1] s.d.c.StandaloneAppClient$ClientEndpoint : Executor added: app-20230315123815-0009/326 on worker-20230315095819-172.18.0.4-33639 (172.18.0.4:33639) with 8 core(s)
2023-03-15 13:49:19.540 INFO 15904 --- [rainedScheduler] o.a.s.s.BlockManagerMaster : Removal of executor 324 requested
2023-03-15 13:49:19.540 INFO 15904 --- [ckManagerMaster] o.a.s.s.BlockManagerMasterEndpoint : Trying to remove executor 324 from BlockManagerMaster.
2023-03-15 13:49:19.540 INFO 15904 --- [rainedScheduler] seGrainedSchedulerBackend$DriverEndpoint : Asked to remove non-existent executor 324
2023-03-15 13:49:19.540 INFO 15904 --- [er-event-loop-1] o.a.s.s.c.StandaloneSchedulerBackend : Granted executor ID app-20230315123815-0009/326 on hostPort 172.18.0.4:33639 with 8 core(s), 1024.0 MiB RAM
2023-03-15 13:49:19.561 INFO 15904 --- [er-event-loop-0] s.d.c.StandaloneAppClient$ClientEndpoint : Executor updated: app-20230315123815-0009/326 is now RUNNING
2023-03-15 13:49:19.802 INFO 15904 --- [er-event-loop-7] s.d.c.StandaloneAppClient$ClientEndpoint : Executor updated: app-20230315123815-0009/325 is now EXITED (Command exited with code 1)
2023-03-15 13:49:19.802 INFO 15904 --- [er-event-loop-7] o.a.s.s.c.StandaloneSchedulerBackend : Executor app-20230315123815-0009/325 removed: Command exited with code 1
2023-03-15 13:49:19.802 INFO 15904 --- [rainedScheduler] o.a.s.s.BlockManagerMaster : Removal of executor 325 requested
2023-03-15 13:49:19.802 INFO 15904 --- [er-event-loop-5] s.d.c.StandaloneAppClient$ClientEndpoint : Executor added: app-20230315123815-0009/327 on worker-20230315095819-172.18.0.5-38035 (172.18.0.5:38035) with 8 core(s)
2023-03-15 13:49:19.802 INFO 15904 --- [ckManagerMaster] o.a.s.s.BlockManagerMasterEndpoint : Trying to remove executor 325 from BlockManagerMaster.
2023-03-15 13:49:19.802 INFO 15904 --- [rainedScheduler] seGrainedSchedulerBackend$DriverEndpoint : Asked to remove non-existent executor 325
2023-03-15 13:49:19.802 INFO 15904 --- [er-event-loop-5] o.a.s.s.c.StandaloneSchedulerBackend : Granted executor ID app-20230315123815-0009/327 on hostPort 172.18.0.5:38035 with 8 core(s), 1024.0 MiB RAM
2023-03-15 13:49:19.823 INFO 15904 --- [er-event-loop-6] s.d.c.StandaloneAppClient$ClientEndpoint : Executor updated: app-20230315123815-0009/327 is now RUNNING
The only errors I can find are on the Spark worker side:
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1894)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:424)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:413)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:102)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.$anonfun$run$9(CoarseGrainedExecutorBackend.scala:444)
at scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.java:23)
at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:985)
at scala.collection.immutable.Range.foreach(Range.scala:158)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:984)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.$anonfun$run$7(CoarseGrainedExecutorBackend.scala:442)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:62)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
... 4 more
Caused by: java.io.IOException: Failed to connect to host.docker.internal/192.168.65.2:53999
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:288)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:218)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:230)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:204)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:202)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:198)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: host.docker.internal/192.168.65.2:53999
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:710)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
From the trace, the executor seems to be trying to connect back to my driver at host.docker.internal:53999 and getting "Connection refused", so it looks like Spark can't reach my Spring server. I don't understand where the error comes from. Note that I was careful to use the same Spark/Hadoop versions in all my dependencies.
Any help would be welcome.
Thank you all, and have a nice day.